
2 easy steps to inspect any Parquet file in seconds

DALL·E prompt: robot with binoculars bent over and looking straight down at a wood parquet tile made of wood, digital art highly detailed 4k
JJ Brosnan
Break away from CSV altogether

I've already documented how Parquet is often better than CSV for data storage. It takes up less space, is faster to load, and has advanced compression and sophisticated data access features.

Parquet's biggest drawback has been that you can't readily inspect the data. It is a binary format, so you can't just open a file to see and interact with it a bit.

There is now a no-brainer solution that requires just one command and a few seconds of patience.

Two easy steps to see (and interact with) Parquet in seconds

  1. Clone the Deephaven Parquet viewer repository.

  2. Run the following command, specifying (1) the path to the Parquet file and (2) a port on which to view it.

./deephaven-parquet-viewer.sh ../Desktop/examples/Taxi/parquet/taxi.parquet 8080

Here's the output after a few seconds:

Starting Deephaven Parquet Viewer.........
Ready!
table @ http://localhost:8080/iframe/table/?name=parquet_table
ide @ http://localhost:8080/ide/

Control-C to exit

Then copy and paste either the table (iframe) URL or the full IDE URL into a browser.
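
If the viewer doesn't come up, the two arguments from step 2 are the first thing I'd check. Here is a minimal sketch of a pre-flight check, not part of the viewer itself. The helper names `is_parquet` and `is_valid_port` are my own, made up for illustration. The file check relies on the fact that Parquet files begin with the 4-byte magic string `PAR1`, and it assumes your `head` supports `-c`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers for the two viewer arguments.
# These are illustrative only; the viewer script does not ship them.

# Succeeds if the path is a regular file that starts with the
# 4-byte Parquet magic string "PAR1".
is_parquet() {
  [ -f "$1" ] && [ "$(head -c 4 "$1" 2>/dev/null)" = "PAR1" ]
}

# Succeeds if the argument is all digits and within the TCP port range.
is_valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] 2>/dev/null && [ "$1" -le 65535 ] 2>/dev/null
}

# Example use (commented out so that sourcing this file launches nothing):
# file=../Desktop/examples/Taxi/parquet/taxi.parquet
# port=8080
# if is_parquet "$file" && is_valid_port "$port"; then
#   ./deephaven-parquet-viewer.sh "$file" "$port"
# else
#   echo "Check the Parquet path and the port" >&2
# fi
```

The magic-string test only confirms that the file looks like Parquet at the start. It doesn't validate the footer or the schema, so treat it as a quick sanity check rather than proof that the file is intact.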


I love this little tool. I hope you find it handy.