I've already documented how Parquet is often better than CSV for data storage. It takes up less space, loads faster, and offers advanced compression and sophisticated data access features.
Parquet's biggest drawback has been the inability to readily inspect the data -- i.e. to see and interact with it a bit.
There is now a no-brainer solution that requires just one command and a few seconds of patience.
Two easy steps to see (and interact with) Parquet in seconds
Clone the Deephaven Parquet viewer repository.
Run the following command, specifying (1) the path to the Parquet file and (2) the port on which to view it.
./deephaven-parquet-viewer.sh ../Desktop/examples/Taxi/parquet/taxi.parquet 8080
Here's the output after a few seconds:
Starting Deephaven Parquet Viewer.........
Ready!
table @ http://localhost:8080/iframe/table/?name=parquet_table
ide @ http://localhost:8080/ide/
Control-C to exit
Copy and paste either the table (iframe) URL or the full IDE URL into a browser.
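If you launch the viewer often, a couple of small shell helpers can save some typing. The sketch below is my own illustration, not part of the Deephaven viewer: `is_parquet` and `viewer_urls` are hypothetical names. The first checks that a file begins with the `PAR1` magic bytes that standard Parquet files start with, and the second rebuilds the two URLs from the sample output above for whatever port you pick. It assumes the URL pattern stays the same as in that output and that your `head` supports `-c`.

```shell
#!/bin/sh
# Hypothetical helpers; they are not shipped with the Deephaven Parquet viewer.

# Succeeds if the file exists and starts with the Parquet magic bytes "PAR1".
# Assumption: files written with an encrypted footer use a different magic
# and would be rejected by this simple check.
is_parquet() {
  [ -f "$1" ] && [ "$(head -c 4 "$1")" = "PAR1" ]
}

# Prints the table (iframe) URL and the IDE URL for a port (default 8080).
# Assumption: the URL pattern matches the sample output shown in this post.
viewer_urls() {
  port="${1:-8080}"
  printf 'http://localhost:%s/iframe/table/?name=parquet_table\n' "$port"
  printf 'http://localhost:%s/ide/\n' "$port"
}
```

For example, once the helpers are defined in your shell, `is_parquet taxi.parquet && ./deephaven-parquet-viewer.sh taxi.parquet 8080` only starts the viewer when the file looks like Parquet, and `viewer_urls 8080` prints the two addresses to paste into a browser.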
I love this little tool. I hope you find it handy.