Import and Export Data
Data I/O is mission-critical for any real-time data analysis platform. Deephaven supports a wide variety of data sources and formats, including CSV, Parquet, Kafka, and more. This document covers how to read and write each of those formats in Deephaven.
We will use this table in several of the following examples:
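For concreteness, here is a minimal sketch of such a table built with Deephaven's `TableTools`. The column names and values are illustrative placeholders, not data from a specific source:

```groovy
import static io.deephaven.engine.util.TableTools.*

// A small static table used as the source for the CSV, Parquet, and Kafka
// examples below. Column names and values are purely illustrative.
source = newTable(
    stringCol("Ticker", "AAPL", "GOOG", "MSFT"),
    intCol("Qty", 100, 200, 300),
    doubleCol("Price", 189.5, 141.2, 410.7)
)
```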
CSV
Deephaven can read and write CSV files to and from local and remote locations. This example writes a table to a local CSV file.
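A minimal sketch using `writeCsv` from `io.deephaven.csv.CsvTools`, assuming the `source` table above and a writable `/data` directory (the file path is a placeholder):

```groovy
import static io.deephaven.csv.CsvTools.writeCsv

// Write the source table to a CSV file on the local filesystem.
writeCsv(source, "/data/source.csv")
```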
We can show that the file is there by importing the CSV:
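A sketch using `readCsv`, assuming the same placeholder path used in the write example:

```groovy
import static io.deephaven.csv.CsvTools.readCsv

// Read the CSV back into a new in-memory table to confirm the write succeeded.
result = readCsv("/data/source.csv")
```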
Parquet
Apache Parquet is a columnar storage format that supports compression to store more data in less space. Deephaven supports reading and writing single, nested, and partitioned Parquet files. Parquet data can be stored locally or in S3. The example below writes a table to a local Parquet file.
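A sketch using `ParquetTools`, again assuming the `source` table above and a placeholder `/data` path:

```groovy
import io.deephaven.parquet.table.ParquetTools

// Write the source table to a single Parquet file on the local filesystem.
ParquetTools.writeTable(source, "/data/source.parquet")
```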
We can show that the file is there by reading it back in:
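A sketch that reads the same placeholder Parquet file back into a table:

```groovy
import io.deephaven.parquet.table.ParquetTools

// Read the Parquet file back into an in-memory table.
result = ParquetTools.readTable("/data/source.parquet")
```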
Kafka
Apache Kafka is a distributed event streaming platform that can be used to publish and subscribe to streams of records. Deephaven can consume and publish to Kafka streams. The code below consumes a stream.
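A minimal sketch using `KafkaTools.consumeToTable`, assuming a broker reachable at `redpanda:9092` and a topic named `orders` whose record values are plain strings. The broker address, topic name, and key/value specs are placeholder assumptions; the specs you need depend on how your topic is serialized:

```groovy
import io.deephaven.kafka.KafkaTools

kafkaProps = new Properties()
kafkaProps.put('bootstrap.servers', 'redpanda:9092')

// Consume the topic into an append-only table, ignoring record keys and
// reading each record value into a String column named Value.
result = KafkaTools.consumeToTable(
    kafkaProps,
    'orders',
    KafkaTools.ALL_PARTITIONS,
    KafkaTools.ALL_PARTITIONS_DONT_SEEK,
    KafkaTools.Consume.IGNORE,
    KafkaTools.Consume.simpleSpec('Value', String),
    KafkaTools.TableType.append()
)
```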
Similarly, this code publishes the data in a Deephaven table to a Kafka stream.
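A sketch using `KafkaTools.produceFromTable`, assuming the same placeholder broker and a topic named `time-topic`. Here a ticking time table stands in for whatever table you want to publish; the returned handle stops publication when invoked:

```groovy
import io.deephaven.kafka.KafkaTools

// A ticking table to publish; any Deephaven table works here.
pubSource = timeTable('PT1S').update('X = i')

kafkaProps = new Properties()
kafkaProps.put('bootstrap.servers', 'redpanda:9092')

// Publish the X column of each new row as the record value, with no key.
cancelPublish = KafkaTools.produceFromTable(
    pubSource,
    kafkaProps,
    'time-topic',
    KafkaTools.Produce.IGNORE,
    KafkaTools.Produce.simpleSpec('X'),
    false
)
```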
Function-generated tables
Function-generated tables are tables populated by a Groovy function. The function is re-evaluated when source tables change or at a regular interval. The following example regenerates the data in a table once per second.
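A minimal sketch using `FunctionGeneratedTableFactory` with a one-second refresh interval. The generator closure and its columns are illustrative; in practice the function would pull data from an external source:

```groovy
import io.deephaven.engine.table.impl.util.FunctionGeneratedTableFactory

// Produce five rows of fresh random data each time the function runs.
tableGenerator = { ->
    emptyTable(5).update('Timestamp = now()', 'X = randomDouble(0, 100)')
}

// Re-evaluate the generator once per second (interval given in milliseconds).
result = FunctionGeneratedTableFactory.create(tableGenerator, 1000)
```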
Function-generated tables, on their own, don't do any data I/O. However, Groovy functions evaluated at a regular interval to produce a ticking table are a powerful tool for ingesting data from external sources such as WebSockets, databases, and more.