writeTable

The writeTable method writes a table to a standard Parquet file.

Syntax

writeTable(sourceTable, destPath)
writeTable(sourceTable, destFile)
writeTable(sourceTable, destFile, definition)
writeTable(sourceTable, destFile, writeInstructions)
writeTable(sourceTable, destPath, definition, writeInstructions)
writeTable(sourceTable, destFile, definition, writeInstructions)

Parameters

Parameter	Type	Description
sourceTable	Table	The table to write to file.
destPath	String	Path name of the file where the table will be stored. The file name should end with the `.parquet` extension. If the path includes non-existing directories, they are created.
destFile	File	Destination file. Its path must end in `.parquet`. Any non-existing directories in the path are created. If there is an error, any intermediate directories created during the call are removed. Note: this cleanup makes the method unsafe for concurrent use.
definition	TableDefinition	Table definition to use (instead of the one implied by the table itself).
writeInstructions	ParquetInstructions	Instructions for customizations while writing. Valid values are: `ParquetTools.SNAPPY`: Aims for high speed and a reasonable amount of compression. Based on Google's Snappy compression format. `ParquetTools.UNCOMPRESSED`: The output is not compressed. `ParquetTools.LZ4_RAW`: Compression codec based on the LZ4 block format. Use this instead of `LZ4`. `ParquetTools.LZ4`: Deprecated. Compression codec loosely based on the LZ4 compression algorithm, but with an additional undocumented framing scheme. The framing is part of the original Hadoop compression library and was historically copied first in parquet-mr, then emulated with mixed results by parquet-cpp. Use `LZ4_RAW` instead. `ParquetTools.LZO`: Compression codec based on or interoperable with the LZO compression library. `ParquetTools.GZIP`: Compression codec based on the GZIP format (not the closely related "zlib" or "deflate" formats) defined by RFC 1952. `ParquetTools.ZSTD`: Compression codec with the highest compression ratio, based on the Zstandard format defined by RFC 8478. If writeInstructions is not specified, the codec defaults to `SNAPPY`.

Returns

A Parquet file at the specified path.

Examples

Note

All examples in this document write data to the /data directory in Deephaven. For more information on this directory and how it relates to your local file system, see Docker data volumes.

Single Parquet file

In this example, writeTable writes the source table to /data/output.parquet.

import io.deephaven.parquet.table.ParquetTools

source = newTable(
    stringCol("X", "A", "B", "B", "C", "B", "A", "B", "B", "C"),
    intCol("Y", 2, 4, 2, 1, 2, 3, 4, 2, 3),
    intCol("Z", 55, 76, 20, 4, 230, 50, 73, 137, 214),
)

ParquetTools.writeTable(source, "/data/output.parquet")
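
The destFile overload listed under Syntax can be used in the same way. The following is a minimal sketch, assuming a small illustrative table; the file name `output_file.parquet` and the variable name `dest` are hypothetical and not part of the original examples.

```groovy
import io.deephaven.parquet.table.ParquetTools

// Small illustrative table, shaped like the one in the example above.
source = newTable(
    stringCol("X", "A", "B", "B", "C"),
    intCol("Y", 2, 4, 2, 1),
)

// Pass a java.io.File instead of a String path.
// The file name is hypothetical; the path must end in ".parquet".
dest = new File("/data/output_file.parquet")
ParquetTools.writeTable(source, dest)
```

As the destFile parameter description notes, directories created for a failed write are removed again, so this form should not be called concurrently on overlapping paths.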

Compression codec

In this example, writeTable writes the source table to /data/output_GZIP.parquet with GZIP compression.

import io.deephaven.parquet.table.ParquetTools

source = newTable(
    stringCol("X", "A", "B", "B", "C", "B", "A", "B", "B", "C"),
    intCol("Y", 2, 4, 2, 1, 2, 3, 4, 2, 3),
    intCol("Z", 55, 76, 20, 4, 230, 50, 73, 137, 214),
)

ParquetTools.writeTable(source, "/data/output_GZIP.parquet", ParquetTools.GZIP)
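
To check the result, the file can be read back into a table. This is a minimal sketch, assuming the GZIP example above has already run and that `ParquetTools.readTable` is called with the same path string that was written; the variable name `result` is illustrative.

```groovy
import io.deephaven.parquet.table.ParquetTools

// Read the GZIP-compressed file written above back into a table.
// No codec argument is passed here: a Parquet file records its
// compression codec in its own metadata.
result = ParquetTools.readTable("/data/output_GZIP.parquet")
```

The `result` table should contain the same X, Y, and Z columns as `source`.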