Skip to main content

writeTable

The writeTable method will write a table to a standard Parquet file.

Syntax

writeTable(table, outputFile)
writeTable(table, outputFile, parquetInstructions)

Parameters

ParameterTypeDescription
tableTable

The table to write to file.

outputFileString

Path name of the file where the table will be stored. The file name should end with the .parquet extension. If the path includes non-existing directories, they are created.

parquetInstructions optionalString

Optional instructions for customizations while writing. Valid values are:

  • LZ4: Compression codec loosely based on the LZ4 compression algorithm, but with an additional undocumented framing scheme. The framing is part of the original Hadoop compression library and was historically copied first in parquet-mr, then emulated with mixed results by parquet-cpp.
  • LZO: Compression codec based on or interoperable with the LZO compression library.
  • GZIP: Compression codec based on the GZIP format (not the closely-related "zlib" or "deflate" formats) defined by RFC 1952.
  • ZSTD: Compression codec with the highest compression ratio based on the Zstandard format defined by RFC 8478.

Returns

A Parquet file located in the specified path.

Examples

note

All examples in this document write data to the /data directory in Deephaven. For more information on this directory and how it relates to your local file system, see Docker data volumes.

Single Parquet file

In this example, writeTable writes the source table to /data/output.parquet.

from deephaven.TableTools import newTable, intCol, stringCol
from deephaven.ParquetTools import writeTable

source = newTable(
stringCol("X", "A", "B", "B", "C", "B", "A", "B", "B", "C"),
intCol("Y",2, 4, 2, 1, 2, 3, 4, 2, 3),
intCol("Z", 55, 76, 20, 4, 230, 50, 73, 137, 214),
)

writeTable(source, "/data/output.parquet")

Compression codec

In this example, writeTable writes the source table /data/output_GZIP.parquet with GZIP compression.

from deephaven.TableTools import newTable, intCol, stringCol
from deephaven.ParquetTools import writeTable

source = newTable(
stringCol("X", "A", "B", "B", "C", "B", "A", "B", "B", "C"),
intCol("Y",2, 4, 2, 1, 2, 3, 4, 2, 3),
intCol("Z", 55, 76, 20, 4, 230, 50, 73, 137, 214),
)

writeTable(source, "/data/output_GZIP.parquet", "GZIP")