read_csv

The read_csv method reads a CSV file into an in-memory table.

Syntax

from deephaven import read_csv

read_csv(
    path: str,
    header: Dict[str, dht.DType] = None,
    headless: bool = False,
    skip_rows: int = 0,
    num_rows: int = MAX_LONG,
    ignore_empty_lines: bool = False,
    allow_missing_columns: bool = False,
    ignore_excess_columns: bool = False,
    delimiter: str = ",",
    quote: str = '"',
    ignore_surrounding_spaces: bool = True,
    trim: bool = False,
) -> Table:

Parameters

Parameter (Type): Description

path (String)

The file to load into a table.

header (Dict, optional)

A dictionary that maps each column name to its data type: Dict[str, DType]. Default is None.

headless (boolean, optional)
  • False (default) - The first row contains header information.
  • True - The file has no header row, so the first row is included in your dataset.

skip_rows (int, optional)

The number of rows to skip before processing data. Default is 0.

num_rows (int, optional)

The maximum number of rows to process. Default is MAX_LONG, which reads all rows in the file.

ignore_empty_lines (boolean, optional)
  • False (default) - Empty lines are treated as errors.
  • True - Empty lines in the CSV file are ignored.

allow_missing_columns (boolean, optional)
  • False (default) - Missing columns are treated as errors.
  • True - Missing columns in rows are treated as empty strings.

ignore_excess_columns (boolean, optional)
  • False (default) - Extra columns in rows are treated as errors.
  • True - Excess columns in rows are ignored.

delimiter (char, optional)

The delimiter used by the file.

  • Any single non-newline character can be specified (e.g., ",", ";", ":", "\", "|").
  • The default is ",".

quote (char, optional)

The character surrounding a string value. Default is " (double quote).

ignore_surrounding_spaces (boolean, optional)

Trim leading and trailing blanks from non-quoted values. Default is True.

trim (boolean, optional)

Trim leading and trailing blanks from inside quoted values. Default is False.

charset (String, optional)

The character set. Default is utf-8.

csvSpecs (CsvSpecs, optional)

Specifications for how to load the CSV file.
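
The header and headless parameters are often used together when a file has no header row. The following is a minimal sketch rather than a definitive recipe: it assumes that deephaven.dtypes exposes string, int32, and double and that these are accepted as values in the header dictionary. The file name, column names, and helper functions are hypothetical.

```python
import os
import tempfile


def write_headless_sample(directory: str) -> str:
    """Write a small CSV file with no header row (hypothetical sample data)."""
    path = os.path.join(directory, "headless.csv")
    with open(path, "w", newline="") as f:
        f.write("A,2,55.5\nB,4,76.0\nC,1,20.25\n")
    return path


def load_with_header(path: str):
    """Read a headerless file, supplying column names and types explicitly."""
    # Imported here because these modules need a running Deephaven session.
    from deephaven import read_csv
    from deephaven import dtypes as dht  # assumed type names: string, int32, double

    return read_csv(
        path,
        header={"X": dht.string, "Y": dht.int32, "Z": dht.double},
        headless=True,  # the first row is data, not column names
    )


sample_path = write_headless_sample(tempfile.mkdtemp())
# Inside a Deephaven session:
# result = load_with_header(sample_path)
```

The file-writing step uses only the standard library, so it can be run anywhere; the read_csv call is deferred to load_with_header so that it only executes where Deephaven is available.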

note

Only one format parameter can be used at a time.
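
The three tolerance flags (ignore_empty_lines, allow_missing_columns, ignore_excess_columns) matter for files whose rows are not uniform. The sketch below writes a deliberately ragged file and uses Python's standard csv module, not Deephaven, to show the field count of each line. The sample data and helper names are hypothetical, and the read_csv call is expected to behave as the parameter descriptions above state; it has not been verified here.

```python
import csv
import os
import tempfile

# Header, a complete row, an empty line, a short row, and a long row.
RAGGED_CSV = "X,Y,Z\nA,1,10\n\nB,2\nC,3,30,extra\n"


def write_ragged_sample(directory: str) -> str:
    """Write the ragged sample file and return its path."""
    path = os.path.join(directory, "ragged.csv")
    with open(path, "w", newline="") as f:
        f.write(RAGGED_CSV)
    return path


def field_counts(path: str) -> list:
    """Count fields per line with the standard library (an empty line gives 0)."""
    with open(path, newline="") as f:
        return [len(row) for row in csv.reader(f)]


def load_tolerant(path: str):
    """Read the ragged file with all three tolerance flags enabled."""
    # Imported here because it needs a running Deephaven session.
    from deephaven import read_csv

    return read_csv(
        path,
        ignore_empty_lines=True,     # skip the blank third line
        allow_missing_columns=True,  # accept the short row "B,2"
        ignore_excess_columns=True,  # drop "extra" from the long row
    )


ragged_path = write_ragged_sample(tempfile.mkdtemp())
print(field_counts(ragged_path))  # [3, 3, 0, 2, 4]
```

With all three flags left at their default of False, each of the last three lines would be treated as an error according to the descriptions above.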

Returns

A new in-memory table from a CSV file.

Examples

note

In this guide, we read data from locations relative to the base of the Docker container. See Docker data volumes to learn more about the relation between locations in the container and the local file system.

In the following example, write_csv writes the source table to /data/file.csv, and read_csv loads the file into a Deephaven table.

from deephaven import new_table
from deephaven.column import string_col, int_col
from deephaven import read_csv, write_csv
from deephaven.constants import NULL_INT

source = new_table([
    string_col("X", ["A", "B", None, "C", "B", "A", "B", "B", "C"]),
    int_col("Y", [2, 4, 2, 1, 2, 3, 4, NULL_INT, 3]),
    int_col("Z", [55, 76, 20, NULL_INT, 230, 50, 73, 137, 214]),
])

write_csv(source, "/data/file.csv")

result = read_csv("/data/file.csv")
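
The skip_rows and num_rows parameters can be combined to read a window of a larger file. The sketch below assumes that skip_rows counts data rows after the header row; that reading of the parameter is an assumption, so the preview_chunk helper is only a standard-library stand-in showing which rows such a window would cover. The file name and all helper names are hypothetical.

```python
import csv
import itertools
import os
import tempfile


def write_numbered_sample(directory: str, n_rows: int = 6) -> str:
    """Write a header row followed by n_rows numbered data rows."""
    path = os.path.join(directory, "numbered.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["Id", "Value"])
        for i in range(n_rows):
            writer.writerow([i, i * 10])
    return path


def preview_chunk(path: str, skip_rows: int, num_rows: int) -> list:
    """Standard-library stand-in: data rows covered by the assumed window."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # consume the header row first
        return list(itertools.islice(reader, skip_rows, skip_rows + num_rows))


def load_chunk(path: str, skip_rows: int, num_rows: int):
    """Read one window of the file into a Deephaven table."""
    # Imported here because it needs a running Deephaven session.
    from deephaven import read_csv

    return read_csv(path, skip_rows=skip_rows, num_rows=num_rows)


numbered_path = write_numbered_sample(tempfile.mkdtemp())
print(preview_chunk(numbered_path, skip_rows=2, num_rows=3))
# [['2', '20'], ['3', '30'], ['4', '40']]
```

Calling load_chunk with increasing skip_rows values and a fixed num_rows is one way to process a large file in pieces.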

note

The following examples use data from Deephaven's examples repository. Follow the instructions in the README to download the data to the proper location for use with Deephaven.

In the following example, read_csv is used to load the DeNiro CSV file into a Deephaven table.

from deephaven import read_csv
result = read_csv("https://media.githubusercontent.com/media/deephaven/examples/main/DeNiro/csv/deniro.csv")

Any non-newline character can be used as a delimiter. The pipe and tab characters (| and \t) are common. In the following example, the delimiter parameter is used to read pipe- and tab-delimited files into memory.

from deephaven import read_csv

result_psv = read_csv("https://raw.githubusercontent.com/deephaven/examples/main/DeNiro/csv/deniro.psv", delimiter="|")
result_tsv = read_csv("https://raw.githubusercontent.com/deephaven/examples/main/DeNiro/csv/deniro.tsv", delimiter="\t")
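
The delimiter parameter can also be combined with quote, ignore_surrounding_spaces, and trim. The following sketch uses a hypothetical local file with semicolon delimiters and single-quote quoting. The standard csv module is used only to confirm that the file splits into two columns; it does not strip whitespace. Going by the parameter descriptions above, the Deephaven call would be expected to yield "hello; world" and "plain" in the Note column, though that outcome is not verified here.

```python
import csv
import os
import tempfile

# The quoted value contains the delimiter; both values carry extra spaces.
SEMICOLON_CSV = "Name;Note\nAnn;' hello; world '\nBob;  plain  \n"


def write_semicolon_sample(directory: str) -> str:
    """Write the semicolon-delimited sample file and return its path."""
    path = os.path.join(directory, "notes.csv")
    with open(path, "w", newline="") as f:
        f.write(SEMICOLON_CSV)
    return path


def stdlib_rows(path: str) -> list:
    """Parse with the standard library to confirm the column split."""
    with open(path, newline="") as f:
        return list(csv.reader(f, delimiter=";", quotechar="'"))


def load_semicolon(path: str):
    """Read the file with a custom delimiter and quote character."""
    # Imported here because it needs a running Deephaven session.
    from deephaven import read_csv

    return read_csv(
        path,
        delimiter=";",
        quote="'",
        ignore_surrounding_spaces=True,  # strip blanks around unquoted values
        trim=True,                       # strip blanks inside quoted values
    )


notes_path = write_semicolon_sample(tempfile.mkdtemp())
print(stdlib_rows(notes_path))
# [['Name', 'Note'], ['Ann', ' hello; world '], ['Bob', '  plain  ']]
```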