---
title: Schema overview
---

All Deephaven tables stored in the database (e.g., those that can be read with `db.live_table` in a query) have a schema that defines the table's namespace, name, column names, and data types. In addition to specifying the structure of the data, a schema can also include:

- Directives controlling how data is imported and stored, such as [encoding formats for String columns](../tables-and-schemas.md#columns) and [codecs for custom serialization of complex data types](../tables-and-schemas.md#column-codecs).
- [Metadata for data ingestion](./schema-overview.md#data-ingestion), such as custom DateTime converters.
- [Data validation rules](../../data-guide/validation.md#schema-based-validation) for ensuring data quality during a merge.

## Columns

Deephaven schemas define the names and data types for each column in a table. Below are some columns from the `DbInternal.AuditEventLog` table:

```xml
<Column name="Date" dataType="String" columnType="Partitioning" />
<Column name="Timestamp" dataType="DateTime" />
<Column name="ClientHost" dataType="String" />
<Column name="ClientPort" dataType="int" />
<Column name="Details" dataType="String" symbolTable="None" encoding="UTF_8" />
```

The `Column` element can also specify how the data is stored on disk. For example, the `DbInternal.AuditEventLog` table is [partitioned](../tables-and-schemas.md#partitions) on `Date`. See the full list of available column attributes [in the table and schemas concept guide](../tables-and-schemas.md#columns).
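For context, `Column` elements live inside a `Table` element that supplies the namespace, table name, and storage type. The following is a minimal sketch of a hypothetical schema; the table and column names are illustrative, and the exact set of `Table` attributes for your use case is covered in the [tables and schemas concept guide](../tables-and-schemas.md):

```xml
<Table name="Orders" namespace="ExampleNamespace" storageType="NestedPartitionedOnDisk">
    <!-- The partitioning column determines how data is split on disk -->
    <Column name="Date" dataType="String" columnType="Partitioning" />
    <Column name="Timestamp" dataType="DateTime" />
    <Column name="Symbol" dataType="String" />
    <Column name="Price" dataType="double" />
</Table>
```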

### Data types

Data types can generally be any Java class, such as Java primitive types, arrays of primitive types, and Strings. Column codecs provide custom serialization logic for complex data types. See [dataType](../tables-and-schemas.md#columns) for more information.
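As a sketch of how a codec might be attached to a column, the fragment below follows the column-codec convention described in the tables and schemas guide; the data type, codec class, and argument string are all illustrative placeholders, not real classes:

```xml
<!-- Store a custom Java type using a serialization codec.
     The class names below are hypothetical examples. -->
<Column name="Quote" dataType="com.example.Quote"
        objectCodec="com.example.QuoteCodec"
        objectCodecArguments="bigEndian" />
```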

## Historical data

There are two main categories of data storage in Deephaven: [intraday and historical](../../legacy/importing-data/introduction.md#intraday-and-historical-data). Some historical storage options can be configured in the schema.

### Merge attributes

[Intraday data can be merged](../merging.md) to historical storage in Deephaven or [Apache Parquet](https://parquet.apache.org/) formats. When merging data to Parquet, a default compression codec can be chosen by adding a [`MergeAttributes`](../tables-and-schemas.md#merge-attributes) element with an appropriate Parquet-supported codec.
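As an illustrative sketch, a `MergeAttributes` element selecting a Parquet compression codec might look like the following; check the [merge attributes reference](../tables-and-schemas.md#merge-attributes) for the exact attributes and supported codec names:

```xml
<!-- Merge this table's intraday data to Parquet,
     compressed with the SNAPPY codec -->
<MergeAttributes format="Parquet" codec="SNAPPY" />
```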

### Extended layouts

[Extended layouts](../tables-and-schemas.md#extended-layouts) are available for users with complex Parquet layouts that are created by other tools such as [Apache Hadoop](https://hadoop.apache.org). Extended layouts also allow you to use multiple partitioning columns.

## Data ingestion

Schemas can be extended with metadata to control how data is ingested into Deephaven. This includes:

- [DateTime converters for parsing date strings](../../data-guide/tables-and-schemas.md#custom-datetime-converters)
- Custom field writers for [importing data from CSV, JSON, JDBC, and XML files](../../data-guide/tables-and-schemas.md#importsource)
- [Data validation](../../data-guide/validation.md) rules for ensuring data quality during a merge
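The fragment below sketches what such ingestion metadata can look like for a CSV import: an `ImportSource` element holding an `ImportColumn` that maps a differently named source field and converts it with a formula. The source name, field names, and formula are illustrative assumptions; see the [ImportSource reference](../../data-guide/tables-and-schemas.md#importsource) for the actual attributes:

```xml
<ImportSource name="IrisCSV" type="CSV">
    <!-- Map the source field "event_time" to the Timestamp column,
         converting the string to a DateTime (formula is illustrative) -->
    <ImportColumn name="Timestamp" sourceName="event_time" sourceType="String"
                  formula="DBTimeUtils.convertDateTime(event_time)" />
</ImportSource>
```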

## Managing schema files

Schemas are stored in etcd and can be imported to or exported from Deephaven using [`dhconfig schemas`](../../sys-admin/configuration/dhconfig/schemas.md). Special care must be taken when updating a schema during the ingestion window. See [Deploying Schemas during Intraday Data Ingestion](./schema-management.md#deploy-schemas-during-intraday-data-ingestion).

### Schema inference

Deephaven can make writing new schemas easier by automatically inferring them from the data source. [Schema Inference](./schema-inference.md) is available for the following data sources:

- [CSV](../batch-data/csv.md)
- [JSON](../batch-data/json.md)
- [XML](../batch-data/xml.md)
- [JDBC](../batch-data/jdbc.md)
- [Avro](../../legacy/importing-data/kafka.md#discovering-a-deephaven-schema-from-an-avro-schema)
- [Protobuf](../../legacy/importing-data/kafka.md#discovering-a-deephaven-schema-from-a-protocol-buffer-descriptor)

### CopyTable schemas

A single table layout may be shared by multiple system tables. When this is the case, it is not necessary to replicate the entire source schema definition for each new table; a `CopyTable` schema references the source schema instead. See [CopyTable](../tables-and-schemas.md#copytable-schemas) for more information.
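As a sketch, a `CopyTable` schema referencing an existing source table might look like the following; the namespace and table names are illustrative, and the exact attribute set is documented in the [CopyTable reference](../tables-and-schemas.md#copytable-schemas):

```xml
<!-- OrdersCopy reuses the layout of ExampleNamespace.Orders
     without duplicating its column definitions -->
<CopyTable namespace="ExampleNamespace" name="OrdersCopy"
           sourceNamespace="ExampleNamespace" sourceName="Orders" />
```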

## Related documentation

- [Data validation](../../data-guide/validation.md#schema-based-validation)
- [Merging data](../merging.md)
- [Schema Inference](./schema-inference.md)
- [Tables and schemas](../tables-and-schemas.md)
- [`dhconfig schemas`](../../sys-admin/configuration/dhconfig/schemas.md)
