Validate and transform data


All new data added to a Deephaven installation is first stored in the Intraday database. Typically, this data is periodically reorganized and merged into the Historical database, where it is stored for long-term use. The easiest way to accomplish a merge is via the "Data Merge" persistent query type. Merge queries take advantage of the same scheduling, execution history, and dependency chaining as other persistent queries.

By itself, the data merge step only reorganizes the data and copies it from intraday to historical. It does not validate the merged data, and it does not remove the intraday version of the partition. Validation of merged data, which may optionally include removal of the source intraday data, can be accomplished via a "Data Validation" persistent query that depends on the success of a merge query. As with other import-related tasks, validation may also be done via the command line or manual scripting.

A typical data "lifecycle" consists of some form of data ingestion to intraday, followed by merge and then validation. After successful data validation, the intraday data is deleted, including the directory that contained the intraday partition.
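The ordering of this lifecycle can be illustrated with a minimal, self-contained sketch. This is plain Python operating on temporary directories; it does not use the Deephaven API, and the names `merge_partition`, `validate_partition`, `run_lifecycle`, and the `data.csv` file layout are all hypothetical, chosen only for illustration. The point it shows is the dependency chain: the merge leaves the intraday copy untouched, and the intraday directory is removed only after validation succeeds.

```python
import shutil
import tempfile
from pathlib import Path

# NOTE: illustrative only. None of these functions are Deephaven APIs;
# the directory layout and "data.csv" file name are assumptions.


def merge_partition(intraday_dir: Path, historical_dir: Path) -> None:
    """Reorganize (here: sort) intraday rows and copy them to historical.

    Like the merge step described above, this does not validate anything
    and does not touch the intraday copy.
    """
    historical_dir.mkdir(parents=True, exist_ok=True)
    rows = (intraday_dir / "data.csv").read_text().splitlines()
    (historical_dir / "data.csv").write_text("\n".join(sorted(rows)) + "\n")


def validate_partition(intraday_dir: Path, historical_dir: Path) -> bool:
    """Toy validation: the merged copy must contain the same rows as the source."""
    merged = historical_dir / "data.csv"
    if not merged.exists():
        return False
    source_rows = (intraday_dir / "data.csv").read_text().splitlines()
    merged_rows = merged.read_text().splitlines()
    return sorted(source_rows) == sorted(merged_rows)


def run_lifecycle(intraday_dir: Path, historical_dir: Path,
                  delete_intraday: bool = True) -> bool:
    """Run merge, then validation, then (optionally) intraday cleanup."""
    merge_partition(intraday_dir, historical_dir)
    if not validate_partition(intraday_dir, historical_dir):
        # Validation failed: keep the intraday data so nothing is lost.
        return False
    if delete_intraday:
        # Remove the intraday data, including its partition directory.
        shutil.rmtree(intraday_dir)
    return True
```

To try the sketch, create a directory containing a `data.csv` file, pass it as `intraday_dir` together with a target `historical_dir`, and check that the intraday directory is gone only when `run_lifecycle` returns `True`. In a real installation, the equivalent chaining is configured through persistent query dependencies rather than written by hand.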

Note

See also: Data Validation and Merging Data