Deephaven Data Lifecycle

Deephaven's ability to scale to large data sets is driven largely by its data lifecycle. Deephaven is designed to separate write-intensive applications (db_dis, importers) from read/compute-intensive applications (db_query_server, db_query_workers, etc.).

The diagram below shows a generalized version of the processes responsible for handling data within the Deephaven engine. An external data source can be imported as a stream, by generating binary logs that feed the Data Import Service (db_dis) process, or as a batch, by manually running one of Deephaven's many importers. Once the data is in the system, end users can query it, whether streamed or imported, via the db_query_server and its workers.

[Diagram: Deephaven data ingestion and query processes]

Two types of data

Deephaven views data as one of two types: intraday (near real-time) or historical. Each data type is stored in a different location in the database filesystem.

Intraday data

  • Intraday data is typically stored in /db/Intraday/<databaseNamespace>/<tableName>. When deploying servers, it is advised that this storage be on low-latency, high-speed disks attached locally or via SAN. All reads and writes of intraday data go through this mount point. Depending on data size and speed requirements, one or more mount points can be used at the /db/Intraday, /db/Intraday/<databaseNamespace>, or /db/Intraday/<databaseNamespace>/<tableName> level (see the path sketch after this list).
  • The db_dis service writes data to and reads data from these directories.
  • If configured to run, the db_ltds service reads data from these directories.
  • If an administrator doesn't create mount points for new namespaces and/or tables before using them, Deephaven will automatically generate the required subdirectories when data is first written to the new tables.
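As a concrete illustration of this layout, the following minimal Python sketch builds the conventional intraday path for a table and reports which mount point backs it. The namespace and table names are hypothetical, and the helper functions are illustrative only; they are not part of Deephaven.

    import os

    DB_INTRADAY = "/db/Intraday"  # conventional intraday root

    def intraday_table_path(namespace: str, table: str) -> str:
        # Conventional location: /db/Intraday/<databaseNamespace>/<tableName>
        return os.path.join(DB_INTRADAY, namespace, table)

    def backing_mount(path: str) -> str:
        # Walk upward until a filesystem boundary is crossed, revealing
        # whether the table has its own mount point or inherits one from
        # the namespace or /db/Intraday level.
        path = os.path.abspath(path)
        while not os.path.ismount(path):
            parent = os.path.dirname(path)
            if parent == path:
                break
            path = parent
        return path

    p = intraday_table_path("MarketData", "Quotes")  # hypothetical names
    print(p, "is backed by mount point", backing_mount(p))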

Historical data

  • Historical data is stored in /db/Systems/<databaseNamespace>.
  • Intraday data is typically merged into historical data by a scheduled Merge PQ. It may also be merged manually, or via cron, using a Merge Script.
  • An attempted merge will fail if the required subdirectories don't exist.
  • Each historical database namespace directory contains two directories that the administrator must configure:
    • WritablePartitions is used for all writes to historical data.
    • Partitions is used for all reads from historical data.
    • The (historical) <databaseNamespace> is divided into a pair of directories, Partitions and WritablePartitions, whose subdirectories contain the data. Each of these subdirectories is either a mounted shared volume or a link to a mounted shared volume. Partitions should contain a superset of the directories in WritablePartitions. It is recommended that each <databaseNamespace> be spread across multiple shared volumes to increase I/O throughput to the data.
    • Initially, when historical partitions are created for a namespace, the WritablePartitions and Partitions subdirectories usually point to identical storage locations. For instance, with six partitions named "0" through "5", the WritablePartitions directory will contain six links named "0" through "5" that point to the corresponding Partitions directories. As the storage devices fill, more space is needed. To accommodate this, new directories (e.g., "6" through "11") can be created within Partitions, linking to new storage. The WritablePartitions links are then updated to reflect these new directories: the old links in WritablePartitions are deleted, and new links are created with the same names as the new Partitions directories. Consequently, the previously written historical data becomes read-only, accessible via the Partitions directory, while subsequent merges write data to the newly allocated storage through the WritablePartitions directory (see the sketch after this list).
  • The volumes mounted under WritablePartitions and Partitions may be mounted on all Query and Merge servers. However, since they are divided by read and write function, it is possible to have a Query Server with only the read partitions (Partitions) mounted, or a Merge Server with only the WritablePartitions mounted. Filesystem permissions can be controlled in a similar manner: the Partitions volumes only need to be mounted with read-only access. A server that only performs queries needs nothing under WritablePartitions and does not need write access to Partitions.
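The partition rotation described above can be expressed as a short administrative script. Below is a minimal Python sketch, assuming a hypothetical MarketData namespace whose shared volumes are mounted at /mnt/histdata0 through /mnt/histdata11; the paths and helper names are illustrative and are not a Deephaven tool.

    import os

    NAMESPACE_ROOT = "/db/Systems/MarketData"   # hypothetical namespace
    PARTITIONS = os.path.join(NAMESPACE_ROOT, "Partitions")
    WRITABLE = os.path.join(NAMESPACE_ROOT, "WritablePartitions")

    def add_partitions(volumes, start_index):
        # Link each new shared volume into Partitions as a numbered
        # partition directory.
        for i, vol in enumerate(volumes, start=start_index):
            os.symlink(vol, os.path.join(PARTITIONS, str(i)))

    def rotate_writable(new_indices):
        # Point WritablePartitions at the newest Partitions entries:
        # delete the old links, then recreate links with the same names
        # as the new Partitions directories. Older partitions remain
        # readable via Partitions but receive no further merges.
        for name in os.listdir(WRITABLE):
            os.unlink(os.path.join(WRITABLE, name))
        for i in new_indices:
            os.symlink(os.path.join(PARTITIONS, str(i)),
                       os.path.join(WRITABLE, str(i)))

    os.makedirs(PARTITIONS, exist_ok=True)
    os.makedirs(WRITABLE, exist_ok=True)

    # Initial setup: partitions "0" through "5", all writable.
    add_partitions([f"/mnt/histdata{i}" for i in range(6)], start_index=0)
    rotate_writable(range(6))

    # Later, when the original volumes fill: add "6" through "11" on new
    # storage and shift all merges onto them.
    add_partitions([f"/mnt/histdata{i}" for i in range(6, 12)], start_index=6)
    rotate_writable(range(6, 12))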

A large historical data installation will look like this:

[Diagram: example of a large historical data installation]

Data lifecycle summary

  • Intraday disk volumes (or subdirectory partitions thereof) should be provided for each database namespace via local disk or SAN and be capable of handling the write and read requirements for the data set.
  • Intraday data is merged into historical data by a configured Merge process.
  • Once merged into historical data, intraday partitions may be removed from the intraday disk using a Data Validation PQ with the Delete intraday data flag set. It is also possible to remove intraday data with the Data control tool. Manually removing data from the filesystem is not recommended, as doing so may leave the db_dis and/or db_tdcp services in an inconsistent state.
  • Historical shared (NFS) volumes (or subdirectory partitions thereof) should be provided for each database namespace via a shared filesystem that is mounted under /db/Systems/<databaseNamespace>/WritablePartitions and /db/Systems/<databaseNamespace>/Partitions on appropriate servers. Historical data for each database namespace uses WritablePartitions to merge new data and Partitions to read data.
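To tie the summary together, a simple consistency check over one namespace's historical directories might look like the following sketch. It assumes the layout described above; the function and namespace names are hypothetical, not part of Deephaven.

    import os

    def check_namespace(ns_root):
        # Every WritablePartitions entry must also exist under Partitions
        # and resolve to the same storage, because merges write through
        # one tree while queries read through the other.
        parts = os.path.join(ns_root, "Partitions")
        writable = os.path.join(ns_root, "WritablePartitions")
        part_names = set(os.listdir(parts))
        for name in os.listdir(writable):
            assert name in part_names, f"{name} is writable but not readable"
            same = (os.path.realpath(os.path.join(writable, name))
                    == os.path.realpath(os.path.join(parts, name)))
            assert same, f"{name} resolves to different storage in each tree"

    check_namespace("/db/Systems/MarketData")  # hypothetical namespace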