Use the data control tool from a Legacy Groovy worker

Warning

Legacy documentation: This documentation applies to Legacy Deephaven Enterprise only and does not apply to Core+.

Tip

For Core+ workers (Groovy or Python), see Data control from scripts.

The data control tool (dhctl) is a command-line tool for managing data in Deephaven. The same functionality is available from any Legacy worker with sufficient permissions.

Truncate and delete intraday partitions

Truncate and delete operations are configured through an Options object created with Options.builder(). Build the Options and pass it to IntradayControl.truncateIntradayPartition(options) or IntradayControl.deleteIntradayPartition(options).

Tip

Always check the results of these commands. An operation can succeed on some Data Import Servers and fail on others.

Options Builder

The options builder can be useful when programmatically constructing complex truncate or delete commands.

Modify the builder with the desired options, much like the dhctl command line options.

Set the partition to be truncated or deleted:

For example:

Dry run options

Change the dry run option:

Add a DIS

Add a DIS to the include or exclude list:

Build options

Build the options as configured in the builder:

Pass the options to the Intraday Control tool and check the results:

You can display the contents of the options:

Check the results

Truncate and delete commands return DisCommandResult objects containing detailed information about the operation. Because intraday operations can be complex, the result object is necessarily complex as well. It reports results at three levels:

  • An overall result that indicates success or failure of the operation as a whole.
  • A map of results for each DIS involved in the operation, each with its own overall result code.
  • For each DIS, a collection of results for the individual locations that were processed.

The code examples below illustrate how to check the results at various levels of detail. These are intended as examples; adjust the code to suit your needs.
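As a rough illustration of checking results at each level, the following plain Java sketch walks a three-level result and collects every failure. The types Status, LocationResult, DisResult, and CommandResult are hypothetical stand-ins invented for this sketch; they are not the DisCommandResult API, whose actual accessors should be taken from the Deephaven Javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only. These are NOT the Deephaven
// DisCommandResult classes; they only mirror the levels described above.
final class ResultWalkSketch {
    enum Status { SUCCESS, FAILURE }

    // Result for a single location processed on one DIS.
    static final class LocationResult {
        final String location;
        final Status status;
        LocationResult(String location, Status status) {
            this.location = location;
            this.status = status;
        }
    }

    // Result for one DIS: an overall code plus its per-location results.
    static final class DisResult {
        final Status status;
        final List<LocationResult> locations;
        DisResult(Status status, List<LocationResult> locations) {
            this.status = status;
            this.locations = locations;
        }
    }

    // Result for the whole command: an overall code plus a map keyed by DIS name.
    static final class CommandResult {
        final Status status;
        final Map<String, DisResult> byDis;
        CommandResult(Status status, Map<String, DisResult> byDis) {
            this.status = status;
            this.byDis = byDis;
        }
    }

    // Returns one line per problem, from the coarsest level to the finest.
    // An empty list means every level reported SUCCESS.
    static List<String> describeFailures(CommandResult result) {
        List<String> problems = new ArrayList<>();
        if (result.status != Status.SUCCESS) {
            problems.add("overall: " + result.status);
        }
        for (Map.Entry<String, DisResult> entry : result.byDis.entrySet()) {
            DisResult dis = entry.getValue();
            if (dis.status != Status.SUCCESS) {
                problems.add("DIS " + entry.getKey() + ": " + dis.status);
            }
            for (LocationResult loc : dis.locations) {
                if (loc.status != Status.SUCCESS) {
                    problems.add("DIS " + entry.getKey() + ", location "
                            + loc.location + ": " + loc.status);
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up sample data: one DIS succeeds, a second one fails.
        Map<String, DisResult> byDis = new LinkedHashMap<>();
        byDis.put("dis-primary", new DisResult(Status.SUCCESS,
                Arrays.asList(new LocationResult("location-1", Status.SUCCESS))));
        byDis.put("dis-backup", new DisResult(Status.FAILURE,
                Arrays.asList(new LocationResult("location-1", Status.FAILURE))));
        CommandResult result = new CommandResult(Status.FAILURE, byDis);
        for (String problem : describeFailures(result)) {
            System.out.println(problem);
        }
    }
}
```

The sketch inspects every DIS and every location even when the overall code reports success, so a partial failure on a single server is not overlooked.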

This example executes a truncate dry run, and then executes the truncate only if none of the locations to be truncated are still in use:
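The dry-run gating described above can be sketched as follows, in plain Java. TruncateCommand and LocationReport are hypothetical stand-ins for this sketch and are not Deephaven types; in a real script, the command would wrap a call such as IntradayControl.truncateIntradayPartition(options) with the dry run option set accordingly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "dry run first, then execute only if nothing is in use".
// None of these types belong to the Deephaven API.
final class DryRunGateSketch {
    // Hypothetical per-location report returned by a run.
    static final class LocationReport {
        final String location;
        final boolean inUse;
        LocationReport(String location, boolean inUse) {
            this.location = location;
            this.inUse = inUse;
        }
    }

    // Hypothetical wrapper around the real truncate call.
    interface TruncateCommand {
        List<LocationReport> run(boolean dryRun);
    }

    // Runs the command as a dry run, then runs it for real only if no location
    // is reported as in use. Returns true if the real run was started; the
    // caller still has to check the results of that run.
    static boolean truncateIfIdle(TruncateCommand command) {
        List<LocationReport> preview = command.run(true);
        for (LocationReport report : preview) {
            if (report.inUse) {
                return false; // a location is still in use: leave the data untouched
            }
        }
        command.run(false);
        return true;
    }

    public static void main(String[] args) {
        // Made-up command that reports a single idle location.
        TruncateCommand idle = dryRun ->
                Arrays.asList(new LocationReport("location-1", false));
        System.out.println("real run started: " + truncateIfIdle(idle));
    }
}
```

Keeping the gate in one small method makes it harder to run the real truncate by accident before the dry run has been inspected.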

This example executes a truncate command and then, if the truncate succeeds, deletes the partition:

Rescan tables

This command instructs the DIS handling table DbInternal.ProcessEventLog to look for new data:

Caution

Always specify a --table-name when running rescan. Rescans should target specific tables for specific purposes. Omitting this option instructs DISes to rescan all intraday tables, which is disruptive and rarely the intended action. This is especially important in scheduled or scripted scenarios — a blanket rescan can cause unnecessary load and delays.

The following example rescans all tables on all DISes. This is not recommended.

Caveats

  • This method makes a best-effort attempt to delete everything on all appropriate Data Import Servers. This cannot be atomic, so the operation might have only partial success. Make sure you check all the results.
  • The truncated partitions are marked as permanently truncated, and further ingestion of data will be disallowed. This is to prevent confusion if loggers produce new data for the partition, or if tailers have not finished all existing data files.
  • Before logging new data for the truncated partitions, remove any existing data files (bin files), and then delete the partitions with dhctl intraday delete ....
  • It is possible to delete data on one Data Import Server and leave it on another (e.g., a backup). Be extremely careful with this, as it can create confusion.