Update data routing configuration syntax

Newer versions of Deephaven include several data routing configuration syntax changes that make configuration easier. These upgrades do not generally require changes to existing configurations. However, taking advantage of new features might require updates to existing data routing configuration.

Endpoints

The current format gathers host, port, and related fields under an endpoint tag. The legacy format is incompatible with claims and several other tags.

The examples in the next section change the legacy format to use endpoint, and also enable dynamic endpoints for the ingesters.
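As a minimal sketch of this change for the ingesters alone, the fragment below reuses the Ingester1 and Ingester2 entries from the examples later in this document. The legacy form is shown commented out for comparison. The port values and the registry value are copied from those examples, and the *dh-import anchor is assumed to be defined elsewhere in the file, so treat this as an illustration rather than a complete configuration.

```yaml
dataImportServers:
  # Legacy format: connection fields sit directly under the DIS name.
  # Ingester1:
  #   host: *dh-import
  #   tailerPort: 22222
  #   tableDataPort: 22223
  #   storage: Ingester1

  # Current format, static endpoint: the same fields move under "endpoint".
  Ingester1:
    endpoint:
      host: *dh-import
      tailerPort: 22222
      tableDataPort: 22223
    storage: Ingester1

  # Current format, dynamic endpoint: no host or ports are listed here;
  # the endpoint refers to the service registry instead.
  Ingester2:
    endpoint:
      serviceRegistry: registry
    storage: Ingester2
```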

Change exclusion filters to claims

Claims are global filters that make it easy to configure data routing when a specific DIS is handling a table. Before claims were available, filters were used to configure these tables. Each new filter claiming a table required adjustments to other filters to exclude those same tables. Deephaven recommends making this adjustment as soon as routing changes are needed.

The examples below show a configuration with two custom ingesters, each handling all tables in a single namespace. The first example (using claims) shows the current syntax. The second example (using filters) shows how quickly the complexity grows as additional ingesters are added.

Using claims
  dataImportServers:
    # The primary data import server
    db_dis:
      # import the default values
      <<: *DIS-default
      endpoint:
        host: *dh-import   # reference the address defined above for "dh-import"
        tableDataPort: *default-tableDataPort
        tailerPort: *default-tailerPort
      userIntradayDirectoryName: "IntradayUser"
      webServerParameters:
        enabled: true
        port: 8086
        authenticationRequired: false
        sslRequired: false
      filters: {whereTableKey: "Online"}

    Ingester1:
      endpoint:
        serviceRegistry: registry
      claims: {namespace: IngesterNamespaceOne}
      storage: Ingester1

    Ingester2:
      endpoint:
        serviceRegistry: registry
      claims: {namespace: IngesterNamespaceTwo}
      storage: Ingester2

  tableDataServices:
    db_tdcp:
      endpoint:
        host: *localhost
        port: *default-tableDataCacheProxyPort
      sources:
        - name: dataImportServers

Using filters
  dataImportServers:
    # The primary data import server
    db_dis:
      # import the default values
      <<: *DIS-default
      host: *dh-import   # reference the address defined above for "dh-import"
      userIntradayDirectoryName: "IntradayUser"
      webServerParameters:
        enabled: true
        port: 8086
        authenticationRequired: false
        sslRequired: false
      filters: {whereTableKey: "Online && NamespaceSet = `System` && !(Namespace == `IngesterNamespaceOne`) && !(Namespace == `IngesterNamespaceTwo`)"}

    db_rta:
      <<: *DIS-default
      host: *dh-import
      userIntradayDirectoryName: "IntradayUser"
      tailerPort: *default-tailerPort
      filters: {namespaceSet: User}

    Ingester1:
      host: *dh-import
      tailerPort: 22222
      throttleKbps: -1
      tableDataPort: 22223
      userIntradayDirectoryName: Users
      filters: {whereTableKey: "NamespaceSet = `System` && Namespace == `IngesterNamespaceOne`"}
      storage: Ingester1

    Ingester2:
      host: *dh-import
      tailerPort: 22224
      throttleKbps: -1
      tableDataPort: 22225
      userIntradayDirectoryName: Users
      filters: {whereTableKey: "NamespaceSet = `System` && Namespace == `IngesterNamespaceTwo`"}
      storage: Ingester2

  tableDataServices:
    db_tdcp:
      host: *localhost
      port: *default-tableDataCacheProxyPort
      sources:
        - name: db_dis
          filters: {whereTableKey: "Online && NamespaceSet = `System` && !(Namespace == `IngesterNamespaceOne`) && !(Namespace == `IngesterNamespaceTwo`)"}
        - name: db_rta
          filters: {namespaceSet: User}
        - name: Ingester1
          filters: {whereTableKey: "NamespaceSet = `System` && Online && Namespace == `IngesterNamespaceOne`"}
        - name: Ingester2
          filters: {whereTableKey: "NamespaceSet = `System` && Online && Namespace == `IngesterNamespaceTwo`"}

Table Data Cache Proxy (TDCP)

The tableDataServices section is the most complicated section of the routing configuration. Each key in this section defines a table data service (TDS), a server that provides data over the Table Data Protocol (TDP). A TDS can be a remote service, a local service (reading directly from disk), or a composition of other TDSes. The Data Routing Service can instantiate any of the defined services, but most often it instantiates the query TDS. The query TDS generally routes offline data to a local service reading from shared storage, and online data to the TDCP.

The TDCP composes all the online sources. The dataImportServers source includes all defined data import servers (and supports excluding specific ones). The claims keyword creates an implied global filter that either excludes all claimed tables, or includes only the tables claimed by a specific DIS. When dataImportServers and claims are used in the routing configuration, adding or removing DIS instances often requires no further changes in the file.
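The layering described above might look like the following sketch. The db_tdcp entry is taken from the claims example. The local and query entry names, the storage value default, and the online filter key are assumptions made for illustration and should be checked against the actual routing file and the product documentation.

```yaml
tableDataServices:
  # Assumed entry name: a local service that reads directly from disk.
  local:
    storage: default                 # assumed storage name

  # The TDCP composes all the online sources (as in the claims example).
  db_tdcp:
    endpoint:
      host: *localhost
      port: *default-tableDataCacheProxyPort
    sources:
      - name: dataImportServers      # every defined DIS; claims supply the implied filters

  # Assumed entry name: the TDS that query workers most often instantiate.
  query:
    sources:
      - name: local
        filters: {online: false}     # assumed filter key: offline data from shared storage
      - name: db_tdcp
        filters: {online: true}      # assumed filter key: online data through the TDCP
```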

When adding a new DIS, do one of the following: convert the entire file to use claims and remove the exclusion filters (recommended); or update the filters in the db_tdcp table data service, and all other filters as needed, so that data is requested from the new DIS and not from the existing DISes.

The claims keyword simplifies routing by namespace and table name (and therefore, by inference, by namespace set) for online data. Frequently, no other filters are needed. Note that adding an explicit filter overrides the implied claims filters, so another proxy layer might be needed to accomplish more complicated filtering.
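For instance, once the file uses claims, adding a third ingester could be as small as the fragment below. The names Ingester3 and IngesterNamespaceThree are hypothetical and simply follow the pattern of the claims example. The db_tdcp entry is repeated only to show that it stays unchanged.

```yaml
dataImportServers:
  # Hypothetical new DIS, modeled on Ingester1 and Ingester2 in the claims example.
  # Assumes a storage entry named Ingester3 is defined elsewhere in the file.
  Ingester3:
    endpoint:
      serviceRegistry: registry
    claims: {namespace: IngesterNamespaceThree}
    storage: Ingester3

tableDataServices:
  db_tdcp:
    endpoint:
      host: *localhost
      port: *default-tableDataCacheProxyPort
    sources:
      # Unchanged: dataImportServers already covers the new DIS, and its claim
      # keeps the other DISes from being asked for IngesterNamespaceThree tables.
      - name: dataImportServers
```

With the filter-based approach, the same addition would mean extending the exclusion expression on db_dis and adding a new filtered source under db_tdcp.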

The examples above include db_tdcp configurations for both approaches.

Remove db_rta

The Remote Table Appender (RTA) handles centrally managed user data. In most configurations, the RTA is just an alias for the main DIS, and the data routing configuration defines it as a separate entry so that a dedicated DIS can be created later to handle the User data. At runtime, the Data Routing Service detects that the db_dis and db_rta entries represent the same server and combines their filters. This merging is not supported when using claims or endpoints. Deephaven recommends combining the two entries by removing the db_rta DIS and the filter on db_dis, and adjusting any filters that address either or both DISes.
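The fragment below annotates the combined entry, using only values from the claims example. The comments describe what changed relative to the filter-based example, and the anchors are assumed to be defined elsewhere in the file.

```yaml
dataImportServers:
  # A single entry now serves both System and User data; no db_rta entry is defined.
  db_dis:
    <<: *DIS-default
    endpoint:
      host: *dh-import
      tableDataPort: *default-tableDataPort
      tailerPort: *default-tailerPort   # listed under db_rta in the filter-based example
    userIntradayDirectoryName: "IntradayUser"
    # The NamespaceSet and Namespace conditions are gone; only Online remains.
    filters: {whereTableKey: "Online"}

tableDataServices:
  db_tdcp:
    endpoint:
      host: *localhost
      port: *default-tableDataCacheProxyPort
    sources:
      # Replaces the separate db_dis and db_rta sources and their filters.
      - name: dataImportServers
```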

The examples above include removing db_rta.