Update data routing configuration syntax
Newer versions of Deephaven include several data routing configuration syntax changes that make configuration easier. These upgrades do not generally require changes to existing configurations. However, taking advantage of new features might require updates to existing data routing configuration.
Endpoints
The current format gathers host, port, and related fields under an endpoint tag. The legacy format is incompatible with claims and several other tags.
The examples in the next section show the legacy format converted to use endpoint, with dynamic endpoints enabled for the ingesters.
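As a minimal sketch of the conversion, the fragment below shows one ingester entry in three alternative forms. The field names and values are taken from the full examples in this document; treating serviceRegistry: registry as the dynamic form is an inference from those examples. The three YAML documents are alternatives rather than one file, and the alias *dh-import is assumed to be defined elsewhere in the routing file.

```yaml
# Legacy format: connection fields sit directly on the DIS entry.
Ingester1:
  host: *dh-import        # alias assumed to be defined earlier in the file
  tailerPort: 22222
  tableDataPort: 22223
  storage: Ingester1
---
# Current format, static endpoint: the same fields move under endpoint.
Ingester1:
  endpoint:
    host: *dh-import
    tailerPort: 22222
    tableDataPort: 22223
  storage: Ingester1
---
# Current format, dynamic endpoint: no host or ports are listed.
Ingester1:
  endpoint:
    serviceRegistry: registry
  storage: Ingester1
```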
Change exclusion filters to claims
Claims are global filters that make it easy to configure data routing when a specific data import server (DIS) is handling a table.
Before claims were available, filters were used to configure these tables. Each new filter claiming a table required adjustments to other filters to exclude those same tables.
Deephaven recommends converting exclusion filters to claims as soon as routing changes are needed.
The examples below show a configuration with two custom ingesters, each handling all tables in one namespace. The first example (using claims) shows the current syntax.
The second example (using filters) shows how the complexity can grow quickly as additional ingesters are added.
Using claims
dataImportServers:
  # The primary data import server
  db_dis:
    # import the default values
    <<: *DIS-default
    endpoint:
      host: *dh-import # reference the address defined above for "dh-import"
      tableDataPort: *default-tableDataPort
      tailerPort: *default-tailerPort
    userIntradayDirectoryName: "IntradayUser"
    webServerParameters:
      enabled: true
      port: 8086
      authenticationRequired: false
      sslRequired: false
    filters: {whereTableKey: "Online"}
  Ingester1:
    endpoint:
      serviceRegistry: registry
    claims: {namespace: IngesterNamespaceOne}
    storage: Ingester1
  Ingester2:
    endpoint:
      serviceRegistry: registry
    claims: {namespace: IngesterNamespaceTwo}
    storage: Ingester2
tableDataServices:
  db_tdcp:
    endpoint:
      host: *localhost
      port: *default-tableDataCacheProxyPort
    sources:
      - name: dataImportServers
Using filters
dataImportServers:
  # The primary data import server
  db_dis:
    # import the default values
    <<: *DIS-default
    host: *dh-import # reference the address defined above for "dh-import"
    userIntradayDirectoryName: "IntradayUser"
    webServerParameters:
      enabled: true
      port: 8086
      authenticationRequired: false
      sslRequired: false
    filters: {whereTableKey: "Online && NamespaceSet = `System` && !(Namespace == `IngesterNamespaceOne`) && !(Namespace == `IngesterNamespaceTwo`)"}
  db_rta:
    <<: *DIS-default
    host: *dh-import
    userIntradayDirectoryName: "IntradayUser"
    tailerPort: *default-tailerPort
    filters: {namespaceSet: User}
  Ingester1:
    host: *dh-import
    tailerPort: 22222
    throttleKbps: -1
    tableDataPort: 22223
    userIntradayDirectoryName: Users
    filters: {whereTableKey: "NamespaceSet = `System` && Namespace == `IngesterNamespaceOne`"}
    storage: Ingester1
  Ingester2:
    host: *dh-import
    tailerPort: 22224
    throttleKbps: -1
    tableDataPort: 22225
    userIntradayDirectoryName: Users
    filters: {whereTableKey: "NamespaceSet = `System` && Namespace == `IngesterNamespaceTwo`"}
    storage: Ingester2
tableDataServices:
  db_tdcp:
    host: *localhost
    port: *default-tableDataCacheProxyPort
    sources:
      - name: db_dis
        filters: {whereTableKey: "Online && NamespaceSet = `System` && !(Namespace == `IngesterNamespaceOne`) && !(Namespace == `IngesterNamespaceTwo`)"}
      - name: db_rta
        filters: {namespaceSet: User}
      - name: Ingester1
        filters: {whereTableKey: "NamespaceSet = `System` && Online && Namespace == `IngesterNamespaceOne`"}
      - name: Ingester2
        filters: {whereTableKey: "NamespaceSet = `System` && Online && Namespace == `IngesterNamespaceTwo`"}
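To make that growth concrete, here is a sketch of what adding a third ingester might involve under the filter-based approach. Ingester3, IngesterNamespaceThree, and the ports 22226 and 22227 are hypothetical, chosen only to continue the pattern above. Only the entries that change are shown; all other keys and entries are assumed to stay as they are.

```yaml
dataImportServers:
  db_dis:
    # The existing filter gains one more exclusion clause.
    filters: {whereTableKey: "Online && NamespaceSet = `System` && !(Namespace == `IngesterNamespaceOne`) && !(Namespace == `IngesterNamespaceTwo`) && !(Namespace == `IngesterNamespaceThree`)"}
  # New entry (hypothetical name, namespace, and ports).
  Ingester3:
    host: *dh-import
    tailerPort: 22226
    throttleKbps: -1
    tableDataPort: 22227
    userIntradayDirectoryName: Users
    filters: {whereTableKey: "NamespaceSet = `System` && Namespace == `IngesterNamespaceThree`"}
    storage: Ingester3
tableDataServices:
  db_tdcp:
    sources:
      - name: db_dis
        # The same exclusion clause has to be repeated here.
        filters: {whereTableKey: "Online && NamespaceSet = `System` && !(Namespace == `IngesterNamespaceOne`) && !(Namespace == `IngesterNamespaceTwo`) && !(Namespace == `IngesterNamespaceThree`)"}
      # Existing db_rta, Ingester1, and Ingester2 sources stay unchanged.
      - name: Ingester3
        filters: {whereTableKey: "NamespaceSet = `System` && Online && Namespace == `IngesterNamespaceThree`"}
```

One new ingester touches four places in this sketch: its own entry, the db_dis filter, the matching db_tdcp source filter, and a new db_tdcp source.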
Table Data Cache Proxy (TDCP)
The tableDataServices section is the most complicated section of the routing configuration. Each key in this section defines a table data service (TDS), a server that provides data over the Table Data Protocol (TDP). These TDSes might be remote services, local services (read directly from disk), or compositions of other TDSes. The Data Routing Service can instantiate any of these defined services, but most often it instantiates the query TDS. The query TDS generally routes offline data to a local service reading from shared storage, and online data to the TDCP.
The TDCP composes all the online sources. The dataImportServers source includes all defined data import servers (and supports excluding specific ones). The claims keyword creates an implied global filter that either excludes all claimed tables, or includes only tables claimed by a specific DIS. When dataImportServers and claims are used in the routing configuration, adding or removing DIS instances frequently does not require further changes in the file.
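As a sketch of that last point, adding a third ingester to the claims-based example might need only one new entry. Ingester3 and IngesterNamespaceThree are hypothetical names that follow the pattern of the existing ingesters. The tableDataServices section is assumed to need no edits, because the dataImportServers source already covers every defined DIS.

```yaml
dataImportServers:
  # Existing db_dis, Ingester1, and Ingester2 entries stay unchanged.
  Ingester3:                                  # hypothetical new DIS
    endpoint:
      serviceRegistry: registry
    claims: {namespace: IngesterNamespaceThree}
    storage: Ingester3
```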
When adding a new DIS, either convert the entire file to use claims and remove the exclusion filters (recommended), or update the filters in the db_tdcp table data service and all other filters as needed to ensure data is requested from the new DIS, and is not requested from existing DISes.
The claims keyword simplifies routing by namespace and table name (and therefore namespace set by inference) for online data. Frequently, no other filters are needed. Note that adding an explicit filter will override the implied claims filters, so another proxy layer might be needed to accomplish more complicated filtering.
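The examples in this document claim whole namespaces only. A claim narrowed to a single table might look like the sketch below. The tableName key, the table name Quotes, and the QuoteIngester name are all assumptions made for illustration, so check the key name against the data routing reference for your version before relying on it.

```yaml
dataImportServers:
  QuoteIngester:              # hypothetical DIS name
    endpoint:
      serviceRegistry: registry
    # Assumed key: tableName narrows the claim to one table in the namespace.
    claims: {namespace: IngesterNamespaceOne, tableName: Quotes}
    storage: QuoteIngester    # hypothetical storage name
```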
Both examples above include a db_tdcp definition.
Remove db_rta
The Remote Table Appender (RTA) handles centrally managed user data. In most configurations, the RTA is just an alias for the main DIS, and the data routing configuration is written so that a separate DIS could be created to handle the User data. At runtime, the Data Routing Service detects that the db_dis and db_rta entries represent the same server and combines the filters. This is not supported when using claims or endpoints. Deephaven recommends combining these by removing the db_rta DIS and the filter on db_dis, and adjusting any filters that address either or both DISes.
The examples above include removing db_rta.
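Reduced to the affected keys, the change can be sketched as the before and after fragments below. Both are condensed from the examples in this document. The ingester exclusions are omitted from the "before" filter for brevity, and the "after" entry keeps the simple Online filter that the claims example uses. The two YAML documents are alternatives rather than one file, and the aliases are assumed to be defined elsewhere in the routing file.

```yaml
# Before: two entries that resolve to the same server, split by filters.
dataImportServers:
  db_dis:
    <<: *DIS-default
    host: *dh-import
    filters: {whereTableKey: "Online && NamespaceSet = `System`"}
  db_rta:
    <<: *DIS-default
    host: *dh-import
    filters: {namespaceSet: User}
---
# After: one entry; db_rta is gone and no namespace-set filter remains.
dataImportServers:
  db_dis:
    <<: *DIS-default
    endpoint:
      host: *dh-import
      tableDataPort: *default-tableDataPort
      tailerPort: *default-tailerPort
    filters: {whereTableKey: "Online"}
```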