Package io.deephaven.kafka
Class CdcTools
java.lang.Object
io.deephaven.kafka.CdcTools
Utility class with methods to support consuming from a Change Data Capture (CDC) Kafka stream (as, eg, produced by
Debezium) to a Deephaven table.
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic interface
Users specify CDC streams via objects satisfying this interface; the objects are created with static factory methods, the classes implementing this interface are opaque from a user perspective. -
Field Summary
Modifier and TypeFieldDescriptionstatic final String
The Value Kafka field in a CDC topic contains a sub field with a record (as nested fields) with values for all the columns for an updated (or added) row.static final String
The value for the sub-field that indicates the type of operation for when the operation type is delete.static final String
The name of the sub-field in the Value field that indicates the type of operation that triggered the CDC event (eg, insert, delete).static final String
The default topic name for a CDC topic corresponds to the strings for server name, database name, and table name, concatenated using this separator.static final String
The default schema name for the Key field of a given CDC topic is the topic name with this suffix added.static final String
The default schema name for the Value field of a given CDC topic is the topic name with this suffix added. -
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionstatic String
cdcKeyAvroSchemaName
(String serverName, String dbName, String tableName) Build the default Key schema name for a CDC Stream, given server name, database name and table namestatic CdcTools.CdcSpec
cdcLongSpec
(String topic, String keySchemaName, String valueSchemaName) Create aCdcSpec
opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying topic name, and key and value schema names.static CdcTools.CdcSpec
cdcLongSpec
(String topic, String keySchemaName, String keySchemaVersion, String valueSchemaName, String valueSchemaVersion) Create aCdcSpec
opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying all configuration options.static CdcTools.CdcSpec
cdcShortSpec
(String serverName, String dbName, String tableName) Create aCdcSpec
opaque object (necessary for one argument in a call to consume*ToTable) in the debezium style, specifying server name, database name and table name.static String
cdcTopicName
(String serverName, String dbName, String tableName) Build the default CDC topic name given server name, database name and table namestatic String
cdcValueAvroSchemaName
(String serverName, String dbName, String tableName) Build the default Value schema name for a CDC Stream, given server name, database name and table namestatic Table
consumeRawToTable
(@NotNull Properties kafkaProperties, @NotNull CdcTools.CdcSpec cdcSpec, @NotNull IntPredicate partitionFilter, KafkaTools.TableType tableType) static Table
consumeToTable
(@NotNull Properties kafkaProperties, @NotNull CdcTools.CdcSpec cdcSpec, @NotNull IntPredicate partitionFilter) Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.static Table
consumeToTable
(@NotNull Properties kafkaProperties, @NotNull CdcTools.CdcSpec cdcSpec, @NotNull IntPredicate partitionFilter, boolean asBlinkTable, Collection<String> dropColumns) Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.
-
Field Details
-
KEY_AVRO_SCHEMA_SUFFIX
The default schema name for the Key field of a given CDC topic is the topic name with this suffix added.- See Also:
-
VALUE_AVRO_SCHEMA_SUFFIX
The default schema name for the Value field of a given CDC topic is the topic name with this suffix added.- See Also:
-
CDC_TOPIC_NAME_SEPARATOR
The default topic name for a CDC topic corresponds to the strings for server name, database name, and table name, concatenated using this separator.- See Also:
-
CDC_AFTER_COLUMN_PREFIX
The Value Kafka field in a CDC topic contains a sub field with a record (as nested fields) with values for all the columns for an updated (or added) row. This constant should match the path from the root, to this record. For instance, if the parent field is called "after", and the nested field separator is ".", this constant should be "after."- See Also:
-
CDC_OP_COLUMN_NAME
The name of the sub-field in the Value field that indicates the type of operation that triggered the CDC event (eg, insert, delete).- See Also:
-
CDC_DELETE_OP_VALUE
The value for the sub-field that indicates the type of operation for when the operation type is delete.- See Also:
-
-
Constructor Details
-
CdcTools
public CdcTools()
-
-
Method Details
-
cdcLongSpec
@ScriptApi public static CdcTools.CdcSpec cdcLongSpec(String topic, String keySchemaName, String valueSchemaName) Create aCdcSpec
opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying topic name, and key and value schema names.- Parameters:
topic
- The Kafka topic for the CDC events associated to the desired table data.keySchemaName
- The schema name for the Key Kafka field in the CDC events for the topic. This schema should include definitions for the columns forming the PRIMARY KEY of the underlying table.valueSchemaName
- The schema name for the Value Kafka field in the CDC events for the topic. This schema should include definitions for all the columns of the underlying table.- Returns:
- A CdcSpec object corresponding to the inputs; schema versions are implied to be latest.
-
cdcLongSpec
@ScriptApi public static CdcTools.CdcSpec cdcLongSpec(String topic, String keySchemaName, String keySchemaVersion, String valueSchemaName, String valueSchemaVersion) Create aCdcSpec
opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying all configuration options.- Parameters:
topic
- The Kafka topic for the CDC events associated to the desired table data.keySchemaName
- The schema name for the Key Kafka field in the CDC events for the topic. This schema should include definitions for the columns forming the PRIMARY KEY of the underlying table.keySchemaVersion
- The version for the Key schema to look up in schema server.valueSchemaName
- The schema name for the Value Kafka field in the CDC events for the topic. This schema should include definitions for all the columns of the underlying table.valueSchemaVersion
- The version for the Value schema to look up in schema server.- Returns:
- A CdcSpec object corresponding to the inputs.
-
cdcShortSpec
@ScriptApi public static CdcTools.CdcSpec cdcShortSpec(String serverName, String dbName, String tableName) Create aCdcSpec
opaque object (necessary for one argument in a call to consume*ToTable) in the debezium style, specifying server name, database name and table name. The topic name, and key and value schema names are implied by convention:- Topic is the concatenation of the arguments using "." as separator.
- Key schema name is topic with a "-key" suffix added.
- Value schema name is topic with a "-value" suffix added.
- Parameters:
serverName
- The server namedbName
- The database nametableName
- The table name- Returns:
- A CdcSpec object corresponding to the inputs.
-
consumeToTable
@ScriptApi public static Table consumeToTable(@NotNull @NotNull Properties kafkaProperties, @NotNull @NotNull CdcTools.CdcSpec cdcSpec, @NotNull @NotNull IntPredicate partitionFilter) Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.- Parameters:
kafkaProperties
- Properties to configure the associated kafka consumer and also the resulting table. Passed to the org.apache.kafka.clients.consumer.KafkaConsumer constructor; pass any KafkaConsumer specific desired configuration here. Note this should include the relevant property for a schema server URL where the key and/or value Avro necessary schemas are stored.cdcSpec
- A CdcSpec opaque object specifying the CDC Stream. Can be obtained from calling thecdcSpec
static factory method.partitionFilter
- A function specifying the desired initial offset for each partition consumed The convenience constantKafkaTools.ALL_PARTITIONS
is defined to facilitate requesting all partitions.- Returns:
- A Deephaven live table for underlying database table tracked by the CDC Stream
-
consumeToTable
@ScriptApi public static Table consumeToTable(@NotNull @NotNull Properties kafkaProperties, @NotNull @NotNull CdcTools.CdcSpec cdcSpec, @NotNull @NotNull IntPredicate partitionFilter, boolean asBlinkTable, Collection<String> dropColumns) Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.- Parameters:
kafkaProperties
- Properties to configure the associated kafka consumer and also the resulting table. Passed to the org.apache.kafka.clients.consumer.KafkaConsumer constructor; pass any KafkaConsumer specific desired configuration here. Note this should include the relevant property for a schema server URL where the key and/or value Avro necessary schemas are stored.cdcSpec
- A CdcSpec opaque object specifying the CDC Stream. Can be obtained from calling thecdcSpec
static factory method.partitionFilter
- A function specifying the desired initial offset for each partition consumed The convenience constantKafkaTools.ALL_PARTITIONS
is defined to facilitate requesting all partitions.asBlinkTable
- If true, return a blink table of row changes with an added 'op' column including the CDC operation affecting the row.dropColumns
- Collection of column names that will be dropped from the resulting table; null for none. Note that only columns not included in the primary key can be dropped at this stage; you can chain a drop column operation after this call if you need to drop primary key columns.- Returns:
- A Deephaven live table for underlying database table tracked by the CDC Stream
-
consumeRawToTable
@ScriptApi public static Table consumeRawToTable(@NotNull @NotNull Properties kafkaProperties, @NotNull @NotNull CdcTools.CdcSpec cdcSpec, @NotNull @NotNull IntPredicate partitionFilter, @NotNull KafkaTools.TableType tableType) -
cdcTopicName
Build the default CDC topic name given server name, database name and table name- Parameters:
serverName
- The server namedbName
- The database nametableName
- The table name- Returns:
- The default CDC topic name given inputs.
-
cdcKeyAvroSchemaName
@ScriptApi public static String cdcKeyAvroSchemaName(String serverName, String dbName, String tableName) Build the default Key schema name for a CDC Stream, given server name, database name and table name- Parameters:
serverName
- The server namedbName
- The database nametableName
- The table name- Returns:
- The default Key schema name for a CDC Stream given inputs.
-
cdcValueAvroSchemaName
@ScriptApi public static String cdcValueAvroSchemaName(String serverName, String dbName, String tableName) Build the default Value schema name for a CDC Stream, given server name, database name and table name- Parameters:
serverName
- The server namedbName
- The database nametableName
- The table name- Returns:
- The default Value schema name for a CDC Stream given inputs.
-