Class CdcTools

java.lang.Object
io.deephaven.kafka.CdcTools

public class CdcTools extends Object
Utility class with methods to support consuming from a Change Data Capture (CDC) Kafka stream (as, eg, produced by Debezium) to a Deephaven table.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static interface 
    Users specify CDC streams via objects satisfying this interface; the objects are created with static factory methods, the classes implementing this interface are opaque from a user perspective.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final String
    The Value Kafka field in a CDC topic contains a sub field with a record (as nested fields) with values for all the columns for an updated (or added) row.
    static final String
    The value for the sub-field that indicates the type of operation for when the operation type is delete.
    static final String
    The name of the sub-field in the Value field that indicates the type of operation that triggered the CDC event (eg, insert, delete).
    static final String
    The default topic name for a CDC topic corresponds to the strings for server name, database name, and table name, concatenated using this separator.
    static final String
    The default schema name for the Key field of a given CDC topic is the topic name with this suffix added.
    static final String
    The default schema name for the Value field of a given CDC topic is the topic name with this suffix added.
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static String
    cdcKeyAvroSchemaName(String serverName, String dbName, String tableName)
    Build the default Key schema name for a CDC Stream, given server name, database name and table name
    cdcLongSpec(String topic, String keySchemaName, String valueSchemaName)
    Create a CdcSpec opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying topic name, and key and value schema names.
    cdcLongSpec(String topic, String keySchemaName, String keySchemaVersion, String valueSchemaName, String valueSchemaVersion)
    Create a CdcSpec opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying all configuration options.
    cdcShortSpec(String serverName, String dbName, String tableName)
    Create a CdcSpec opaque object (necessary for one argument in a call to consume*ToTable) in the debezium style, specifying server name, database name and table name.
    static String
    cdcTopicName(String serverName, String dbName, String tableName)
    Build the default CDC topic name given server name, database name and table name
    static String
    cdcValueAvroSchemaName(String serverName, String dbName, String tableName)
    Build the default Value schema name for a CDC Stream, given server name, database name and table name
    static Table
    consumeRawToTable(@NotNull Properties kafkaProperties, @NotNull CdcTools.CdcSpec cdcSpec, @NotNull IntPredicate partitionFilter, KafkaTools.TableType tableType)
     
    static Table
    consumeToTable(@NotNull Properties kafkaProperties, @NotNull CdcTools.CdcSpec cdcSpec, @NotNull IntPredicate partitionFilter)
    Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.
    static Table
    consumeToTable(@NotNull Properties kafkaProperties, @NotNull CdcTools.CdcSpec cdcSpec, @NotNull IntPredicate partitionFilter, boolean asBlinkTable, Collection<String> dropColumns)
    Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • KEY_AVRO_SCHEMA_SUFFIX

      public static final String KEY_AVRO_SCHEMA_SUFFIX
      The default schema name for the Key field of a given CDC topic is the topic name with this suffix added.
      See Also:
    • VALUE_AVRO_SCHEMA_SUFFIX

      public static final String VALUE_AVRO_SCHEMA_SUFFIX
      The default schema name for the Value field of a given CDC topic is the topic name with this suffix added.
      See Also:
    • CDC_TOPIC_NAME_SEPARATOR

      public static final String CDC_TOPIC_NAME_SEPARATOR
      The default topic name for a CDC topic corresponds to the strings for server name, database name, and table name, concatenated using this separator.
      See Also:
    • CDC_AFTER_COLUMN_PREFIX

      public static final String CDC_AFTER_COLUMN_PREFIX
      The Value Kafka field in a CDC topic contains a sub field with a record (as nested fields) with values for all the columns for an updated (or added) row. This constant should match the path from the root, to this record. For instance, if the parent field is called "after", and the nested field separator is ".", this constant should be "after."
      See Also:
    • CDC_OP_COLUMN_NAME

      public static final String CDC_OP_COLUMN_NAME
      The name of the sub-field in the Value field that indicates the type of operation that triggered the CDC event (eg, insert, delete).
      See Also:
    • CDC_DELETE_OP_VALUE

      public static final String CDC_DELETE_OP_VALUE
      The value for the sub-field that indicates the type of operation for when the operation type is delete.
      See Also:
  • Constructor Details

    • CdcTools

      public CdcTools()
  • Method Details

    • cdcLongSpec

      @ScriptApi public static CdcTools.CdcSpec cdcLongSpec(String topic, String keySchemaName, String valueSchemaName)
      Create a CdcSpec opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying topic name, and key and value schema names.
      Parameters:
      topic - The Kafka topic for the CDC events associated to the desired table data.
      keySchemaName - The schema name for the Key Kafka field in the CDC events for the topic. This schema should include definitions for the columns forming the PRIMARY KEY of the underlying table.
      valueSchemaName - The schema name for the Value Kafka field in the CDC events for the topic. This schema should include definitions for all the columns of the underlying table.
      Returns:
      A CdcSpec object corresponding to the inputs; schema versions are implied to be latest.
    • cdcLongSpec

      @ScriptApi public static CdcTools.CdcSpec cdcLongSpec(String topic, String keySchemaName, String keySchemaVersion, String valueSchemaName, String valueSchemaVersion)
      Create a CdcSpec opaque object (necessary for one argument in a call to consume*ToTable) via explicitly specifying all configuration options.
      Parameters:
      topic - The Kafka topic for the CDC events associated to the desired table data.
      keySchemaName - The schema name for the Key Kafka field in the CDC events for the topic. This schema should include definitions for the columns forming the PRIMARY KEY of the underlying table.
      keySchemaVersion - The version for the Key schema to look up in schema server.
      valueSchemaName - The schema name for the Value Kafka field in the CDC events for the topic. This schema should include definitions for all the columns of the underlying table.
      valueSchemaVersion - The version for the Value schema to look up in schema server.
      Returns:
      A CdcSpec object corresponding to the inputs.
    • cdcShortSpec

      @ScriptApi public static CdcTools.CdcSpec cdcShortSpec(String serverName, String dbName, String tableName)
      Create a CdcSpec opaque object (necessary for one argument in a call to consume*ToTable) in the debezium style, specifying server name, database name and table name. The topic name, and key and value schema names are implied by convention:
      • Topic is the concatenation of the arguments using "." as separator.
      • Key schema name is topic with a "-key" suffix added.
      • Value schema name is topic with a "-value" suffix added.
      Parameters:
      serverName - The server name
      dbName - The database name
      tableName - The table name
      Returns:
      A CdcSpec object corresponding to the inputs.
    • consumeToTable

      @ScriptApi public static Table consumeToTable(@NotNull @NotNull Properties kafkaProperties, @NotNull @NotNull CdcTools.CdcSpec cdcSpec, @NotNull @NotNull IntPredicate partitionFilter)
      Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.
      Parameters:
      kafkaProperties - Properties to configure the associated kafka consumer and also the resulting table. Passed to the org.apache.kafka.clients.consumer.KafkaConsumer constructor; pass any KafkaConsumer specific desired configuration here. Note this should include the relevant property for a schema server URL where the key and/or value Avro necessary schemas are stored.
      cdcSpec - A CdcSpec opaque object specifying the CDC Stream. Can be obtained from calling the cdcSpec static factory method.
      partitionFilter - A function specifying the desired initial offset for each partition consumed The convenience constant KafkaTools.ALL_PARTITIONS is defined to facilitate requesting all partitions.
      Returns:
      A Deephaven live table for underlying database table tracked by the CDC Stream
    • consumeToTable

      @ScriptApi public static Table consumeToTable(@NotNull @NotNull Properties kafkaProperties, @NotNull @NotNull CdcTools.CdcSpec cdcSpec, @NotNull @NotNull IntPredicate partitionFilter, boolean asBlinkTable, Collection<String> dropColumns)
      Consume from a CDC Kafka Event Stream to a DHC ticking table, recreating the underlying database table.
      Parameters:
      kafkaProperties - Properties to configure the associated kafka consumer and also the resulting table. Passed to the org.apache.kafka.clients.consumer.KafkaConsumer constructor; pass any KafkaConsumer specific desired configuration here. Note this should include the relevant property for a schema server URL where the key and/or value Avro necessary schemas are stored.
      cdcSpec - A CdcSpec opaque object specifying the CDC Stream. Can be obtained from calling the cdcSpec static factory method.
      partitionFilter - A function specifying the desired initial offset for each partition consumed The convenience constant KafkaTools.ALL_PARTITIONS is defined to facilitate requesting all partitions.
      asBlinkTable - If true, return a blink table of row changes with an added 'op' column including the CDC operation affecting the row.
      dropColumns - Collection of column names that will be dropped from the resulting table; null for none. Note that only columns not included in the primary key can be dropped at this stage; you can chain a drop column operation after this call if you need to drop primary key columns.
      Returns:
      A Deephaven live table for underlying database table tracked by the CDC Stream
    • consumeRawToTable

      @ScriptApi public static Table consumeRawToTable(@NotNull @NotNull Properties kafkaProperties, @NotNull @NotNull CdcTools.CdcSpec cdcSpec, @NotNull @NotNull IntPredicate partitionFilter, @NotNull KafkaTools.TableType tableType)
    • cdcTopicName

      @ScriptApi public static String cdcTopicName(String serverName, String dbName, String tableName)
      Build the default CDC topic name given server name, database name and table name
      Parameters:
      serverName - The server name
      dbName - The database name
      tableName - The table name
      Returns:
      The default CDC topic name given inputs.
    • cdcKeyAvroSchemaName

      @ScriptApi public static String cdcKeyAvroSchemaName(String serverName, String dbName, String tableName)
      Build the default Key schema name for a CDC Stream, given server name, database name and table name
      Parameters:
      serverName - The server name
      dbName - The database name
      tableName - The table name
      Returns:
      The default Key schema name for a CDC Stream given inputs.
    • cdcValueAvroSchemaName

      @ScriptApi public static String cdcValueAvroSchemaName(String serverName, String dbName, String tableName)
      Build the default Value schema name for a CDC Stream, given server name, database name and table name
      Parameters:
      serverName - The server name
      dbName - The database name
      tableName - The table name
      Returns:
      The default Value schema name for a CDC Stream given inputs.