Import JDBC using Builder

JDBC imports may be performed from inside a Deephaven Groovy or Python script. Compared to other methods, this permits more elaborate logic with respect to existing data. These scripts may be executed as a persistent query or from the command line using the iris_exec run_local_script tool. All imports should be performed as the dbmerge user (or from a persistent query, running on a merge server).

Examples

The following script imports all the records from a SQL Server table into a single partition:

import com.illumon.iris.importers.util.JdbcImport
import com.illumon.iris.importers.ImportOutputMode

rows=new JdbcImport.Builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl( "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>")
    .setQuery("SELECT * FROM trade")
    .setDestinationPartitions("localhost/2018-05-01")
    .setStrict(false)
    .setOutputMode(ImportOutputMode.SAFE)
    .build()
    .run()

println "Imported " + rows + " rows."

from deephaven import *

rows = (
    JdbcImport.Builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl(
        "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>"
    )
    .setQuery("SELECT * FROM trade")
    .setDestinationPartitions("localhost/2018-05-01")
    .setStrict(False)
    .setOutputMode("SAFE")
    .build()
    .run()
)

print("Imported {} rows.".format(rows))

The following script imports all records into partitions based on the "Date" column, with the user and password values passed in via the iris_exec command line. This is a useful idiom when importing large datasets. Note that the input query includes a sort clause on the column being used for partitioning; this is important for optimal performance when performing multi-partition imports.

import com.illumon.iris.importers.util.JdbcImport
import com.illumon.iris.importers.ImportOutputMode

rows=new JdbcImport.builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl( "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>")
    .setUser(user)
    .setPassword(password)
    .setSourceTimeZone("UTC")
    .setQuery("SELECT * FROM table1 ORDER BY date")
    .setDestinationPartitions("localhost")
    .setPartitionColumn("Date")
    .build()
    .run()

println "Imported " + rows + " rows."

from deephaven import *

rows = (
    JdbcImport.builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl(
        "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>"
    )
    .setUser(user)
    .setPassword(password)
    .setSourceTimeZone("UTC")
    .setQuery("SELECT * FROM table1 ORDER BY date")
    .setDestinationPartitions("localhost")
    .setPartitionColumn("Date")
    .build()
    .run()
)

print("Imported {} rows.".format(rows))

Import API Reference

The JDBC import class provides a static builder method, which produces an object used to set parameters for the import. The builder returns an import object from the build() method. Imports are executed via the run() method and if successful, return the number of rows imported. All other parameters and options for the import are configured via the setter methods described below. The general pattern when scripting an import is:

nRows = JdbcImport.builder(<namespace>,<table>)
    .set<option>(<option value>)
    …
    .build()
    .run()

JDBC Import Options

Setter Method	Type	Req?	Default	Description
`setDestinationPartitions`	String	No*	N/A	The destination partition(s). e.g., "localhost/2018-05-01" for a single partition import or "localhost" for multi-partition import.
`setDestinationDirectory`	File \| String	No*	N/A	The destination directory. e.g., `/tmp/mylocation`.
`setOutputMode`	ImportOutputMode \| String	No	`SAFE`	Enumeration with the following options: `ImportOutputMode.SAFE \| ImportOutputMode.REPLACE \| ImportOutputMode.APPEND`. May also be specified as String (e.g.,`"SAFE"/"REPLACE"/"APPEND"`).
`setStrict`	boolean	No	`true`	If `true`, will allow target columns that are missing from the source to remain null, and will allow import to continue when data conversions fail. In most cases, this should be set to false only when developing the import process for a new data source.
`setPartitionColumn`	String	No	N/A	Column name to use to choose which partition to import each source row.
`setSourceName`	String	No	N/A	Specific `ImportSource` to use (from the table schema). If not specified, the importer will use the first `ImportSource` block that it finds that matches the type of the import (CSV/XML/JSON/JDBC).
`setDriver`	String	Yes	N/A	JDBC driver class; e.g., `com.microsoft.sqlserver.jdbc.SQLServerDriver` for SQL Server.
`setConnectionUrl`	String	Yes	N/A	JDBC connection string.
`setUser`	String	No	N/A	JDBC user (if not specified in connection string).
`setPassword`	String	No	N/A	JDBC password (if not specified in connection string).
`setQuery`	String	Yes	N/A	SQL query to execute for import.
`setSourceTimeZone`	TimeZone	No	Server TZ*	Time zone to be used when interpreting time & date values that do not have explicit offsets.

* Unless the source time zone is specified, the JDBC import task attempts to read the server time zone/UTC offset prior to executing the query. The source time zone is used to interpret SQL "datetime", "date", and "time-of-day" values, unless those types explicitly contain an offset. Since Deephaven presently has no "date" type, date values are interpreted as a datetime at midnight in the server time zone.