Import JDBC using Builder

JDBC imports may be performed from inside a Deephaven Groovy or Python script. Compared to other methods, this permits more elaborate logic with respect to existing data. These scripts may be executed as a persistent query or from the command line using the iris_exec run_local_script tool. All imports should be performed as the dbmerge user (or from a persistent query, running on a merge server).

Examples

The following script imports all the records from a SQL Server table into a single partition:

import com.illumon.iris.importers.util.JdbcImport
import com.illumon.iris.importers.ImportOutputMode

rows=new JdbcImport.Builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl( "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>")
    .setQuery("SELECT * FROM trade")
    .setDestinationPartitions("localhost/2018-05-01")
    .setStrict(false)
    .setOutputMode(ImportOutputMode.SAFE)
    .build()
    .run()

println "Imported " + rows + " rows."
from deephaven import *

rows = (
    JdbcImport.Builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl(
        "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>"
    )
    .setQuery("SELECT * FROM trade")
    .setDestinationPartitions("localhost/2018-05-01")
    .setStrict(False)
    .setOutputMode("SAFE")
    .build()
    .run()
)

print("Imported {} rows.".format(rows))

The following script imports all records into partitions based on the "Date" column, with the user and password values passed in via the iris_exec command line. This is a useful idiom when importing large datasets. Note that the input query includes a sort clause on the column being used for partitioning; this is important for optimal performance when performing multi-partition imports.

import com.illumon.iris.importers.util.JdbcImport
import com.illumon.iris.importers.ImportOutputMode

rows=new JdbcImport.builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl( "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>")
    .setUser(user)
    .setPassword(password)
    .setSourceTimeZone("UTC")
    .setQuery("SELECT * FROM table1 ORDER BY date")
    .setDestinationPartitions("localhost")
    .setPartitionColumn("Date")
    .build()
    .run()

println "Imported " + rows + " rows."
from deephaven import *

rows = (
    JdbcImport.builder("ExNamespace", "Trade")
    .setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .setConnectionUrl(
        "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>"
    )
    .setUser(user)
    .setPassword(password)
    .setSourceTimeZone("UTC")
    .setQuery("SELECT * FROM table1 ORDER BY date")
    .setDestinationPartitions("localhost")
    .setPartitionColumn("Date")
    .build()
    .run()
)

print("Imported {} rows.".format(rows))

Import API Reference

The JDBC import class provides a static builder method, which produces an object used to set parameters for the import. The builder returns an import object from the build() method. Imports are executed via the run() method and if successful, return the number of rows imported. All other parameters and options for the import are configured via the setter methods described below. The general pattern when scripting an import is:

nRows = JdbcImport.builder(<namespace>,<table>)
    .set<option>(<option value>)

    .build()
    .run()

JDBC Import Options

Setter MethodTypeReq?DefaultDescription
setDestinationPartitionsStringNo*N/AThe destination partition(s). e.g., "localhost/2018-05-01" for a single partition import or "localhost" for multi-partition import.
setDestinationDirectoryFile | StringNo*N/AThe destination directory. e.g., /tmp/mylocation.
setOutputModeImportOutputMode | StringNoSAFEEnumeration with the following options: ImportOutputMode.SAFE | ImportOutputMode.REPLACE | ImportOutputMode.APPEND. May also be specified as String (e.g.,"SAFE"/"REPLACE"/"APPEND").
setStrictbooleanNotrueIf true, will allow target columns that are missing from the source to remain null, and will allow import to continue when data conversions fail. In most cases, this should be set to false only when developing the import process for a new data source.
setPartitionColumnStringNoN/AColumn name to use to choose which partition to import each source row.
setSourceNameStringNoN/ASpecific ImportSource to use (from the table schema). If not specified, the importer will use the first ImportSource block that it finds that matches the type of the import (CSV/XML/JSON/JDBC).
setDriverStringYesN/AJDBC driver class; e.g., com.microsoft.sqlserver.jdbc.SQLServerDriver for SQL Server.
setConnectionUrlStringYesN/AJDBC connection string.
setUserStringNoN/AJDBC user (if not specified in connection string).
setPasswordStringNoN/AJDBC password (if not specified in connection string).
setQueryStringYesN/ASQL query to execute for import.
setSourceTimeZoneTimeZoneNoServer TZ*Time zone to be used when interpreting time & date values that do not have explicit offsets.

* Unless the source time zone is specified, the JDBC import task attempts to read the server time zone/UTC offset prior to executing the query. The source time zone is used to interpret SQL "datetime", "date", and "time-of-day" values, unless those types explicitly contain an offset. Since Deephaven presently has no "date" type, date values are interpreted as a datetime at midnight in the server time zone.