Import JDBC using Builder
JDBC imports may be performed from inside a Deephaven Groovy or Python script. Compared to other methods, this permits more elaborate logic with respect to existing data. These scripts may be executed as a persistent query or from the command line using the iris_exec run_local_script
tool. All imports should be performed as the dbmerge user (or from a persistent query, running on a merge server).
Examples
The following script imports all the records from a SQL Server table into a single partition:
import com.illumon.iris.importers.util.JdbcImport
import com.illumon.iris.importers.ImportOutputMode
rows=new JdbcImport.Builder("ExNamespace", "Trade")
.setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
.setConnectionUrl( "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>")
.setQuery("SELECT * FROM trade")
.setDestinationPartitions("localhost/2018-05-01")
.setStrict(false)
.setOutputMode(ImportOutputMode.SAFE)
.build()
.run()
println "Imported " + rows + " rows."
from deephaven import *
rows = (
JdbcImport.Builder("ExNamespace", "Trade")
.setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
.setConnectionUrl(
"jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>"
)
.setQuery("SELECT * FROM trade")
.setDestinationPartitions("localhost/2018-05-01")
.setStrict(False)
.setOutputMode("SAFE")
.build()
.run()
)
print("Imported {} rows.".format(rows))
The following script imports all records into partitions based on the "Date" column, with the user and password values passed in via the iris_exec
command line. This is a useful idiom when importing large datasets. Note that the input query includes a sort clause on the column being used for partitioning; this is important for optimal performance when performing multi-partition imports.
import com.illumon.iris.importers.util.JdbcImport
import com.illumon.iris.importers.ImportOutputMode
rows=new JdbcImport.builder("ExNamespace", "Trade")
.setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
.setConnectionUrl( "jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>")
.setUser(user)
.setPassword(password)
.setSourceTimeZone("UTC")
.setQuery("SELECT * FROM table1 ORDER BY date")
.setDestinationPartitions("localhost")
.setPartitionColumn("Date")
.build()
.run()
println "Imported " + rows + " rows."
from deephaven import *
rows = (
JdbcImport.builder("ExNamespace", "Trade")
.setDriver("com.microsoft.sqlserver.jdbc.SQLServerDriver")
.setConnectionUrl(
"jdbc:sqlserver://<host>;database=<db>;user=<user>;password=<pwd>"
)
.setUser(user)
.setPassword(password)
.setSourceTimeZone("UTC")
.setQuery("SELECT * FROM table1 ORDER BY date")
.setDestinationPartitions("localhost")
.setPartitionColumn("Date")
.build()
.run()
)
print("Imported {} rows.".format(rows))
Import API Reference
The JDBC import class provides a static builder method, which produces an object used to set parameters for the import. The builder returns an import object from the build()
method. Imports are executed via the run()
method and if successful, return the number of rows imported. All other parameters and options for the import are configured via the setter methods described below. The general pattern when scripting an import is:
nRows = JdbcImport.builder(<namespace>,<table>)
.set<option>(<option value>)
…
.build()
.run()
JDBC Import Options
Setter Method | Type | Req? | Default | Description |
---|---|---|---|---|
setDestinationPartitions | String | No* | N/A | The destination partition(s). e.g., "localhost/2018-05-01" for a single partition import or "localhost" for multi-partition import. |
setDestinationDirectory | File | String | No* | N/A | The destination directory. e.g., /tmp/mylocation . |
setOutputMode | ImportOutputMode | String | No | SAFE | Enumeration with the following options: ImportOutputMode.SAFE | ImportOutputMode.REPLACE | ImportOutputMode.APPEND . May also be specified as String (e.g.,"SAFE"/"REPLACE"/"APPEND" ). |
setStrict | boolean | No | true | If true , will allow target columns that are missing from the source to remain null, and will allow import to continue when data conversions fail. In most cases, this should be set to false only when developing the import process for a new data source. |
setPartitionColumn | String | No | N/A | Column name to use to choose which partition to import each source row. |
setSourceName | String | No | N/A | Specific ImportSource to use (from the table schema). If not specified, the importer will use the first ImportSource block that it finds that matches the type of the import (CSV/XML/JSON/JDBC). |
setDriver | String | Yes | N/A | JDBC driver class; e.g., com.microsoft.sqlserver.jdbc.SQLServerDriver for SQL Server. |
setConnectionUrl | String | Yes | N/A | JDBC connection string. |
setUser | String | No | N/A | JDBC user (if not specified in connection string). |
setPassword | String | No | N/A | JDBC password (if not specified in connection string). |
setQuery | String | Yes | N/A | SQL query to execute for import. |
setSourceTimeZone | TimeZone | No | Server TZ* | Time zone to be used when interpreting time & date values that do not have explicit offsets. |
* Unless the source time zone is specified, the JDBC import task attempts to read the server time zone/UTC offset prior to executing the query. The source time zone is used to interpret SQL "datetime", "date", and "time-of-day" values, unless those types explicitly contain an offset. Since Deephaven presently has no "date" type, date values are interpreted as a datetime at midnight in the server time zone.