Skip to main content

Database connectivity comes to Deephaven

· 2 min read
DALL·E prompt: A blue elephant standing on a green field with a bunch of arrows clutched by it's trunk, digital art
JJ Brosnan
Leverage ADBC and ODBC for fast data extraction

Relational databases have become the backbone of entire industries since their inception. ETL pipelines that turn the gears of the corporate world would crumble without them. Deephaven has transformed the T in ETL (transforming transforms?), making it easier than ever to manipulate complex data in both static and real-time contexts all the same. Why not cover the E and the L, too? Well, now Deephaven does.

Deephaven's v0.21 release brought the first Deephaven ODBC and ADBC client to the Python API. In this blog, I'll show you how simple it is to connect to and ingest data from a relational database using Deephaven's API.

The Deephaven ODBC and ADBC clients

ODBC makes accessing databases simple. Deephaven, a column-oriented query engine, is well-suited to source data from relational databases. Deephaven also relies heavily on Apache Arrow Flight to power the table engine. The Arrow Database Connectivity (ADBC) API is a natural fit for Deephaven as well. Start by using pip to install the necessary drivers:

pip install adbc_driver_manager adbc_driver_postgresql

Then, in Deephaven, just import the packages:

from deephaven.dbc import adbc as dhadbc
from deephaven.dbc import odbc as dhodbc

Usage

Using Deephaven's ADBC and ODBC clients is intuitive. If you have access to a relational database from your Deephaven instance, you need three things:

  1. The import statement
  2. The URI of the database
  3. A SQL query to execute

The workflows for both ADBC and ODBC are the same:

  1. Connect to the database.
  2. Execute a query in a cursor.
  3. Read the cursor.

What does that look like in a real example? Let's look at an ADBC query:

from deephaven.dbc import adbc as dhadbc
from adbc_driver_postgresql import dbapi

import os

username = os.environ["POSTGRES_USERNAME"]
password = os.environ["POSTGRES_PASSWORD"]
url = os.environ["POSTGRES_URL"]
port = os.environ["POSTGRES_PORT"]

sql_query = "SELECT t_ts as Timestamp, CAST(t_id AS text) as Id, " +
"CAST(t_instrument as text) as Instrument, " +
"t_exchange as Exchange, t_price as Price, t_size as Size " +
"FROM CRYPTO TRADES"

uri = f"{url}:{port}/postgres?user={username}&password={password}"

with dbapi.connect(uri) as conn:
with conn.cursor() as cursor:
cursor.execute(sql_query)
table = dhadbc.read_cursor(cursor)

img

That's all there is to it.

If your database only supports ODBC/JDBC, the Deephaven ADBC client can still be used. This flexibility provides increased efficiency when extracting data from relational database systems.

Reach out

Our Slack community continues to grow. Join us!