Serious real-time data tools
Data System
Data Sources
Access and ingest data directly from popular, standard formats. For example, use a Kafka event stream alongside historical Parquet data.
Data Processing
Stream updating, real-time derived data to consumers. Connect JavaScript, Python, Java, and C++ clients and receive live updates or snapshots.
Data Consumers
Exhaust new streams or write to persistent stores; build and share real-time visualizations and monitors. Explore massive and ticking datasets with built-in tools. Build enterprise apps.
Why Deephaven?
Streaming data done right
Serious performance
Deephaven’s highly optimized, incremental-update model is engineered to track table additions, removals, modifications, and shifts. A chunk-oriented architecture delivers best-in-class table methods and amortizes the cost of moving between languages.
Client-server interfaces are designed with large-scale, dense data in mind, moving compute to the server and providing lazy updates.
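That incremental model is directly observable from a query script. The sketch below is an illustration rather than one of this page’s samples: it assumes the community Python API’s time_table and deephaven.table_listener.listen, plus an example column X and a one-row-per-second interval. The listener is handed only the rows added, removed, or modified on each update cycle.
from deephaven import time_table
from deephaven.table_listener import listen

# hypothetical ticking source: one new row per second
ticking = time_table("PT1S").update("X = ii")

# the listener receives just the incremental changes for each update cycle
def on_changes(update, is_replay):
    added = update.added()        # dict of column name -> numpy array of added values
    removed = update.removed()
    modified = update.modified()
    print(
        "added:", len(added.get("X", [])),
        "removed:", len(removed.get("X", [])),
        "modified:", len(modified.get("X", [])),
    )

# register the listener; the returned handle can later detach it
handle = listen(ticking, on_changes)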
Build, join, and publish streams with ease
Build streams on streams to power applications and analysis. Use table operations or marry them to custom and third-party libraries. Query and combine batch and real-time data.
Highly intuitive
New data and events seamlessly arrive as simple table updates. Queries establish an acyclic graph, with data logically flowing to downstream nodes. Simply name a source or derived table to make it available to clients via multi-language APIs. Use easy methods to stripe and pipeline workloads.
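For instance, if a server-side query script assigns a ticking table to a variable, a client can fetch it by that name over gRPC. The short sketch below is illustrative only: the table name metrics_live and the default local server address are assumptions, and it reuses the snapshot-to-pandas pattern from the Python client example further down.
from pydeephaven import Session

# connect to a Deephaven server on the default local address
session = Session()

# fetch the table by the name it was given in the server-side query script
metrics_live = session.open_table("metrics_live")

# pull a point-in-time snapshot into pandas; live updates keep flowing on the server
print(metrics_live.snapshot().to_pandas().head())

session.close()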
Familiar & powerful tools
Leverage gRPC and Arrow. Use Jupyter, Visual Studio, JetBrains, or [soon] RStudio. Bring your custom or third-party libraries and functions to the data for faster, well-integrated execution. Enjoy the data interrogation experiences of the Code Studio, with dynamic dashboards and an evolving suite of capabilities.
Expressive Language
Built for Developers, loved by Data Scientists
from deephaven import ConsumeKafka, ParquetTools, TableTools
# data-ingestion integrations (Kafka, Parquet, and many more)
table_today_live = ConsumeKafka.consume(
    {"bootstrap.servers": "kafka:9092"}, "metrics"
)
table_yesterday = ParquetTools.readTable("/data/metrics.parquet")
# merging dynamic with static is easy; the updating table will continue to update
table_merged = TableTools.merge(table_today_live, table_yesterday)
# operators can be used identically on dynamic and static tables (or merges of the two)
table_joined = table_today_live.sumBy("ProcessKey").naturalJoin(
    table_yesterday.sumBy("ProcessKey"), "ProcessKey", "YestTotal = Metric"
)
bitcoin = ConsumeKafka.consume({"bootstrap.servers": "kafka:9092"}, "bitcoin")
ethereum = ConsumeKafka.consume({"bootstrap.servers": "kafka:9092"}, "ethereum")
# time series joins update as source tables update
priceRatio = (
    bitcoin.aj(ethereum, "Timestamp", "SizeEth = Size, PriceEth = Price")
    .update("Ratio = Price / PriceEth")
    .renameColumns("SizeBtc = Size")
)
# time-bin by minute and aggregate accordingly
agg = priceRatio.update("TimeBin = upperBin(Timestamp, MINUTE)").by(
    ["TimeBin"],
    [
        AggAvg("Ratio"),
        AggMin("MinRatio = Ratio"),
        AggMax("MaxRatio = Ratio"),
        AggSum("Size", "SizeBtc"),
        AggWAvg("SizeBtc", "VwapBtc = Price"),
    ],
)
import numpy as np
from sklearn.linear_model import LinearRegression
# write a custom function
def computeBeta(value1, value2):
    stat1 = np.diff(np.array(value1), n=1).reshape(-1, 1)
    stat2 = np.diff(np.array(value2), n=1).reshape(-1, 1)
    reg = LinearRegression(fit_intercept=True)
    reg.fit(stat1, stat2)
    return reg.coef_[0][0]
# filter, sort and do time-series joins on source tables
iot = source.where("MeasureName = `Example`").view(
    "TimeInterval", "DeviceId", "MeasureValue"
)
iot_joined = iot.aj(
    iot.where("DeviceId = `Master`"), "TimeInterval", "Measure_Master = MeasureValue"
)
# use the custom function within the deephaven object directly
# no client-server or copy
betas = (
    iot_joined.by("DeviceId")
    .select(
        "DeviceId",
        "Beta = (double) computeBeta.call(Measure_Master.toArray(), MeasureValue.toArray())",
    )
    .sort("DeviceId")
)
- Java Client
- Python Client
- C++ Client
- JavaScript Client
FlightSession session = newSession();
TableSpec trades = readQst("trades.qst");
TableSpec quotes = readCsv("quotes.csv");
TableSpec topTenTrades = trades
    .aj(quotes, "Timestamp", "Mid")
    .updateView("Edge=abs(Price-Mid)")
    .sortDescending("Edge")
    .head(100);
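// export the query result and stream it back from the server as Arrow record batches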
try (
        final Export export = session.export(topTenTrades);
        final FlightStream flightStream = session.getStream(export)) {
    while (flightStream.next()) {
        System.out.println(flightStream.getRoot().contentToTSVString());
    }
}
from pydeephaven import Session
from pyarrow import csv
session = Session() # assuming DH is running locally with the default config
table1 = session.import_table(csv.read_csv("data1.csv"))
table2 = session.import_table(csv.read_csv("data2.csv"))
joined_table = table1.join(
    table2, keys=["key_col_1", "key_col_2"], columns_to_add=["data_col1"]
)
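# snapshot the joined table on the server and pull it back as a pandas DataFrame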
df = joined_table.snapshot().to_pandas()
print(df)
session.close()
auto client = Client::connect(server);
auto manager = client.getManager();
auto trades = manager.fetchTable("trades");
auto quotes = manager.readCsv("quotes.csv");
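// as-of join trades with quotes, compute each trade's edge, and keep the largest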
auto topTenTrades = trades
    .aj(quotes, "Timestamp", "Mid")
    .updateView("Edge=abs(Price-Mid)")
    .sortDescending("Edge")
    .head(100);
std::cout << topTenTrades.stream(true) << '\n';
class TableView {
  setFilter() {
    this._filters = Array.prototype.slice.apply(arguments);
    return this._table.applyFilter(this._filters);
  }

  addFilter(filter) {
    this._filters.push(filter);
    return this._table.applyFilter(this._filters);
  }

  // Use cloning when you want to create a new table
  // to apply filters without modifying the existing table.
  clone(name) {
    if (!name) {
      name = `${this._name}Clone`;
    }
    return this._table.copy().then((newTable) => new TableView(name, newTable));
  }
}
UI Tools
Open-source code studio for accelerated data exploration
Build with Deephaven
What can you build with Deephaven?
Scale up
Enterprise Deployment
Deephaven Enterprise has been battle-tested inside the demanding environments of hedge funds, stock exchanges, and banks. Its collection of enterprise-ready tools and exclusive add-ons helps your team scale up quickly and benefit from enhancement requests pooled across the customer base. Professional services are available if you’d like more hands on deck.
Batteries-included data management
Data Management
Systems for ingesting, storing, and disseminating data focus on throughput and efficiency. Utilities support cleaning, validation, and transformation. Sophisticated access controls limit user or team access to source and derived data by directory and table, as well as granularly by row or column key.
Scale across 1000s of cores, PBs of data, and TBs of streams
Query & Compute
The Deephaven Enterprise platform comprises the machinery, operations, and workflows to develop and support applications and analytics at scale, real-time and otherwise. It is readily deployed on commodity cloud or physical Linux resources using modern techniques. Ingest, storage, and compute scale independently.
Create and share applications and interactive dashboards quickly
UI & Tooling
Deephaven Enterprise offers premier experiences in Jupyter, Excel, RStudio, and classic IDEs, as well as its own REPL, but it also includes a zero-time UX for launching, scheduling, and monitoring applications. These feed dependent enterprise apps and empower the quick configuration and sharing of real-time dashboards.