A design comparison between Materialize and Deephaven

· 2 min read
DALL·E prompt: Two glossy server racks on an infinite flat surface, space movie Interstellar (2014)
Cristian Ferretti
Architecture fundamentals that matter for dynamic data

Both the Deephaven and Materialize query engines rely heavily on an incremental-update model. Despite this shared foundation, several of their design fundamentals are quite distinct, likely because disparate sets of use cases drove each engine's development.
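To make "incremental update" concrete, here is a minimal, generic sketch of delta-based view maintenance: instead of re-running a query over the full dataset on every change, the engine applies only the delta (an insert or a retraction) to previously computed state. This is an illustration of the shared idea, not either product's actual implementation; the class and method names are hypothetical.

```python
# Generic sketch of incremental (delta-based) view maintenance.
# Illustrative only -- not Materialize's or Deephaven's implementation.

class IncrementalSum:
    """Maintains the result of SELECT key, SUM(value) ... GROUP BY key
    by applying deltas, never rescanning the full input."""

    def __init__(self):
        self.totals = {}

    def apply_delta(self, key, value, retraction=False):
        # Each change arrives as a delta: an addition or a retraction.
        sign = -1 if retraction else 1
        self.totals[key] = self.totals.get(key, 0) + sign * value


view = IncrementalSum()
view.apply_delta("a", 10)                    # insert (a, 10)
view.apply_delta("a", 5)                     # insert (a, 5)
view.apply_delta("a", 10, retraction=True)   # retract (a, 10)
print(view.totals)                           # {'a': 5}
```

Each update costs O(1) work per delta, which is what makes both engines viable for continuously changing data.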

|  | Materialize | Deephaven |
|---|---|---|
| **Update Cycle** | Update cycle based on a logical clock (typically, transactions from a database source determine clock ticks) | Fixed-time (wall clock) step update cycle |
| **Consistency Model** | Fully ordered, total consistency model, similar to a relational (transactional OLTP) database | Consistency model based on partial ordering (per stream partition, similar to Kafka) |
| **BI Exploration & Dashboarding** | Depends on third-party software | Tight integration with its own UI for streaming tables and easy data exploration, app dev, and notebooks; Jupyter integration in development |
| **Source Tables** | In process | In process or over the network (Arrow Flight) |
| **Client Library** | Via ODBC; no support for "code to data"; can use the TAIL operator to react to changes | Language clients (Python, Java, C++) use Deephaven as a library; API supporting a local view of a streaming table; supports "code to data" via column operations |
| **OLTP / OLAP Affinity** | Close affinity with transactional systems / OLTP and application development; row-oriented; SQL and the Postgres (psql) network protocol; CDC/database (transaction-tagged) input sources are central to the consistency model | Close affinity with data science, OLAP, and app dev; column-oriented (even for delta-based updates); Arrow, Parquet, Python, Java |
| **Sources** | Kafka; Debezium; Amazon Kinesis; S3 buckets; PubNub; files (Avro, Protobuf, CSV, JSON); direct Postgres connection | Kafka; Debezium; Apache Parquet; files (CSV); Arrow Flight; other Deephaven workers point-to-point (no broker); ODBC in development |
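The update-cycle row deserves a closer look, since it drives much of the rest of the table. The sketch below simulates the two styles on the same event stream: ticking once per source transaction (logical clock) versus once per fixed time interval (wall clock). It is a hypothetical mini-simulation; the function names and data shapes are invented for illustration and are not either product's API.

```python
# Hypothetical simulation of the two update-cycle styles compared above.

events = [("tx1", "a"), ("tx1", "b"), ("tx2", "c"), ("tx3", "d"), ("tx3", "e")]

def logical_clock_cycles(events):
    """Logical clock: one update cycle per source transaction, so
    downstream results always respect transaction boundaries exactly."""
    cycles, current_tx, batch = [], None, []
    for tx, row in events:
        if tx != current_tx and batch:
            cycles.append(batch)
            batch = []
        current_tx = tx
        batch.append(row)
    if batch:
        cycles.append(batch)
    return cycles

def wall_clock_cycles(events, arrival_times, period):
    """Fixed wall-clock step: one update cycle per time interval,
    batching whatever happened to arrive during that interval."""
    buckets = {}
    for (tx, row), t in zip(events, arrival_times):
        buckets.setdefault(int(t // period), []).append(row)
    return [buckets[k] for k in sorted(buckets)]

print(logical_clock_cycles(events))
# [['a', 'b'], ['c'], ['d', 'e']]
print(wall_clock_cycles(events, [0.1, 0.4, 0.6, 1.2, 1.9], period=1.0))
# [['a', 'b', 'c'], ['d', 'e']]
```

Note how the wall-clock batches can split or merge transactions depending on arrival timing, which is why the logical-clock style pairs naturally with a totally ordered, transaction-tagged consistency model, while the fixed-step style pairs with partial ordering.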

The architectural fundamentals framed above have implications for data transport and for interoperability with other ecosystem tools and user applications. In a companion post, we argue that, when it comes to dynamic table data, providing a framework that complements the engine matters a great deal.