Deephaven remains laser-focused on making real-time data easy for everyone – on its own and coupled with static data.
The team continues to evolve the deephaven-core, barrage, and web-client-ui projects, releasing versions 0.7.0 and 0.8.0 of core at the beginning and end of December, respectively.
To support Community, we have been working on documentation coverage and established a new deephaven-examples GitHub organization to serve as a central warehouse of illustrative use cases. We encourage the community to contribute examples there as well.
Further, the Deephaven YouTube channel continues to grow. Subscribe to view the new content we plan to drop each week.
The description below follows the organization of development themes presented in our 2022 Roadmap.
Highlights
- Removed log4j as a dependency.
- Added support for local arm64 builds.
- Delivered the Deephaven `learn` library, which will allow users to nicely marry Deephaven’s real-time table dynamics with Python AI models.
- Created Docker images for Deephaven+PyTorch, +TensorFlow, and +SciKitLearn, respectively, for easy deployment for data scientists.
- Made new deployment models generally available, including ones particularly suited for local Python development.
- Added input tables to the API and web-UI.
- Meaningfully increased the performance of some aggregation cases (both updating and static).
End-of-January’s deliveries will center around:
- Engine performance improvements and measurement infrastructure.
- A new CSV reader (that we’re proud of).
- A new plugin infrastructure (supporting both server-side and JS entrypoints).
- A new Debezium integration (for CDC).
- A meaningfully re-architected server-side Python experience.
- A table-map implementation that will enable users to create and manage in-memory child tables based on keys of a parent table.
Full Release Summary
General ease of use
New deployment models
- We invested in new deployment models. Docker will remain a fundamental option, and the Envoy proxy will continue to be important for many setups, but we wanted to offer simpler models for running and scaling Deephaven and its web-UI, particularly for local development. To accomplish this, we helped modify gRPC-Java. With that work, we can use Jetty, a Java servlet container, to run the server and to serve the web-UI, which was not possible with the Netty-based setup used pre-release. This allows you to run natively on Mac/Windows or without the indirection of Docker on Linux, simplifying debugging and integration with other local resources. Some related websocket work also makes Deephaven compatible with gRPC-web clients. #1731
Improvements to (nicely simple) data sourcing methods
Users want the sourcing of data to be easy. We do too, so we built a URI-driven method in a library called ResolveTools.
With simple syntax like `t = resolve('dh+plain://address/path/table_name')`, you can inherit real-time, dynamic tables from Deephaven applications, CSVs, Parquet files, and Barrage tables – both locally and from remote sources (including public domains). In the release, we addressed some wiring related to those capabilities. #1706
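To give a sense of shape, here is a minimal Python sketch of URI-based sourcing. The module path and the Parquet URI form are assumptions for illustration; only the `dh+plain` scheme comes from the example above.

```python
# A minimal sketch of URI-based table sourcing. The import path and the
# Parquet URI form are assumptions; the dh+plain scheme matches the example above.
from deephaven.uri import resolve

# Subscribe to a live table published by a remote Deephaven instance.
remote_trades = resolve("dh+plain://host:10000/scope/trades")

# The same resolver can also point at local static sources such as Parquet.
local_prices = resolve("parquet:///data/prices.parquet")
```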
Table replay available (also) in Python
- We added replay capabilities to the Table.Manipulation module, so users can easily configure a dynamic replayer of static tables. #1707
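As a rough sketch of how a replayer might be wired up from Python – the import path, constructor, and method names below are assumptions for illustration:

```python
# A minimal sketch of replaying a static, timestamped table as a live stream.
# The Replayer import path, constructor, and methods are assumptions.
from deephaven.TableManipulation import Replayer

replayer = Replayer(start_time, end_time)             # wall-clock window to replay
replayed = replayer.replay(historical, "Timestamp")   # emit rows as their times pass
replayer.start()
```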
Improved memory-use monitoring
- The PerformanceQueries class provides process performance statistics to users. It is often used to analyze queries. We augmented the statistics it makes available. #1559
Query Engine
Improved aggregation infrastructure
- We improved the performance of many aggregations in both the dynamic-update and static-batch cases. #1726
- We added a new `aggAllBy()` method as catch-all infrastructure to support future aggregations that might be added to the engine. This method allows you to apply the same aggregation to all non-keyed columns. #1618
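From Python, the same idea might look roughly like the sketch below; the module and function names are assumptions, as the source only names the engine-side `aggAllBy()`.

```python
# A minimal sketch of applying one aggregation to every non-key column.
# The Python names (agg module, agg_all_by) are assumptions for illustration.
from deephaven import agg

# Sum every non-key column of `source`, grouped by "Sym", without listing
# the value columns one by one.
totals = source.agg_all_by(agg.sum_(), by=["Sym"])
```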
QST now supports input tables
- Input tables are a utility that makes it easy to integrate custom sources of streaming and batch data into Deephaven. In this release, we added support for input tables via the Query Syntax Tree (QST), the declarative structure implementation available to users.
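To illustrate the input-table idea from Python (the helper names below are assumptions for illustration; the QST work itself is declarative and server-side):

```python
# A minimal sketch of an in-memory input table that downstream queries react to.
# The input_table helper, dtypes module, and add() method are assumptions.
from deephaven import input_table, new_table
from deephaven import dtypes as dht
from deephaven.column import string_col, double_col

# Declare an append-only input table with a fixed schema.
prices = input_table(col_defs={"Sym": dht.string, "Price": dht.double})

# Append rows at runtime; anything derived from `prices` updates automatically.
prices.add(new_table([string_col("Sym", ["AAPL"]), double_col("Price", [172.5])]))
```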
Python, ML, AI, and Data Science
Type casting inside Deephaven’s `learn` module
- Deephaven's `learn` module provides utilities for efficient data transfer between Deephaven tables and Python objects, as well as a framework for using popular machine-learning / deep-learning libraries with Deephaven tables. In this release, we fixed casting, so the work product of ML libraries yields expected types. #1543
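To give a sense of the gather/compute/scatter flow the module enables, here is a sketch; the exact signatures of `Input`, `Output`, the gather helper, and `learn.learn` are assumptions based on the pattern described above.

```python
# A minimal sketch of the learn gather/compute/scatter pattern. Exact
# signatures (Input, Output, gather helpers) are assumptions for illustration.
import numpy as np
from deephaven import learn
from deephaven.learn import gather

def model(features):
    # Stand-in for a PyTorch / TensorFlow / SciKit-Learn model call.
    return features.sum(axis=1)

def table_to_numpy(rows, cols):
    # Gather the requested table slice into a 2-D NumPy array of doubles.
    return gather.table_to_numpy_2d(rows, cols, np_type=np.double)

def numpy_to_table(predictions, idx):
    # Scatter one prediction back into the output column for row idx.
    return predictions[idx]

result = learn.learn(
    table=source,
    model_func=model,
    inputs=[learn.Input(["X1", "X2"], table_to_numpy)],
    outputs=[learn.Output("Pred", numpy_to_table, "double")],
    batch_size=1024,
)
```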
Other Python work
- We fixed an issue whereby a recent refactor removed `where_one_of` functionality, so "OR" filters are back on track (see the sketch after this list). #1650
- Wheels are now being used directly in the build process. #1555
- The jpy configuration will now be implied by the Python environment. #1708
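A quick sketch of the restored "OR" filtering, assuming `where_one_of` is a table method that accepts a list of filter expressions:

```python
# A minimal sketch of an "OR" filter: rows pass if ANY expression matches.
# Treating where_one_of as a Table method with this signature is an assumption.
filtered = source.where_one_of(["Sym = `AAPL`", "Price > 100"])
```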
UI/UX and the JS API
UI-driven input experiences are now available in the web-UI
- We updated the JS API to support gRPC input tables and delivered pretty slick user experiences around adding and changing data manually via the front end. If you pull up an input table in the UI, you can easily add rows and modify data therein. #1565
- We updated Node.js to the latest 14.x release. #1565
Client APIs and the OpenAPI
- We fixed an issue so that modified rows that are shifted are correctly accounted for. #1564
Data Sources and Sinks
- We added support for nested fields in Avro. #1667
- Further, we fixed bugs in the Avro options “mapping” and “mapping_only” in `kafka_consumer.py`. #1656
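For context, here is a sketch of consuming Avro-encoded Kafka values with a field mapping; the `consume` and `avro_spec` call shapes are assumptions, while the "mapping" option name comes from the fix above.

```python
# A minimal sketch of a Kafka + Avro ingest using a field mapping. The consume
# and avro_spec call shapes are assumptions; "mapping" is the option fixed above.
from deephaven import kafka_consumer as kc

orders = kc.consume(
    {"bootstrap.servers": "redpanda:29092",
     "schema.registry.url": "http://redpanda:8081"},
    topic="orders",
    value_spec=kc.avro_spec("orders_record", mapping={"price": "Price", "qty": "Qty"}),
    table_type="append",
)
```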
The Barrage Wire Protocol
- We added support for BigDecimal and BigInteger in the Barrage protocol. #1627
Further reading
To learn more about the documentation changes for each release, see: