Detailed Version Log Deephaven 1.20240517

Note

For information on changes to Deephaven Community, see the GitHub release page.

Certified versions

Certified Version | Notes
1.20240517.437 | The following caveats apply to this release:
  • The updated R/C++ client has not been tested, so use a version from a previous certified build.
  • Podman deployments are not supported in this version.
1.20240517.414 | The following caveats apply to this release:
  • The updated R/C++ client has not been tested, so use a version from a previous certified build.
  • Podman deployments are not supported in this version.
1.20240517.379 | The following caveats apply to this release:
  • The updated R/C++ client has not been tested, so use a version from a previous certified build.
  • Podman deployments are not supported in this version.
1.20240517.344
1.20240517.298
1.20240517.245 | The following caveats apply to this release:
  • The Kubernetes pushAll.sh script tries to push the non-existent deephaven_customer image when pushing the pre-built Deephaven images. To work around this, edit the pushAll.sh script and remove deephaven_customer from the loop on line 29.
  • DH-17681 is included, but system table buffers in the tailer incorrectly use the user buffer size configuration (DataContent.producerBufferSize.user), which defaults to 256KB. See tailer memory parameters for further details on adjusting these values.
1.20240517.222
1.20240517.189 | Core+ Kafka ingesters using consumeToDis incorrectly require the KafkaOffset column (or the column specified by the deephaven.offset.column.name property) to be included in the target schema. If your target schema does not include the KafkaPartition or KafkaTimestamp columns, they must be disabled by setting the deephaven.partition.column.name and deephaven.timestamp.column.name properties to an empty string. This will be corrected in a point release including DH-17561.
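As a rough illustration of the 1.20240517.189 caveat, the Python sketch below works out which property overrides a given target schema would need. The helper name kafka_column_overrides and the sample column list are hypothetical; only the two property names and the default column names (KafkaPartition, KafkaTimestamp) come from the note above. How the resulting properties are passed to the ingester is deliberately not shown.

```python
# Hypothetical helper: only the property names and the default column
# names below are taken from the release note.
DEFAULT_COLUMNS = {
    "deephaven.partition.column.name": "KafkaPartition",
    "deephaven.timestamp.column.name": "KafkaTimestamp",
}


def kafka_column_overrides(schema_columns):
    """Return overrides that disable default Kafka columns missing from the schema.

    A default column is disabled by setting its property to an empty string.
    """
    present = set(schema_columns)
    return {
        prop: ""
        for prop, column in DEFAULT_COLUMNS.items()
        if column not in present
    }


if __name__ == "__main__":
    # Sample schema (an assumption): it has KafkaOffset but neither optional column.
    overrides = kafka_column_overrides(["KafkaOffset", "Symbol", "Price"])
    for key in sorted(overrides):
        print(f"{key}={overrides[key]}")
```

A schema that already contains both KafkaPartition and KafkaTimestamp yields no overrides, so the same helper can be applied to every target schema without special-casing.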

Detailed Version Log: Deephaven v1.20240517

Patch | Details
437 | Merge updates from 1.20231218.524
  • DH-19122: Release note documentation fixes
  • DH-18792: Config server deadlock (fishlib push)
  • DH-18792: Config server deadlock
  • DH-18723: Input Tables cannot paste more rows than number of visible rows
  • DH-18622: Fix controller issue for started-then-deleted PQs
  • DH-17824: Fix docs for restart
  • DH-18954: updateBy ArrayIndexOutOfBoundsException
  • DH-18967: Do not use local -n in locally-run installer scripts
  • DH-17419: Show dashboard modifications with deephaven.ui changes
  • DH-17418: Fix dashboard major/minor notifications
  • DH-18442: Fix Export Logs Fails with Large Number of Queries
  • DH-18830: Update internal VM images to version 7
  • DH-18645: Fix XSS issue in file list drag and drop
  • Update UI packages to v0.78.9
  • DH-18798: Fix token cache growing indefinitely
  • DH-18101: Adding keepalive seconds for win boxes
  • DH-18125: close the LAS logger in DatabaseImpl.appendLiveTable
  • DH-18708: Change gRPC logging exclusion list separator from semicolon to comma
  • DH-18701: Update web packages to v0.78.8
  • DH-18645: Fix panel titles using html instead of just text
  • DH-18346: Fix partial holiday range breaks
  • DH-16016: Fix xBusinessTime throwing errors
  • DH-18176: Suppress scary "non-fatal" warnings; only upload missing files
  • DH-15878: Automatically upload etcd*tar.gz files to remote machines
  • DH-18422: Update generation of iris-endpoints.prop for Podman so Web ACL Editor will work correctly
  • DH-18345: Update USNYSE calendar with national day of mourning for Jimmy Carter
  • DH-18166: Avoid lock inversion in OneClickUtils (swing)
  • DH-17927: Backport DH-16434, automatic node being at the top of the list, to jackson
  • DH-18028: Fix ConnectorWrapper race conditions
  • DH-18468: Wire up kubernetes flag to jenkins for eggplant
  • DH-18519: Allow adding GrpcLogging exclusions via properties and env vars
  • DH-18510: Ensure the exclusions list class names for gRPC logging match inner classes as well
  • DH-18153: Fix bad substitution in installer script error handling function
  • DH-18426: Expose DHLog in global context to allow changing the log level via the browser console
  • DH-16191: Core+ Python Auth Context Methods
  • DH-16872: Fix Web not displaying PQ error restart count >10 correctly
  • DH-18329: Allow user calendars to override system calendars in Core+
  • DH-18187: Fix console history not sticking to bottom
  • DH-18071: Add test to support DeephavenUI dashboards from a code studio.
  • DH-18175: Modified Podman start_command.sh to support Podman on MacOS and to fix --nohup always being applied
  • DH-17696: Added A to start_command.sh dig call, to ensure IPv4 address is retrieved.
  • DH-17932: Change array handling and add label searches to dh_helm uninstall functions
  • DH-16189: Fix deephaven.ui panels when permissions change
  • DH-17936: Warn when DH_JAVA is set to an invalid path
  • DH-17798: Pin deephaven.ui version to 0.15.4
  • DH-16150: deephaven.ui in Enterprise
  • DH-17292: Fix tables opened with deephaven.ui throw error when disconnected
  • DH-17880: Change Podman start_command.sh default behavior to preserve existing properties
  • DH-17977: Add volume options to Podman start_command.sh for illumon.d/java_lib and illumon.d/calendars volumes
  • DH-17999: Fix coreplus_test_query.py nightly test
436 | DH-18956: Fix eggplant nightly test compilation
435 | DH-19110: Fix R client methods live_table and historical_table
434 | DH-18368: Closing a ui.dashboard in a code studio shouldn't prompt
433 | DH-18694: The gear icon on login screen appears to be broken in web with saml auth plugin
432 | DH-18279: Fix unstable one click plot titles in web
431 | DH-18893: Update deephaven-hash to 0.3.0
430 | DH-18778: Fix wrong Assert import used.
429 | DH-18727: Fix unbound variable in podman for envoy admin
      DH-18729: Podman change image creation to be single-user
428 | DH-18930: handle SocketException, Connection reset better in LAS client
427 | DH-18988: Python Core+ auth client should log client UUID to facilitate troubleshooting
426 | DH-16352: Ability to disable password authentication in web UI
425 | DH-19061: Update web packages to v0.85.17
      DH-18856: Support Median Aggregation in the UI
      DH-18681: Fix Inability to delete Input Table row with null key
424 | DH-18778: Javadoc fix.
423 | DH-19041: Updates to the version log generation script
422 | DH-19052: Core+ test stage should not cause Jenkins pipeline failures.
421 | DH-18778: Cannot apply ACLs to Rollups or Trees
420 | DH-18756: Make sure that internal partitions do not have invalid characters.
419 | DH-19028: Update Grizzly to Core 0.38.0
      DH-18351: Add CountWhere feature to UpdateBy Rolling and Cumulative
      DH-18491: Implement optimized FirstBy / LastBy + NaturalJoin feature
      DH-18414: SourcePartitionedTable: Allow deferred existence checking and add partitioning columns
      DH-18856: Add Median to jsapi Aggregation Options
418 | DH-18989: Misconfiguring git repositories results in opaque controller crash loop and inability to login
417 | DH-18754: Web API Service should create workers off-thread, avoid blocking other calls
416 | DH-18815: Enable TCP KeepAlive for clients downstream of Envoy
415 | DH-18986: Fix bugs in Py Client's controller module
414 | DH-18941: NPE in local swing console log when creating a new PQ
413 | DH-18001: correct consumerBufferSize calculation
412 | DH-19007: Adjust formatting in Grizzly Release Notes to be Salmon-compatible
411 | DH-19009: Fix EggplantTask gradle wiring
410 | DH-19006: Disable DynamicTableWriter Integration test for now.
409 | DH-18682: Make InputTableUpdater a LivenessReferent, able to materialize input table views
408 | DH-18001: Tailer reduces buffer size to match DIS
      DH-18751: report on tailer/DIS buffer configuration errors better
407 | DH-14692: Stop exporting client from support logs
406 | DH-18744: Add support for VPlus to Grizzly upgrade of Podman deployments
405 | DH-18838: port fishlib Client fix to Deephaven
404 | DH-18981: Update CODEOWNERS for Enterprise.
403 | DH-18669: Revert scoping changes from DH-18877
402 | DH-18946: Make dh_preinstall_check require/install rsync if needed
      DH-16487: Validate installer host uses rsync>=3.1
401 | DH-18669: Improve handling of internal partition column
400 | DH-18877: UncoalescedBlinkTable must manage its source, Use LivenessScope.computeEnclosed in DatabaseImpl
399 | DH-18439: Introduce dh_preinstall_check.sh for early installation validation
398 | DH-18800: Reliability changes for automation and integration tests.
397 | DH-18867: Include the proto-wrappers dir in the C++ client source tarball
396 | DH-18754: Add missing JS client details for replica instances
395 | DH-18897: Check condition for PqStorageEtcdV2Impl delete is wrong
394 | DH-18794: Allow kubernetes flags to align input params to eggplant manager
393 | DH-18747: Per-worker pod cleanup delay
392 | DH-18794: Investigate bhs not contactable by eggplant
391 | DH-18709: Performance under load of Controller Query Tracker operations
390 | DH-18691: Automatically migrate sql->etcd acls
389 | DH-18620: Provide means to change timeout to get properties from etcd and increase default
388 | DH-18660: Scripts for load testing of Controller subscriptions and PQ restart, stop and modify requests
387 | DH-18741: Improved Error Message when Shortcuts Fail to Load
386 | DH-18732: Fix XSS issue in file list drag and drop
385 | DH-18596: Fix MultiScriptPathLoader#getAvailableScriptPaths caching
384 | DH-18569: Allow configuration of worker pod toleration on authenticated username
382 | Update web UI packages to v0.85.15
  • DH-18798: Fix token cache growing unbounded
381 | DH-18820: Fix jenkins publish docs script
380 | DH-18758: Fix C++ Client Docker build
379 | DH-18639: Python Core+ client ControllerClient API improvements
378 | DH-18722: Update Core+ to 0.37.6
      DH-18539: Fix incorrect snapshot results on historical sorted rollups
      DH-18632: JS ChartData should write updates to the correct row
377 | DH-18661: Update README for buildUtils/jenkins, delete dead code
376 | DH-18702: Update web version to v0.85.14
      DH-18645: Fix panel titles using HTML instead of just text
375 | DH-18480: correction to table filter
374 | DH-18467: Automate qa server upgrades
373 | DH-18148: Add new Jenkins stage + publishing script for javadoc/pydoc uploads to docs site
372 | DH-18558: QA k8s clusters have a calendar problem with Swing UI
371 | DH-18612: Merge forward Podman functionality from VPlus.509
      DH-18422: Update generation of iris-endpoints.prop for Podman so Web ACL Editor will work correctly
      DH-18175: Modified Podman start_command.sh to support Podman on MacOS and to fix --nohup always being applied
      DH-17696: Added A to start_command.sh dig call, to ensure IPv4 address is retrieved.
      DH-17880: Change Podman start_command.sh default behavior to preserve existing properties
      DH-17977: Add volume options to Podman start_command.sh for illumon.d/java_lib and illumon.d/calendars volumes
      DH-17999: Fix coreplus_test_query.py nightly test
370 | DH-18303: Fix python sourcing of Test Automation script
369 | DH-18668: Update GoWorker
368 | DH-18595: Fix Kubernetes install affected by change to iris_keygen.sh inputs
367 | DH-900: Improve jenkins reporting build failures to slack
366 | DH-17398: Fix Pandas Dataframe Restart on Disconnect
365 | DH-18567: Core+ integration test for deleted system partitions
364 | DH-18616: Update Core+ to 0.37.5
      DH-18486: JS clients must not release tickets for pending operations (#6618)
      DH-17486: Support whole range of byte values for booleans (#6624)
      DH-18588: Allow shortened Name.rootFile config file specifier (#6621)
      DH-18174: Delay reading from parquet file when creating table and column location (#6622)
      DH-18567: Filter locations before removal allowed checks. (#6607) (#6614)
      DH-18482: EventDrivenUpdateGraph: Allow two concurrent threads to safely requestRefresh(), and ensure clean errors for nested refresh attempts (#6603) (#6615)
      DH-18300: Improve DataIndex performance. (#6585) (#6593)
363 | DH-18555: Fix controller_import pydoc
362 | DH-18614: Cherrypick start_command fixes for podman
361 | DH-18319: Add publication to jfrog for podman
360 | DH-18598: Update jenkins images to v6
359 | DH-18538: Fix Deephaven Express not respecting WebGL flag
358 | DH-18097: Add podman publishing to release pipeline
357 | DH-18097: Add support for building and publishing a Podman images tar.gz
      DH-18314: Add Podman to CI build image
      DH-18319: Fix CI build image README and clean up Dockerfile
356 | DH-18456: Make generation of protobuf descriptor sets ("protosets") part of proto gradle rules
355 | DH-18540: Fix changelog generation script link generation
354 | DH-18528: Core+ C++ Client: User documentation and examples
353 | DH-18526: Core+ C++ Client: Protobuf type pun generator adds more Doxygen comments
352 | DH-18527: Core+ C++ Client: Add/modify some more C++ documentation comments
351 | DH-16929: Core+ DataIndex should read RowSets from Index Codec columns directly
350 | DH-18362: Update org.jboss.resteasy to 5.0.9 to address vulnerability
349 | DH-18338: Core+ C++ Client: Remove Immer from public exports
348 | DH-18458: Core+ C++ Client: Upgrade Catch to v3.8.0
347 | DH-18278: Add merge KeyFormula option to Core+ Kafka schema helper
346 | Upgrade deephaven.ui to v0.23.3, deephaven-plugin-plotly-express to v0.12.1
  • DH-18058: Support ui 0.23.3 in Grizzly
  • DH-18282: Patch Grizzly's dh.ui error related to Core breaking change
  • DH-18404: Update deephaven-plugin-plotly-express to v0.12.1
  • DH-17342: Fix widget not found error with deephaven.ui
  • DH-17476: Fix dh.ui Code Studio crash
345 | DH-18459: Core+ R Client: Typo in demo script
344 | DH-18401: Update Core+ to 0.37.4
      DH-18433: SourcePartitionedTable needs to check the size of pending locations even if there are no added or removed locations
343 | DH-18401: Update Core+ to 0.37.3
      DH-18395: Prefer bulk-unmanage whenever LivenessManager.unmanage is called on multiple referents
      DH-18385: correct OuterJoinTools.leftOuterJoin/fullOuterJoin() output when RHS initially empty
      DH-18389: Chart subscriptions shouldn't reverse positions
342 | DH-18407: Core+ C++ Client: Add more doxygen comments
341 | DH-18406: Core+ C++ Client: Fix a couple typos in docker-build.sh
340 | Upgrade Web UI to v0.85.12
  • DH-18387: Backport "Thousands" format option to Grizzly
339 | DH-18340: Add ControllerClientGrpc to javadoc
338 | DH-18318: DndQueryInitializer must mark script execution thread as systemic
337 | DH-18337: C++ Core+ Client: GetUserAndPassword() should use std::cin::getline()
336 | DH-18032: Remove unnecessary dependency on numpy in Core+ Py client
335 | DH-18372: Update Core+ to 0.37.2
      DH-18373: Fix jpy python search paths
      DH-18331: JS API viewports should throw only for basic data issues
      DH-18178: Ensure that DataIndexes produced by a RegionedColumnSourceManager are retained by the DataIndexer
      DH-18194: Fios jsapi type errors from 0.37.1
      DH-18364: Add Property to Disable Core DataIndex
334 | DH-18103: Fix typings in slow login cancel button
333 | DH-18103: Show a Cancel button on slow login
332 | DH-17675: Fix Web not displaying PQ error restart count >10 correctly
      DH-18325: Fix web UI incorrectly sets error restart count to 11 instead of infinite (-1)
331 | DH-18310: Make getPQ methods public in DnDSessionFactoryBase
330 | DH-18301: Add some Doxygen comments to the C++ Core+ Client
329 | DH-18302: Improve Core+ C++ client proto wrapper generator for better autogen doc
328 | DH-17853: Additional certificates fixes found while running test plan
      DH-17666: Split container tests and nightly integration tests into separate tasks
327 | DH-10177: correct typo in readme
326 | DH-17853: Run validate_certificates.sh during install, add bugfixes and tests for iris_keygen.sh
325 | DH-18211: Use correct offset in snapshot
324 | DH-18202: Enable Systemic object tracking by default in Core+
323 | DH-18066: Update Core+ to 0.37.1
322 | DH-18184: Fix input tables pasting more rows than visible
321 | DH-18182: Python Core+ RemoteTableBuilder blink method calls the wrong Java RemoteTableBuilder method
320 | DH-18162: Fix issues where OnDiskDeephavenTableLocation would incorrectly say there was an available Data Index
319 | DH-18127: Add clearLocationCache() methods to Core+ db.
318 | DH-18151: Fix Web UI breaking on replicated WebClientData with non-null ReplicaSlot
317 | DH-18148: Split Enterprise Py Docs into separate ones for worker and client
316 | DH-18018: Fix Web UI breaking on replicated WebClientData with duplicate serials
315 | DH-18118: Restore idempotency of user acl in TestAutomation install
314 | DH-18096: Ignore version mismatch in DataGen test for re-install
313 | Merge updates from 1.20231218.491
  • DH-18025: Add missing gradle inputs for web dependencies
  • DH-18075: Disable certificate-validation script
  • DH-18054: Improve validate_certificates.sh script for older OSes
  • DH-18035: Remove local - from installer scripts
  • DH-17852: Add validation that truststore contains desired certs; ensure all web cert intermediates in new truststore
  • DH-17822: Use ubuntu instead of centos for installer tests
  • DH-17890: Fix issue where PQ crashing outside schedule skips next start
  • DH-17822: Update iris-defaults.prop to use python3.8 by default
  • DH-17921: Update GWT-RPC to avoid websocket reuse bug
  • DH-17757: make csv import sensitive to CopyTable schemas
  • DH-17822: fix python setuptools, remove python 3.6 and 3.7
  • DH-17995: Pull back filesystem validation in etcdctl.sh
  • DH-17990: Lock inversion deadlock in WorkerLeaseHandler
  • DH-17974: WouldMatch memo key is incorrect
  • DH-17952: Improve Merge DataIndexer consumeTable performance
  • DH-18002: Move QA SAML Instructions into repo to sync with releases
  • DH-18004: Add explicit dependency from coreplus client to numpy to track upstream dependencies
  • DH-18003: Pull back username-as-group fix from DH-17754
  • Changelog fix.
  • DH-17093: Discard failed promises in CompilerTools.
  • DH-17951: Make InternalDeployer stop using username for group in chown
  • DH-17949: Backport DH-17481 Core+ Python SystemTableLogger codec support
312 | DH-17834: Update QA results input table support to core+ python
311 | DH-17968: Make python notebook import WorkspaceDataSnapshot aware
310 | DH-17989: Improve Legacy ResolveTools usage
309 | DH-18040: Fix Panels Menu Error in Safari
308 | DH-18019: Fix test backup script test file location
307 | DH-18022: Prefer Deephaven admin user account to cp files in config_packager.sh
306 | DH-18020: Update QA Results data loads to include junit
305 | DH-18016: Fix merge/validate queries on QA cluster
304 | DH-17987: Add dhconfig schema tests WRT missing namespaces
303 | DH-17980: QA - add new dhconfig dis test coverage
302 | DH-17982: Add dhconfig props tests for alias names
301 | DH-17954: Fix Query Monitor Throws Error for Set Difference
300 | DH-17832: Upgrade Legacy Arrow jars to 17.0.0
299 | DH-17935: Put quotes around --exclude="java*,jna*"
298 | DH-17962: Fix Panels Menu Status not Updating
297 | DH-17920: Expose challenge-response login through JS API
296 | DH-17948: Python API upload_key does not work with keys generated with generate_keypair
295 | DH-17945: Add infrastructure for autogenerating Protobuf documentation
294 | DH-17905: Fix remaining ubuntu20-infra kafka test failure
293 | Merge updates from 1.20231218.478
  • DH-17933: Fix java 8 compilation issues in JpyInit
  • DH-17873: add --nohup option to Podman start_command.sh
  • DH-17886: add option to Podman start_command.sh to mount /db/IntradayUser volume
  • DH-17885: add option to Podman start_command.sh to mount /db/Users volume
  • DH-17835: remove writability check of volume directories in Podman start_command.sh
  • DH-17903: ClassCastException reading parquet file in Legacy
  • DH-17928: Fix QueryScheduler token warning
  • DH-17929: Fix extra character in TestDefinition
  • DH-17915: Fix legacy barrage subscriptions for rows with empty object arrays of non-Object type
  • DH-17824: Fix podman redeployments when logs are stored on a volume
  • DH-17902: QA DNS name utility enhancement
  • DH-17791: Modified configurations lose creation time
  • DH-17909: Increase performance overview test wait time
  • DH-17901: Enable legacy python to lookup location of libpython.so
  • DH-17499: Fix several dh_helm problems and improve usability when used with values.yaml
  • DH-17894: Updates to QA DNS name utility
  • DH-17887: deephaven_enterprise.remote_table should return a python deephaven.table.Table object
  • DH-17883: Relocate QA DNS Utility to more appropriate location
  • DH-17811: Eggplant SUT setup - cleanup final cmds.
  • DH-17864: Fix missing tests on integration runs
  • DH-17589: Fix summary table on qa-results
  • DH-17811: Setup scripts for new SUT boxes to use for Eggplant tests
  • DH-17849: Set eggplant VM size in correct location
  • DH-17635: Create utility to manage virtual names for QA Test results servers
  • DH-17601: Setup auditable dashboards for junit tests on qa-results
  • DH-17830: Stop pip from attempting to check PyPI during container initialization
  • DH-17707: CART does not schedule reconnection on some failures.
  • DH-17717: Check for Connection in com.illumon.iris.db.util.config.TableInputHandler#getTableRaw
  • DH-17744: Remove setuptools.extern from legacy python
  • DH-17111: Add better handling for known error case
  • DH-17626: Add Eggplant nightly jenkins job
  • DH-17718: Add atexit handler to shutdown workers rapidly
  • DH-17430: Handle trailing metadata to produce better error messages in python client
  • DH-16939: More error message improvements
  • DH-17770: Installer jar needs to be republished to io.deephaven.enterprise
  • DH-17795: Fixed passing script text as script name to classloader in Core+
  • DH-17697: Support volume for /var/log/deephaven and custom volumes in podman deployment
  • DH-17687: Allow incremental include filters in TestAutomation runs
  • DH-17589: Fix summary table on qa-results
292 | DH-17857: mac developers cannot deploy branch due to --transform argument sent to tar
291 | DH-17762: Add grouping and symbolTable options to Core+ Kafka schema helper
290 | DH-17754: Do not allow installer to use username as a groupname
289 | DH-17934: Fix DBAclServiceProviderTest failures
288 | DH-17917: Improved worker pod lifecycle management
287 | DH-17765: Fix spares sorting order in the status indicator in QM
286 | DH-17912: better checking for duplicates in 'dhconfig properties import'
285 | DH-17930: TimePartitionRotationTest fails because Date is not within current window
284 | DH-17882: report on invalid namespaces in 'dhconfig schema list/export'
283 | DH-17847: Release note for config changes needed in k8s environment
282 | DH-17859, DH-17860, DH-17862: Improve performance overview overnight analysis and error logging
281 | DH-17633: Update podman deployment for 1.20240517 (Grizzly)
280 | DH-17916: Legacy Python Session Fails to Start on Kubernetes
279 | DH-14183: Fix Panels Menu Memory Issue
278 | DH-10177: make sure 'dhconfig properties delete' won't delete a file that 'sanitizes' to the same name
277 | DH-17906: Run Helm Upgrade as part of Minikube Nightly Test
276 | DH-17905: Add jq for Kafka nightly tests
275 | DH-17616: Reduce duplicates in completed binlog cleanup logs
274 | DH-17888: Forward merge test automation to align qa-results
273 | DH-17847: Config changes for easier future kubernetes upgrades
      DH-17872: Fix broken Core+ merge workers on Kubernetes
272 | DH-15651: Web Support for Custom Favicon and Browser Tab Title
271 | DH-17603: Cron tasks adding junit-test readin to qa-results
270 | DH-17801: Fix typo in iris_keygen.sh
269 | DH-17416: Prevent Console Connect while Client Disconnected
268 | DH-10177: add 'dhconfig properties delete'
      DH-17733: 'dhconfig dis' improvements to handling invalid names
      DH-17827: dhconfig schema export NPE when exporting single non-existent schema
267 | DH-17826: Add no-undef eslint rule to catch missing imports
266 | DH-17616: Log completed binlog filenames for easier maintenance
265 | DH-17813: Generate C++ protos as part of Core+ C++ client build
264 | DH-17730: Fix Incorrect Date Formatting on Partitioned Tables
263 | DH-14183: Fix Panels Menu React Spectrum Bug
262 | DH-17792: Tailer should use direct buffers.
      DH-17786: Correct system-tables incorrectly using user allocation properties in tailer.
261 | DH-17743: Avoid crashing Dispatcher by hung worker
260 | DH-17537: Fix Advanced Filters broken on tree tables
259 | DH-17809: Fix exception in the browser console on non-running query updates
258 | DH-14183: Fix Panel Menu Performance with Many PQs
257 | DH-17808: Update QA-HOWTO command with sudo
256 | DH-17749: Add Unit tests for Iceberg Integration
255 | DH-17773: Cleanup cmake warnings in Core+ cpp-client
254 | DH-17672: Changed ACL Editor behavior when Managed User Authentication is enabled
      DH-17721: Changed auth_server_reload_tool to verify ECDSA keys
253 | DH-17177: Update deprecated JSAPI calls to use .designated
252 | DH-17723, DH-17747: Rationalize NameResolver configuration related to authority overrides
251 | DH-17748: Fix project coordinates for republishing
250 | DH-17764: Fix draft de-selection when not in the viewport
249 | DH-17751: Push of deephaven_customer docker image to repo not required
248 | DH-17771: Updates to core+ C++ and R readme files
247 | DH-17760: KafkaTableWriter NPE with entirely filtered data
      DH-17761: Unused ZoneId in TimePartitionRotation
246 | DH-17320: Added support for almalinux (alma8 and alma9 in cluster.cnf)
245 | DH-17748: Remove jdk8 from republishing
244 | DH-17598: Add Iceberg table support to Core+ historical tables.
243 | DH-17744: Remove setuptools.extern from legacy python
242 | DH-17582: Replica selection for UI panels
241 | DH-17254: Adjust iris start script for k8s environments
240 | DH-17681: Release note update.
239 | DH-17719: Fix R client after Windows changes broke it.
238 | Merge updates from 1.20231218.446
  • DH-17688: Fix PQ imports for eggplant
  • DH-17664: Disable some inconsistent controller tests
  • DH-17657: Fix default DH_ETCD_USER value in dh_users script
  • DH-17654: PersistentQueryConfigTableFactory per-client tables must override satisfied.
  • DH-17030: Add single-server, non-root accounts, and Envoy support to podman deployments
  • DH-17638: Fixed WebClientData query Reconnects to Controller using incorrect UserContext
  • DH-17630: Update Core+ to 0.35.2 (superseded by 0.36.1)
  • DH-17622: DeferredACLTable must copy filters (cherry-pick)
  • DH-17609: pushAll.sh should allow a source tag
  • DH-17557: Include db_query and db_merge images.
  • Release note formatting fix.
  • DH-17634: Fixed Web API Server Reconnects to Controller using Incorrect Context
  • DH-17623: Core+ Performance Overview has Bad Error Message on V+
  • DH-17604: Allow int-tests to setup gwt tests
  • DH-17559: Republish should capture Installation Media (tar files)
  • DH-17608: Tighten permissions for java plugins
237 | DH-17712: Changes to C++ and R Core+ clients for Core+ Excel Add In
236 | DH-17700: Fixed a race condition delivering PQ status updates during startup swaps
235 | DH-17660: Web Schedule Disable Overnight if Stop Time Disabled
234 | DH-17558: Re-enable merge/validate queries for the ResourceUtilization table
      DH-17619: Improve qa-results setup instructions
233 | DH-17681: Tailer should use pool for system tables
232 | DH-17652: Setup default integration-test to use AutoQuery
231 | DH-17595: Included input validation in SchemaService methods
230 | DH-16398: Fixed client details in Audit log for Connection Disconnect Events
229 | DH-17674: Allow specifying SAML redirect URLs separately
228 | DH-17456, DH-17432: Add tools for managing DHE schema from Core+ Kafka workers
227 | DH-17673: Ensure coreplus generate_scheduling generates pydocs
226 | DH-17620: Update Core+ to 0.36.1
225 | DH-17648: Include deephaven-extensions-json-jackson Jar in Core+ Build
224 | DH-17309: Add major version upgrade test
223 | DH-17454: Fix PQ Editor display in Code Studio
222 | DH-17638: Fixed WebClientData query Reconnects to Controller using incorrect UserContext
221 | DH-17634: Fixed Web API Server Reconnects to Controller using Incorrect Context (cherry-pick from V+)
220 | DH-17422: Fix deephaven.ui panels not refreshing on permissions change
219 | DH-17622: DeferredACLTable must copy filters
      DH-17623: Cherry-pick Core+ Performance Overview has Bad Error Message
218 | DH-17591: Programmatic core+ python scheduler array creation
217 | Apply spotless
216 | Merge updates from 1.20231218.432
  • DH-15896: Update build instructions for a qa-results system based on testing of Junit ticket
  • DH-17443: prohibit --password from being given more than once
  • DH-17496: Additional fixes to writing vectors for Core+ support
  • DH-17496: Fix writing Vectors to User tables and reading parquet arrays in legacy workers
  • DH-17518: Fix dependent scheduling stop-time restart issue
  • DH-17322: Restrict appendCentral by ACL group membership
  • DH-17408: pause tailer connections
  • DH-17467: Missed path on handling *-OLD directories
  • DH-17583: Replace a stray jcenter() with mavenCentral() in gradle
  • DH-17557: Build and upload container images to GS Buckets
  • DH-15624: Correct tolerations applied to Envoy.
  • DH-17505: Allow data managers to command DIS truncate
  • DH-17568: Fix typos in cluster monitoring queries
  • DH-17550: EggplantIntTestSetup should not pass --prodTests flag
  • DH-17539: Do not use sudo with -g flag to invoke chgrp
  • DH-17551: Qa results metrics add new release
  • DH-17540: Update merge/validate queries for Test Automation
  • DH-17435: Improve installer test robustness/feedback
  • DH-17542: Fix test results to handle non-zero exit status
  • DH-17541: Update test results server build instructions
  • DH-15896: Track unit tests more accurately
  • DH-15624: Add support for tolerations, selectors, and affinity in Helm chart
  • DH-17483: Fix run counter logic on qa-results
  • DH-17343: Make installing from infra node as irisadmin work (plus test)
  • DH-14499: Containerized deployment with podman
  • DH-17120: Add qualified references to etcdctl in installer scripts
  • DH-16353: Ability to disable password authentication in front-end (swing)
  • DH-17498: Fix for dhconfig NPE introduced in 406.
  • DH-16827: Parameterized Queries listen for OneClick events (swing)
  • DH-17541: Check diskspace before unpacking large tar files
  • DH-17055: dhctl checks for disabled tailer ports when scanning
  • DH-17443: remove auth options from dhconfig checkpoint
  • DH-17498: remove duplicate status and garbage logging from dhctl
  • DH-17506: Do not treat a default INSTALLER_ROUTE value as a user override value
  • DH-17504: Fix disabled context menu items for superusers in Query Monitor
  • DH-17493: Fix controller_tool test 11 for all supported java
  • DH-17485: Fix Web Temp Schedule
  • DH-17373: Add DH_NODE_N_INSTALLER_ROUTE for installs from bastion
  • DH-16001: Enforce logDirectory with zoneId in Core+ SystemTableLogger builder
  • DH-17120: Add DH_DIR_ETCD_BIN to control where etcd binaries are found
  • DH-17232: Do not call require_owner if DH_SSH_USER not set
  • DH-17272: Make V+ Core+ Python Client Compatible with non-Envoy Grizzly
  • DH-17463: Update Core+ to 0.33.6
  • DH-17445: Allow config Property to override ServiceRegistry hostname
  • DH-17056: Allow Endpoint config to override ServiceRegistry hostname
  • DH-17279: Add options to disable WebGL
  • DH-17420: Fix error with context menu filter on TreeTables
  • DH-17414: Dispatcher should log cancellation reason
  • DH-17395: Fixed an issue reading old parquet files with improper dictionary offsets. Fixed an issue reading nulls in INT96 encoding
  • DH-17400: Use --verbose flags when installation scripts invoke dhconfig
  • DH-17353: Remove centos test coverage
  • DH-17413: Fix bad string substitution when ssh keys have -v in them
  • DH-17288: Fix Exception When Importing a Jackson Query
  • DH-17353: Deprecate centos7, remove centos nightly tests
  • DH-17378: Fix monit log file location on rocky/rhel OS
  • DH-17372: Fix a bug in internal capacities of UpdateBy
  • DH-17377: avoid location subscriptions in closeAndDeleteCentral
  • DH-16495: handle reference counts while processing pending request snapshots
  • DH-17317: Updates to jackson gen loggers test
  • DH-17291: initialize BasicTableEventHandlerFactory earlier
215DH-17555: Additional fix for build testing
214DH-17617: Forward-merge of Test Automation changes
213DH-17133: Fix Concurrent Modification on Kill WebClientData
212DH-17538: Fix merge/validate queries for testing
211DH-17584: Allow KafkaTableWriter Transform to Skip Rows
DH-17605: KafkaTableWriter transformation can swallow errors
210DH-17590: Fix Chart Builder
209DH-17555: Automate testing for DH16283
208DH-17383: Add dependent scheduler unit tests around the stop time
207DH-17585: Remove incorrect columns from ResourceUtilization validator
206DH-17255: Scheduling error restart delay should apply initialization start times as delay
205DH-17572: Edit release notes for clarity.
204DH-17512: ServerSelectionProvider "Administratively Down" command
DH-17513: ServerSelectionProvider should back-off on servers that fail to acquire worker.
DH-17536: Controller should reload Configuration after leader Election
203DH-17561: KafkaTableWriter.consumeToDis not ignoring default Kafka columns
202DH-17520: dhconfig schema list should return non-zero exit code when no schemas are found
201DH-17501: Fix broken integration tests
200DH-17410: Fix issues with skip-if-unsuccessful scheduling flag
199DH-17527: Active Directory Group Synchronization
198DH-17533: Fix misleading logging on MySQL dhconfig acl deletions
197DH-16937: Fix Data Routing Configuration Errors to be less Verbose
196DH-17501: check-deephaven-cluster should validate each authentication server individually
195DH-16283: handle failure to create AEL better
194DH-16398: Fixed Audit Log Formatting for Client events
193DH-17071: Fix schema service error reporting for addSchema method call
192DH-17477: Minikube test needs additional logging
191DH-17508: dhconfig pq status ignores the --replica flag for name based filtering
190DH-17409: Normalize ACL columns string order and duplicates
189Update deephaven.ui version to 0.19.0
  • DH-17411: Fix disappearing panel on query restart
  • DH-17422: Fix needing to refresh dashboard to see contents after PQ start
188DH-17393: Update MatPlotLib plugin to v0.5.0
187DH-16506: Fix repeat-scheduling issue
186DH-17503: Update Grizzly to Core 0.35.2
185DH-17504: Fix disabled context menu items for superusers in Query Monitor
184DH-17481: Fix for pydoc
183DH-17438: Core+ Python Client should properly handle gRPC UNAVAILABLE
182DH-17487: Create dhconfig tests for migrated controller_tool functionality
181DH-17477: Minikube test needs additional logging
180DH-17455: Add property to optionally specify addresses to bind to for xDS in config server
179DH-17481: Add Python support for custom codecs in SystemTableLogger
178DH-17436: Fix Range Schedule Modified Time Zone
177DH-17442: correct return code for command line errors
176DH-17473: Add --verbose flags to locations where the installer invokes dhconfig/iris
175DH-17475: Updates to Release Notes
174DH-17472: Fix paths in DhcInDhe/R/rdnd/docker-build.sh
173DH-17442: reduce exception logging
172DH-17468: Add timeout option to check-deephaven-cluster script and use it in tests
171DH-17367: Deprecate Legacy ObjectCodec and use Core ObjectCodec directly. Full support for custom codecs in SystemTableLogger
170DH-17436: Cleanup Web Range Schedule UI
169DH-17464: Update Core to 0.35.1
168DH-17447: Fix more dhconfig obscure acl error messages
167DH-17457: Remove plaintext password options from dhconfig
166DH-17462: Remove explicit transitive private dependencies of Core C++ client from Core+ C++ client cmakefiles
165DH-16180: Fix QM overlay z-index
164DH-17449: Fix Export exporting multiple copies of the same query
163DH-17436: Fix Web Range Schedule Save Bug
162DH-17436: Web UI for Range Schedule
161DH-17448: Display a warning on query-only operations when selection contains replicas
160DH-17192: Add dhconfig pq leader tool
159DH-17180: Restart Selection should only restart Selected Replica
158DH-16646: Synchronize Multiple Query Monitors
157DH-16506: Add range query scheduling
156Merge updates from 1.20231218.385
  • DH-17164: Fix JsTreeTable Fails when Same Filter is Applied Twice
  • DH-17407: Fix Temurin repo setup for RHEL/Rocky
  • DH-16346: Fix Validate Settings Tab for View Only Mode
  • DH-17369: Convert qa-results scripts to corePlus and python
155DH-17434: Fixed race condition on Controller worker recovery and slot reassignment
154DH-17083: Import LivenessScope only under worker context
153DH-17080: Revert workaround to avoid skipping second controller
152DH-16011: Fixed unit test failures
151DH-17165: Add unit test prop
150DH-16011, DH-17271: More usability fixes for dhconfig acl
149DH-17337: Controller should swap in spares when user requests individual slot restarts
148DH-17165: Exclude input_table_snapshotter_tasks.txt from CUS digest
147DH-17402: Fix the load balancing tab not showing on drafts
146DH-17250: Add backup script option, fix possible race condition in backup process
145DH-16180: Add LoadingOverlay on query selection change in Query Monitor
144DH-17402: Fix the load balancing tab not displaying when it should
143DH-17405: Updates to Grizzly release notes
142DH-17083: Allow convenient barrage session functionality from a Core+ worker
141DH-16654: Added more resilient behavior for taking over service registry leases
140DH-17399: Ensure nightly build boxes are not deleted when tests fail
139DH-17315: Allow Restart Delay to be Blank with Validation
138DH-17312: Update Grizzly Core to 0.35.0
137DH-17385: Update Core+ plugins for Grizzly
136DH-17391: Test fixes.
135DH-16531: documentation updates
134DH-17244: Fix Core+ user table nightly int test failures
133DH-17248: Allow Dashboards to Export with Warning for Missing Queries
132DH-17379: Add helm chart unit tests for k8s development
131DH-12786: Move Remaining Web UI Tests from Enzyme to Testing Library
130DH-17310: Display Version on Safe Mode
129DH-17287: Python support for remote_table blink and filter
128DH-17376: Update Core+ cpp-client docker build script help
127DH-17321: Support multiple CORS origins for ACL Editor
126Merge updates from 1.20231218.381
  • DH-17359: Fixed random Test failures noticed for csv Custom Setters
125DH-17348: Update gRPC (1.61.0 -> 1.61.1) and netty (2.0.62 -> 2.0.65)
124DH-16011: Lots of quality of life fixes for dhconfig acl
123Merge updates from 1.20231218.380
  • DH-17356: Core+ Logger not handling parameters appropriately
  • DH-16737: Fix package-lock.json file that was erroneously generated
  • DH-16346: Fix Query Monitor Right Click Menu for View Only Query
  • DH-17357: Core+ workers should listen on all interfaces in Bare Metal
  • DH-17347: Core+ kafka ingester NPE with transformation
  • DH-17334: Cherry pick CART improvements from 1.20211129.422
122DH-17203: Cluster Validation Client Script bug fixes
121DH-17365: Grizzly Minikube Install Test Fix
120DH-17285: Sort list of servers within Controller
119DH-17364: Remove unused constant in PersistentQueryControllerClient
118DH-16343: Check column partition exclusivity on set not get.
117DH-17362: Update Web Version to v0.85.0
DH-17199: Fix filter by value in the tree table context menu always showing null
DH-17095: Fix unable to edit cells if key columns are not first column
116DH-17250: Add support for etcd auto-backup config and fix backup script on k8s
115DH-17361: corrections to merge from vplus to grizzly
114DH-17271: Test property file fix.
113DH-17112: Fix for Core+ worker launch without intermediate script.
112DH-17271: Fix for installation.
111DH-12786: Move Some Web UI Tests from Enzyme to Testing Library
110DH-17271: Add more safety around importing of public keys
109Fix eslint and web unit tests
108Spotless application.
107Merge updates from 1.20231218.374
  • DH-17303: Add primitive and String array support to Core+ SystemTableLogger
  • DH-17327: 'dhconfig dis export' handles empty set better
  • DH-17238: combine nested table filters in table data services
  • DH-17332: Fix for QA meta results bug in .362
  • DH-16987: Fix client-only etcd update scripts
  • DH-17295: Make rhel8 nightly tests less flaky
  • DH-17273: Manual changes after forward-merge
  • DH-17289: Put back testSerial and testParallel in main jdk8 build
  • DH-17188: CART does not detect reconnection if source is empty
  • Version log typos.
  • DH-17212: Remove PULL_CNF from jenkins menu
  • DH-17172: Controller Connection Memory Leak
  • DH-17221: AggDistinct previous values error
  • DH-17172: Controller Connection Memory Leak
  • DH-17162: Use dh-mirror for internal VM images
  • DH-14625: add release note
  • DH-14625: create an optional lenient IOJobImpl to avoid write queue overflow (improvement)
  • DH-14625: create an optional lenient IOJobImpl to avoid write queue overflow
  • DH-16604: Add controller memory stats to performance metrics
  • DH-16737: Reconnect deephaven.ui widgets upon PQ restart
  • DH-16738: Report errors in deephaven.ui correctly to user
  • DH-17311: Update Core+ to 0.33.5
  • DH-17318: Update VPlus gen loggers test for env
  • DH-17314: QA Documentation only update path corrections
  • DH-16987: Prefer etcd config files from /etc/sysconfig/deephaven/etcd/client over etcd tar
  • DH-17081: Fix Pandas widgets from Core+ workers in dashboards
  • DH-17299: allow configuration server to start without routing file
  • DH-17170: Qa results - move testEvalStats to a 2col, 3 row table
  • DH-16394: Fix Query Summary Out of Sync
  • DH-16172: Show Engine, Community Port, and k8s information in Safe Mode
  • DH-17168: qa-results refactoring of audit metrics
  • DH-16504: ConstructSnapshot and PUT do not consistently handle Instant
  • DH-17154: QA meta results refactoring
  • DH-17281: Fix Padding for Dashboard Shortcut Titles
  • DH-16854: If Login Cancelled after Auth then Log Out
  • DH-17280: Make eggplant-api.sh properly update existing test case, fix installer tests
  • DH-16876: Fixed csv_import utility not respecting proper default or explicit SAFE flag
  • DH-16346: For View Only Query, hide the Save, Copy, and Delete Buttons
  • DH-17268: Correctly pad zeroes for JS datetime format
  • DH-17262: Better support for input and output cluster.cnf as separate files
  • DH-16747: Add eggplant gradle task and jenkinsfile
  • DH-16129: Update Instances of community language to Core+ in UI
  • DH-17264: Ensure cron on qa-results does not repeat unnecessary elements
  • DH-17098: Fix package-lock.json for Jupyter-Grid
  • DH-17236: Backport DH-16790 Controller test improvements
106DH-16435: Improve validation of configuration type and replica and spare usage
105DH-17112: Alternate launch option for Core+ workers
104DH-17125: Swing should not be able to exit PQs with more than 1 replicas or more than 0 spares
103DH-17215: Fix unselectable Failed and AcquiringWorker replicas
DH-16180: Show spinner while updating selected query details
102DH-17325: Update Core+ C++ Client Protobuf; add session manager example for connecting to existing PQ
101DH-17319: Exclude rules in shadow-jetcd must filter inner classes
100DH-17287, DH-17286: Support blink tables with RemoteTableBuilder, Support pre-subscription filtering
099DH-17249: Python Client Cannot Handle too Many Truststore Entries
098DH-17304: Implement campaign and rest of the etcd Election API client side
097DH-17307: check-deephaven-cluster integration tests fail without "python" on path
096DH-17306: NPE in Controller without ServerSelectionProvider
095DH-17305: Fix MySQL ACLs on Grizzly
094DH-17146: Fix fullStackTrace not displayed on replicas
093DH-17294: Fix npm start
092DH-17233: SimpleServerSelectionProvider prefers replicas on distinct hosts
091DH-17251: Fix Copyright on Web UI
090DH-17247: Add aria-label to Picker in Load Balancing Tab
089DH-17284: Fix improper delayed command queueing in the Controller
088DH-17282: Fix DerivedTableWriter partition selection
087DH-17259: Fix temporary Core+ Python venv dependency leak
086DH-17202: A controller taking over as leader calls observe in a tight loop for a while and then fails
085DH-17261, DH-17256, DH-17227: Fix RRAsignmentPolicy not logging reset row when clearing state, corrupting binary log file.
084DH-17263: ECC Key Problems
083DH-16610: Add user facing cols to Core+ DB partitioned tables
082DH-17265: Fix Replica and spare APIs for web QueryInfo
081DH-17133: Constrain useWebClientDataHandleConfigChange to WebClientdata events
080DH-17141: Remove expired tokens in authentication server
079DH-17133: PQ Status display recover from controller failover
078DH-16050: Fix pending session not cleaning up on dashboard close
077DH-17181: Generate separate certificate for controller
076DH-17214: Controller should eagerly swap non-ready replicas for running spares
075DH-17203: Cluster Validation Client Script
074Merge updates from 1.20231218.344
  • DH-17243: Script for rebumping changelog.
  • DH-17162: Use dh-mirror for internal VM images
  • DH-17219: Fix how installer handles comments in cluster.cnf.
  • DH-17240: Reduce cluster.cnf parser warnings
  • DH-16766: Capture segmented results from nightly tests
  • DH-17229: Fix inaccuracies in filtering test cases
  • DH-17063: quiet dhconfig output when configuration server is down (logging npe)
073DH-17117, DH-17182: Complete AssignmentPolicy API / UI
072DH-17162: Use dh-mirror for internal VM images
071Revert DH-17141: Remove expired tokens in authentication server
070DH-17166: Support ECDSA Challenge Response Authentication
069DH-17141: Remove expired tokens in authentication server
068DH-17231: Improve ElectionCandidate etcd auth bug workarounds
067DH-17194: PQ Status for Replica Parent Row
066DH-17223: Fix Web UI not building correctly
065DH-17095: Cannot edit input table cells if key columns not first
064DH-17222: Cannot restart failed replicas
063Merge updates from 1.20231218.337
  • DH-16425: Vplus Feb-June 2024 test case updates for QA
  • DH-17138: fix pseudo subscription errors
  • Changelog typo corrections.
  • DH-17211: Fix erroneous Core+ hist part table data discovery
  • DH-17197: Fix failing DeploymentGeneratorTest
  • DH-17076: Update Web to 0.78.1, fix LayoutHint groups on TreeTables
062DH-17217: Controller Timeout Configuration Unit Mismatch
061Changelog correction.
060DH-17179: Add Deadline for Worker Restoration on Controller Failover
059DH-17197: Fix broken DeploymentGeneratorTest
058DH-17167: ACL Editor - recover from WebClientData restart
057DH-17129: Upgraded React Spectrum to 3.35.1
DH-17129: Upgraded DHC packages to 0.81.2
056DH-17175: Update Core+ to 0.34.3
055Fix merge conflict.
054Merge updates from 1.20231218.332
  • DH-17145: Remove unused CUS and RTA installer roles and stop tracking ROLE_COUNT
053DH-17193: DeephavenNameResolver should not block the gRPC Threads
DH-17139: Improve state restoration logging and parallelism.
052DH-17189: Fix Query Monitor not loading
051DH-17125, DH-17126: Prevent swing from editing PQs with replicas, make RoundRobin smarter about rebalancing
050DH-17178: Improve Core+ user table API return values
049DH-15582: UI For Selecting Python Packages and VEnvs on Community Workers
048DH-17040: Update RadioGroup and ButtonGroup references
047DH-17173: Minikube Install Test for Cert Manager
046DH-17151: Handle upgrades from versions before nfs reduction
045DH-17135: Controller needs to be more proactive checking worker states on startup
044DH-17133: Refetch query config table on WebClientData restart
043DH-16828: Remove extra kubernetes-only worker-overhead clause from iris-defaults.prop
042DH-17116: Upgrade node to lts 20.13.1 and npm to 10.5.2
041Merge updates from 1.20231218.330
  • DH-17157: Code Studio cannot set Kubernetes Container Image
  • DH-17160: Auth server must set authenticatedContext after successful external auth
040DH-17136: Use container-specific JVM args on Kubernetes
039DH-17150: Kubernetes iris-environment.prop Broken in Forward Merge
038DH-17142: Fix NPE in QueryInfo.close()
037Merge updates from 1.20231218.328
  • DH-17137: Authentication Server not cleaning up all client state when client sessions expire
  • DH-17131: Dependencies must be built to Java 8 API, not just bytecode
  • DH-17104: Ensure worker overhead properties are applied by default for kubernetes
036DH-17119: Improve error handling in Controller during worker startup and recovery
035DH-16963: Use npm ci for stable builds
034DH-17130: Make republishing include jdk8 source repo for fishlib dependencies
033Merge updates from 1.20231218.325
  • DH-17077: Make DeploymentGeneratorTest pass again
  • DH-17118: Improve cluster.cnf parsing logic
032DH-17127: Remove core proc default resources limits on Kubernetes
031DH-16548, DH-17084: Add Legacy to Core+ input table converter
030DH-17132: Add another case of exceptions where ElectionCandidate needs to retry
029DH-17130: Make republishing target jdk11 now that jdk8 is dropped
028DH-17118: Improve cluster.cnf parsing logic
Merge updates from 1.20231218.323 and 1.20230511.505
DH-16829: Update worker overhead properties
DH-17072: Do not write temporary-state DH_ properties to cluster.cnf
DH-17026: Publish EngineTestUtils (backport of DH-15687)
DH-17058: Make pull_cnf disregard java version
DH-16884: Add configuration for default user table format to be parquet
DH-17048: Fix controller crash and shutdown issues
DH-17014: Make cluster.cnf preserve original source and prefer environment variable
DH-17118: Improve cluster.cnf parsing logic
DH-17066: Apply Kubernetes Control to Legacy workers in ConsoleCreator
027DH-16606: Use JIT compiler limit for PQs by default
026Changelog and release note updates.
025Merge updates from 1.20231218.322
  • DH-17066: Apply Kubernetes Control to Legacy workers in ConsoleCreator
  • DH-17113: Fix permissions on test support files
024Merge updates from 1.20231218.320
  • DH-17070: AuthenticateByPublicKey misses state when different servers are involved
  • DH-17101: Update protobuf gradle plugin
023DH-16436: PQ Replicas UI
022DH-16436: Fix for uninitialized replicas.
021DH-17099: Default Web Landing Redirect to Web UI
020DH-17054: Add Assignment Policy setter to QueryInfo
019DH-17091: Integrate SAML into Core Product
018Merge updates from 1.20231218.318
  • DH-16557: Fixed DHC CSV Import not working with gzip files
  • DH-16955: Fix rollup rows and moved columns hydration
  • DH-16983: Test Automation - push git scripts to all controller nodes
  • DH-17098: Update package-lock in jupyter-grid
  • DH-17087: Minor test system documentation update for V+
  • DH-17086: Fix Test Automation README
017DH-17096: remove usage of <ButtonOld> deprecated component
016DH-17097: Fix docker build with pip cache disabled
015DH-16436: Webapi changes for Replica frontend
014DH-17088: Split deephaven_base, Add Argument to Allow apt/pip Caching in Minikube
013DH-16977: dhconfig pq should work as a "regular" user
012Merge updates from 1.20231218.312
  • DH-17074: Controller Tool Status Should Use a Static Table
  • DH-14265: Make new PR check cancel any still-running PR check
  • DH-16255: Fix incorrect log message in python setup script
  • DH-17063: quiet dhconfig output when configuration server is down
011DH-17078: PersistentQueryStateLog old format listeners do not handle ControllerHost
010DH-16915: Migrate filesystem data from /etc/sysconfig to Kubernetes objects
009DH-16143: Update GWT-RPC to avoid websocket reuse bug
008DH-17012: dhconfig dis list should have an option to show the claimed namespaces
007DH-17027, DH-16789: Fix PQs displaying error on clean shutdown and controller chattiness on disconnected clients
006DH-17062: validate dhconfig actions against the selected data type instead of globally
005Fix merge issue in 004.
004Merge updates from 1.20231218.308
  • DH-17057: add support for remote DataImportServiceConfig.asTableDataServiceConfig
  • DH-17049: Allow disabling password authentication
  • Update web version 0.78.0
  • DH-17051: Fix partition selector not showing more than 1000 options
  • DH-17052: Do not show "Delete Selected Rows" for input tables without columns
003DH-17064: Update Grizzly to Core 0.34.2
002DH-17060: Fix internal VM image deployment task
001Initial release creation

Detailed Release Candidate Version Log: Deephaven v1.20231219beta

PatchDetails
161Apply spotless
160Merge updates from 1.20230511.503
  • DH-17047: Fix up merge from 20221001.356
  • DH-17045: Address Test Merge issue in Vermilion
  • DH-17011: Forward Merge of promoted tests from Jackson and promotions in Vermilion
  • Backport DH-16948: Always use same grpcio version for Core+ Python proto stub building
  • Merge updates from 1.20221001.356
  • DH-17031: Minor corrections and formatting for QA automation How-to
  • DH-16936: make recreating schemas watch more efficient
  • DH-16717: Add heap usage logging to web api, TDCP, DIS, LAS, controller, and configuration server
  • DH-17004: change closeAndDeleteCentral to clean up tdcp subscriptions
  • DH-17000: Correct improper test promotion in Jackson
  • DH-16888: Preserve original cluster.cnf when regenerating cluster.cnf w/ defaults
  • DH-16599: Bard Mar 2024 test case updates for qa
  • DH-16986: Update for flaky results from merge test starting at Bard
  • DH-16887: Fix test for DH-11284 starting at Bard
  • DH-16797: Change git location on QA testing systems
  • DH-16996: Forward merge of tests fixed in Bard to Jackson
  • DH-16992: Promoting Jackson level tests to RELEASED
  • DH-16979: Fix for CSV tests Jackson and later
  • DH-16663: remove cached data when there are no active subscriptions
  • DH-16934: Fix permissions check for writing workspace data
  • DH-16908: Fix dry run in iris_keygen.sh
  • DH-16851: Improve qa results setup docs
  • DH-16826: Select/Deselect All for OneClick Lists in Export dialog (swing)
  • DH-15247: Set DH_ETCD_IMPORT_NODE default value to the first config server
  • DH-16675: Account for worker overhead in dispatcher memory utilization
  • DH-16702: Vermilion April2024 test case updates for qa
  • DH-16958: Backport DH-16868 - Check if shadow package already added before adding again
  • DH-16875: Fix CSV import tests
  • DH-16873: Update and correct "testScript" section of automated QA tests
  • DH-16716: Parameterized logging broken in vermilion
  • DH-16847: Update and correct Dnd testing scripts
  • DH-16836: Fix forward merge anomaly
  • DH-16813: QA testing git update to Jackson
  • DH-16818: QA Testing System file relocation and documentation updates
  • DH-16072: Jackson Dec2023 test case updates for qa
  • DH-16480: Documentation and support for QA_Results system build
  • DH-16794: better handle export of nonexistent routing file
  • DH-16762: Fix C# docfx task (need to pin older version)
  • DH-16584: Make internal installer use correct sudo when invoking iris_db_user_mod
  • DH-16586: Improve qa cluster cleanup script
  • DH-16640: fixes for tests failing on bard and later revisions
  • DH-16708: Improve import script on qa results
  • DH-16698: Update BHS images to fix a broken rhel8 test
  • DH-16752: Fix installer tests getting null clustername
  • DH-16605: Use grep before sudo sed to avoid root when migrating monit
  • DH-16406: Improve jackson nightly installer test debuggability
  • DH-16718: Fix test cases based on CommonTestFunctions refactor
  • DH-16706: ColumnsToRowsTransform.columnsToRows fillChunk does not set output size
  • DH-16700: Ensure QA results setup is maintainable
  • DH-16750: Fix temporary and auto-delete scheduling checks
  • DH-16542: CUS should trim client_update_service.host - fix for Envoy
  • DH-15013: Fix upload CSV from Web UI for timestamp columns
159Merge updates from 1.20231218.305
  • Java 8 build fix and changelog fixes.
  • DH-17035: Ensure BUILD_URL from jenkins is populated in Test Automation results
  • DH-17032: Deep linking can cause the wrong dashboard to open after logout
  • DH-17042: Forward-merge Test Automation
158DH-16912: Handle overflow in timeout calculation.
157DH-16912: Add Option to Request/Limit Additional Memory
156DH-11204: remove db_ltds from default installation
155Merge updates from 1.20231218.301
  • DH-17033: Combine JS API table ops on login to improve speed
154DH-17046: Controller failover/failback issues
153Merge updates from 1.20231218.300
  • DH-16978: Additional fixes for multiple auth servers
  • DH-17029: handle removed locations in the LTDS
  • DH-16866: Improve Test Automation to target cluster
  • DH-17023: Added "target version" parameter to update-dh-packages script
152DH-17013: ReplayDatabase and notebook imports, other cleanup.
151DH-17025: Grizzly with Cert Manager causes Invalid Hostname for Verification
150DH-10086: Update GWT to 2.9.0
149Merge updates from 1.20231218.296
  • DH-17017: Skip staging tests on Feature Branch runs in jenkins
  • DH-16143: Update GWT-RPC to avoid websocket reuse bug
  • DH-16642: Web UI should allow a second QM
  • DH-16164: dhconfig schema import -d does not handle symlinks properly
  • DH-16933, DH-16778: Fix dashboard export saving extra dashboards and queries
148DH-17019: remove unused kv-tool-cli
147DH-17009: Pass Schema through Core+ tables
146DH-15831: Change default buffer allocation to non-direct in tailer
145DH-16869: Make installer properly filter the list of machines running controllers
144Merge updates from 1.20231218.292
  • DH-16995: Plotly express does not work in Deephaven UI
  • DH-16997: Make internal installer detach install scripts from java process group to avoid getting killed on failures
  • DH-16950: Prevent ChunkerCompleter.resolveScopeType from getting into an infinite recursive loop and crashing
143DH-16994: Internal Mac Docker Kubernetes Instructions
DH-16998: minikube LAS is broken
142DH-16661: Update Maystreet integration
141Merge updates from 1.20231218.289
  • DH-16988: Ensure nightly test VM names are unique, and other test stability improvements
140Merge updates from 1.20231218.288
  • DH-16658: Hive layouts should return an empty table if the table base location does not exist
  • DH-16941: MergeParametersBuilder should have a default value for threadPoolSize
  • DH-16926: Fix test generation error on multi-PQs
  • DH-16914: Update DHC packages to ^0.77.0
  • DH-16914: ACL Editor crashes with error: No item found matching key: 0
139DH-16972: Add Minikube Installation Test
138DH-16961: Use ACL Store instead of dsakeys.txt
137DH-16790: Make PersistentQueryControllerTest more reliable
136DH-16981: Non-displayable queries shown in dhconfig pq status
DH-16982: Controller host should be in PersistentQueryStateLog
135Compile fix.
134Merge updates from 1.20231218.284
  • DH-16976: Fix java 11 compile from 283
133DH-16977: dhconfig pq should work as a "regular" user
132Spotless fix.
131Merge updates from 1.20231218.283
  • DH-16976: Fixed Core+ out of bounds errors when trying to unbounded fill
  • DH-16916: Pin Spectrum Dependencies for @adobe/react-spectrum 3.33.1
  • DH-16916: ACL Editor: unable to scroll the Namespace dropdown
  • DH-16978: Multiple auth server private-key validation failures
  • DH-16970: Ensure EXCLUDE filter in Test Automation is honoured on kafka
130DH-16978: Multiple auth server private-key validation failures
129DH-16975: Controller client determines wrong controller
128DH-16974: dhconfig pq is too noisy
127Merge updates from 1.20231218.279
  • DH-16969: Allow RemoteTableBuilder to work with clusters behind envoy
  • DH-16971: Make internal installer clear failed systemd units so systemctl is-system-running works
  • DH-16907: Allow test automation with no FeedOS schemas
  • DH-16913, DH-16962: Make all nightly tests pass, and run stably
  • DH-16965: correct error message when LAS is not available
  • DH-16544: Bug fixes for dh_helm
  • DH-16890, DH-16779: fix java version on nightly tests, use internal java repositories
126DH-16967: Fix wrong parameter in groovy session wrapper
125DH-16946: Remove Deprecated In-Worker Service PQ
124DH-16545: Add Python wrapper for Core+ input tables
123DH-16909: Update Core to 0.34.1
122DH-16919: Fix missing file that resulted in CUS crash.
121DH-16011: Correct tests for ACL tool, allow --ignore-existing for public keys.
120DH-16959: Reduce bcryptcost in etcd configuration
119DH-16954: Use /usr/illumon/coreplus instead of /usr/illumon/dnd
118Merge updates from 1.20231218.272
  • DH-16953: Put the version back into rpm package names
  • DH-16921: Fix DashboardOverride rewriting without changes
  • Update Web Version 0.76.0
  • DH-16924: Fix laggy notebooks in Web UI
  • DH-14825: Java 8 compilation fix.
  • DH-16948: Always use same grpcio version for Core+ Python proto stub building
  • DH-16944: Add Cross Cluster test to Grizzly QA
  • DH-16925: Snapshot locations break multi-level pages for parquet regions
117DH-16011: Implement Acl tool within dhconfig
116DH-16919: Remove /db/TempFiles NFS Volume
115DH-16547: Add Core+ input table snapshotter
114Merge updates from 1.20231218.265
  • DH-16910: Adjust Kubernetes heap overhead parameters
  • DH-16923: make claims filter consistently accept user tables
  • DH-15984: better handle export of nonexistent routing file
  • DH-16907: Update FeedOS schemas for ticking source
  • DH-14825: CUS should ensure served files are accessed
  • DH-15824: Fix cluster.cnf backup commands
  • DH-16862: Core+ does not properly convert between Legacy and Core NULL_CHAR
  • DH-16883: Upgrade should import the new status-dashboard-defaults.json file
  • DH-16889: Fixed an NPE in ungroup with nulls in array native array columns
  • DH-16898: Fix configuration for high cpu tests
  • DH-16890: Fix imports in AbstractDeploymentTest.groovy
  • DH-16811: Config to support nightlies in test automation
  • DH-16868: Check if shadow package already added before adding again
  • DH-16189: Update deephaven.ui and plotly plugins
  • DH-16189: Fix re-hydration of deephaven.ui plugins in dashboards
  • plotly-express v0.7.0: https://github.com/deephaven/deephaven-plugins/releases/tag/plotly-express-v0.7.0
  • deephaven.ui v0.13.1: https://github.com/deephaven/deephaven-plugins/compare/ui-v0.8.0...ui-v0.13.1
  • DH-16720: Support deephaven.ui dashboards from PQs
113DH-16930: Fix issues with timeout handling in ResolverReader, DispatcherClient and ElectionCandidateImpl
112DH-16799: Allow customer jars and plugins to be built into images for K8s deployments
111DH-16837: Cleanup classpaths.gradle
110DH-16871: Preexisting PVCs Do not Work with pre-release Hook
109DH-16861: CommunityPersistentQueryHandle can leak channels when machine under load
108DH-16856: Package worker client directory for query servers
DH-16857: Suppress grpc exception in configuration server
DH-16860: Fix phrasing in example envoy yaml files
107DH-16837: Port fishlib IO to iris repo
106DH-16858: Delete etcd_prop_file in k8s installation.
105DH-16858: Delete etcd_prop_file
104DH-16855: Enforce DH-16852 limits of 1GB (admin) and 2GB (all users).
103DH-16824: Add some Parallelism to Core+ Startup and other Improvements
102Merge updates from 1.20231218.250
  • DH-16852: Do not permit scheduling a worker with a heap more than available memory.
  • DH-14975: Make DH_JAVA the ultimate source of truth for "where to find java"
  • DH-15824: Backup previous /etc/sysconfig/deephaven/cluster.cnf whenever upgrading
  • DH-16420: More versatile configuration of status dashboard query monitoring
  • DH-16850: Fix Kubernetes installation issues
101DH-15578: Build Python wheel with Core+ Worker Extension
100Merge updates from 1.20231218.247
  • DH-16832: Test built in Community Code should not be run in Java8
  • DH-16842: Use Parameter instead of QueryTracker Config in Dispatcher Usage Update
  • DH-16821: Pull qa-results improvements forward to vplus
099DH-16843: Improve Core+ user table int tests and checkInputTableKind
098Fix merge error.
097Merge updates from 1.20231218.244
  • DH-16838: DELETEs handled incorrectly in Presence KV Monitor
  • DH-16822: ReplicatedTable doesn't handle all possible long backed time sources
  • DH-16544: dh_helm fixes and enhancements
  • DH-16835: Expose WorkerHandle through PersistentQueryHandle as well as connections.
  • DH-16224: Refresh ACL data when switching to Import, Merge, or Validate tabs
  • DH-16823: Controller client should not print scary error on graceful shutdowns
  • DH-16773: Web version bump to v0.72.0
096DH-16820: Move dispatcher presence monitoring from job assignment to job scheduling
095DH-16819: Update jpy to 0.15.0
094DH-16118: Make new dhconfig pq less noisy
093Merge updates from 1.20231218.237
  • DH-16816: Failure to Cancel PQ Can Result in Controller Crash
  • DH-16783: Fix ChartBuilder in Web UI
  • DH-16804: Update deephaven.io version log generation script.
  • DH-14610: Use domain names to send files to etcd server machines
  • DH-15749: Etcd server IP address should be configurable to support multiple network interfaces
  • DH-14859: Never leave world-readable etcd config tars on disk
  • DH-16776: Fix errors when sorting symbol tables with mixed nulls
  • DH-16791: SystemTableLogger Checker is Timing Out
092DH-16801: Reduce Number of etcd Client Creations
091DH-16810: Remove Legacy C# Open API Client
090DH-16792: Properties changes in k8s to fix swing
089DH-14125: Fix bug with late publishing of details.
088Merge updates from 1.20231218.231
  • DH-16787: PresenceWatcher is started under lock
  • DH-16693: Run core+ integration tests during nightly installer testing
  • DH-16767: Core+ exec_notebook broken in .213
087DH-16785: Allow the controller to register its address in resolver as a host name instead of just as IP
086Spotless
085Fix javadoc compilation.
084DH-16118: Migrated PersistentQueryControllerTool into dhconfig
083DH-16780: Print Status Details from Python Core+ Client
082DH-16770: Drop Python 3.8 from Grizzly
081DH-16772: Combine PR title and Changelog Check
080DH-16774: Revert spotless upgrade until we support gradle 8.X
079DH-14125: Detailed worker startup logging on k8s
078DH-16771: Support community-only PQ configuration types
077DH-16769: Spotless should remove unused imports.
076DH-16765: Add User Assignment Table to WebClientData
075Merge updates from 1.20231218.228
  • DH-16633: Rebuild VM images with etcd 3.5.12 instead of 3.5.5
074DH-16152: Replay Query for Core+
073DH-16633: Rebuild VM images with etcd 3.5.12 instead of 3.5.5
072Merge updates from 1.20231218.227
  • DH-16805: Fix C# docfx task (need to pin older version)
  • DH-16655: Make internal installer replace certs that expire in 2 months or less
071DH-16756: Python Method for Generating and Uploading Keys to Deephaven Server
070DH-16742: Update ci images to rocky9
069DH-16317: Installer jars should all target jdk8
068Merge updates from 1.20231218.225
  • DH-16605: Use grep before sudo sed to avoid root when migrating monit
  • DH-16721: Core+ Python Client Should Reauthenticate to Controller
  • DH-16740: Share JS API cache between deferred loader and app
  • DH-15994: Fixed Core+ DictionaryRegionHelper incorrectly accounting null values
  • DH-16689: Core+ worker cannot read direct DbArray Columns
  • DH-14774: correct syntax error in update_workspace.py, update installer version
  • DH-16731: Republish coreplus java jars, and always use jdk11 for republishing
  • DH-16719: SAML Login From Core+ Python Client.
  • DH-16695: Support io.StringIO as a private key in Core+ Python Client.
067DH-16743: Better parallelization of gradle build
066DH-16730: Ensure override authority works for ManagedChannelFactory via resolver
065fix previous merge
064Merge updates from 1.20231218.217
  • DH-16729: unbox primitive types even when specified as java.lang.Type in schema
  • DH-16728: correct error message diagnosing invalid listener
  • Release note fixes.
  • Fix Unit test failure due to expanded assertTableEquals checks
  • Fix forward merge conflict in Core+.
  • DH-16469: Bard Feb 2024 test case updates for qa
  • DH-16569: Backport DH-15882 to fix Pandas data frame view bug
  • DH-16149: Improve npm build caching in CI
  • DH-11512: handle '*-OLD' directories better
  • DH-16672: EmptyToNullStringRegionedColumnSource bypasses index narrowing in grouping
  • DH-16623: Unit test fix from .321
  • DH-16623: Index and GroupingBuilder .hasGrouping() should only look at locations relevant to the desired index
  • DH-16624: ShiftedColumns Interacts with Time Parsing
  • DH-16628: whereIn/whereNotIn with Empty Set Tables can Fail
  • DH-16597: check for routing to export before opening output file
  • DH-16591: Fix reading Parquet files with Mixed dictionaries and Offset Indices
  • DH-16443: Add sudo -u DH_MONIT_USER for installer when checking if monitrc needs migration
  • DH-16408: Do not use yum on systems with dnf
  • DH-15523: Allow config_packager to run as irisadmin when irisadmin is monit user
  • DH-14156: improve merge query and dhctl feedback when tailer ports are disabled
  • DH-14169: Fix message when purge fails
  • DH-16363: Remove kubectl from VM base images
  • DH-16442: Make ubuntu monit de-rooting use DH_MONIT_USER instead of DH_ADMIN_USER
  • DH-16113: Bard Jan 2024 test case updates for qa
  • DH-16451: upgrade npm to latest lts/fermium version
  • DH-16450: avoid a deadlock due to lock inversion
  • DH-16053: correct minor errors in DataImportChannel
  • DH-16367: Make INTERNAL_PKI=true work correctly on mac
  • DH-16354: Make INTERNAL_PKI=true cert expiry limits configurable
  • DH-15467: Change superfluous gitlab url into github url
  • DH-16107: NPE in whereIn Error Handling
  • DH-16347: add synchronization to getGroup... methods in AbstractDeferredGroupingColumnSource
  • DH-16499: improve feedback in 'dhconfig routing export' when there is no routing file in etcd
  • DH-15729: Allow resources to be skipped in Test Automation
  • DH-16443: Make ubuntu de-rooting grep on monitrc before trying to sed the file
  • DH-16468: Vermilion Feb 2024 test case updates for qa
  • DH-16669: Schema with functionPartitionInput=true generates a broken logger
  • DH-16622: Address inconsistencies in automated tests for DDR
  • DH-16632: Updated controller_tool tests support file locations and stability for vermilion and following
  • DH-16592: Find healthy etcd node for etcd snapshot
  • DH-16534: Importing Jackson ACLs to Vermilion or later fails because SystemACLs are not recognized
  • DH-16612: Avro Kafka Ingestor error with extra consumer fields
  • DH-16580: Bad Origin Causes NPE in Auth Server
  • DH-15070: Make proto re-builds check for "use shadow package" before altering source
  • DH-16542: CUS should trim client_update_service.host
  • DH-16678: Add vermin check to Core+.
  • DH-16705: Add meta import machinery for controller.
  • DH-16709: Provide Mechanism to Refresh Controller Scripts without Git Configured
  • DH-16710: git repository state is incorrectly serialized
063DH-16435: Back end support for PQ replicas and spares.
062DH-16496: make TDCP (and other table data services in general) responsive to data routing changes
061DH-16357: Cache ACLs and add watcher.
060Merge updates from 1.20231218.212
  • DH-16703: Update Vermilion+ to 0.33.3
  • DH-16687: Add etcd ACL encoding tool
059Merge updates from 1.20231218.210
  • DH-16670: FeedOS test support from Bard to VPlus
  • DH-16668: Refactor controller_tool tests to wait for logging to be done.
  • DH-16686: Update Vermilion+ to 0.33.2
  • DH-16634: Fix dashboards migration issue
  • DH-16626: Support deephaven.ui dashboards from a code studio
058DH-16683: Port fishlib base stats to iris
057DH-16343: Utilities for reliable creation of live derived data
DH-16651: improvements to the Core+ Python Controller client
056DH-16684: Fix swing trying to resolve using etcd
055DH-16674: Remove fishlib hash dependency
054DH-16531: make DIS filtering responsive to routing changes
053DH-16673: Do manual auth retries in ElectionCandidateImpl
052Merge updates from 1.20231218.205
  • DH-16621: Expose available query objects as a table to users
  • DH-16664: Fix Core+ cpp-client dockerized build after incompatible changes on DHC 0.33
  • DH-16656: Fix listener reachability in TableMapTest, added integration test for DH16656
051DH-16268: Port Net and Stats from fishlib and eliminate AnomalyClient dependency
050Merge updates from 1.20231218.202
  • DH-16656: ResolveTools sets empty columns on snapshot
  • DH-16659: tailer handles data routing impl that does not support change notification
  • DH-16652: Update automated tests on controller_tool for VPlus
  • DH-16644: Update copyright year in web launcher page
  • DH-16637: Fixed Core+ .toUri() stat'ng directories during discovery
049DH-16641: Configuration server error due to missing controller.port property on k8s
048DH-16587: Converge install and upgrade hook scripts in helm/Kubernetes
047DH-16643: Update Grizzly to Core 0.33.2
046DH-16629: Replace jetcd campaign workarounds with fix in jetcd implementation
045DH-16639: Use Resolver to create connection.json in Internal Worker Python Client
044Merge updates from 1.20231218.197
  • DH-16462: Add profile JIT CPU options
  • DH-16582: upgrade etcd from 3.5.5 to 3.5.12
  • DH-16616: Fix Safe Mode in Web UI
  • DH-16617: Fix line plots in Web UI
  • DH-16589: automated validation test for import driven kafka lastBy DIS
  • DH-16593: Fixed Legacy CART trying to reconnect even after good data was received.
  • DH-16189: Enable deephaven.ui widgets from PQs
  • DH-16575: Core+ Python Client Wheel Should be Usable in Worker VEnv
  • DH-16530: Loosen Core+ Client Version Requirements
  • DH-16492: Fix javadoc
  • DH-16564: Package jupyter logged_session in iris repo
  • DH-16500: Update deephaven-plotly-express plugin to 0.5.0, update Web UI to v0.67.0
043DH-16577: ConfigurationUtils.getGradleVersion() returns "undetermined" on Core+
042DH-16615: Ensure dh-resolver changes get propagated via config server xDS
041DH-16619: Installer fails with missing controller property
040DH-16607: Installer to support more than one controller host
039DH-16434: Use auto-server-selection by default
DH-16526: Use better auto-group dispatcher heap default if none is provided
038DH-16431: Remove Java 8 from Grizzly build.
037DH-16603: Force dh-resolver registrations to use IP address literals, avoid depending on another round of DNS
036Merge updates from 1.20231218.185
  • Jdk8 Compilation Fix.
  • DH-16290: correct initial install condition
  • DH-16290: 'dhconfig routing' validate and import must consider existing extra dises
  • DH-15984: improve 'dhconfig routing export' feedback when there is no routing file
  • DH-16492: create local cached DataRoutingService, use it in the tailer
  • DH-16249: Use correct API for widgets
  • DH-16364: If etcd is setup but not working correctly, fail the install instead of generating a new etcd cluster
  • DH-16579: URL encode groups in removeMembership for MySQL ACLs
  • DH-16537: Fix partition_by failing to render the table
  • DH-15918: correct unit test
  • DH-15918: tailer restarts on routing change
  • DH-16148: create listener framework for data routing service
  • DH-16144: add writers group for data routing service writers
  • DH-16543: Add missing WorkspaceData data types, and all-types file, to backup_deephaven script
  • DH-16554: Update Web UI to v0.66.1
  • DH-16554: Upgrade React Spectrum to ^3.34.1
  • DH-16554: Removed some ACL Editor css classes
  • DH-16383: Remove all passwords from logs, automated test of no passwords in logs
  • DH-16483: Fix WindowCheck entry combination bug.
  • DH-16370: Update Core to version 0.33.1
  • DH-16533: Fix dispatcher error response failure conditions
  • DH-16551: Link to Enterprise Javadoc from Core+ Javadoc
  • DH-16563: dhconfig dis add should mention --force when the dis already exists
  • DH-16556: allow export of core dises
035DH-16376: Store Etcd client creds in secrets and mount only in relevant pods in k8s
034DH-16590: Fix java 11 compilation error in ElectionCandidateImpl
033DH-16433: Controller Leader Election
032DH-16574: Prevent a dependency from configuration server to start on controller having started
031DH-16572: Provide a way to override the default dh-resolver via properties
030DH-16560: Support controller clients resolving without etcd
029DH-14838: Add Core+ input tables
028DH-16558: Fix ResolverReader javadoc
027Merge updates from 1.20231218.165
  • DH-16535: Fix Persistence KVs not being cleaned up properly
  • DH-16220: Add DBNameValidator to namespace and tablename ACL inputs fields
  • DH-16529: SBOM coreplus artifact shouldn't use dnd in its name
  • DH-16524: Fix WorkerKind JSON generation from controller request
  • DH-16337: Update delete intraday data label to match swing
  • DH-16483: Fix javadoc build failure
  • DH-15771: Fixes for dh_helm script
  • DH-16508: Integration test update rocky compatibility
  • DH-16493: Make core+ builds leverage gradle task caching
  • DH-16483: Reduce WindowCheck memory usage. #1398
  • DH-16417: Make manifest.json visible in k8s environments
  • DH-16282: Fixed CI build to fail on jest / junit errors not just failures
  • DH-16282: ACL Editor - Table ACLs error when clicking "Update ACL" that will become "Add ACL"
  • DH-16494: Fix Swing ACL Editor requesting ACLs for null NS or TN
  • Release note updates.
  • DH-16479: Integration tests added for core+ kafka transformations
  • DH-16489: Integration test for Python Core+ table groups.
  • DH-16488: Update Core to 0.32.1
  • DH-16489: Core+ Python ACL Transformer not unwrapping Tables
  • DH-16438: Add time to installer dependencies / rocky VM images
026DH-16558: Make ResolverReader more tolerant of registration races
025DH-16473: Implement Etcd based gRPC resolver and switch the controller service to it
024DH-16511: Use round_robin load balancer policy for authentication server
023DH-16510: Use round_robin load balancer policy for etcd clients.
022Merge updates from 1.20231218.146
  • DH-16475: Integration test fixes
  • DH-16015: Vermilion Dec 2023 test case updates for qa
  • DH-15598: Additional schema validation fixes
  • DH-15598: Add merge validate pqs for new tables
  • DH-16387: Fix R setup in test automation from forward-merge
  • DH-16275: Fix test automation anomalies
  • DH-16418: Fix DiskBackedDeferredGroupingProvider changing post-mutator "No groups found" to "No grouping exists"
  • DH-16382: Perform monit migration using systemd override.conf
  • DH-16206: Remove duplicated gen-keys.sh script in jackson
  • DH-16401: Fix Groovy script defined classes imported with db.importClass() break internal formulas
  • DH-14283: DeephavenNullLoggerImpl should use dynamic pool
  • DH-16237: change user buffer caching to restore backpressure
  • DH-14938: Properly cache downloadDocFx task, to reduce build flakiness
  • DH-16291: Add tags to test with no data and address one breaking test for Bard
  • DH-16273: backport DH-14452 to fix logging error
  • DH-15740: Test certificate fingerprints so we always update certs when they change
  • DH-16262: Wrap calls from groovy to gsutil inside bash -ic
  • DH-16252: Update USNYSE Business Calendar to Include 2026
  • DH-16242: CART Leaks Connections when Snapshots are Slow, Exception can escape in refresh()
  • DH-16309: EmptyToNullStringRegionedColumnSource should copy and wrap underlying provider by default
  • DH-16309: Fixed loss of grouping when SourceTable.MAP_EMPTY_STRINGS_TO_NULL == true
  • DH-16300: Test Automation: have minorversion flow to results summary
  • DH-16279: Add MessageListener example implementation to SBEStandAlone jar
  • DH-16262: Wrap calls from groovy to gsutil inside bash -ic
  • DH-16415: Fix a race in GrpcLogging initialization.
  • DH-16041: Move installer tests to jdk17
  • DH-16205: Remove nightly core+ tests (vermilion only)
  • DH-16328: Add release notes for DH-11713
021Fix bad merge resolution.
020Merge updates from 1.20231218.144
  • DH-16472: NPE in PQWorkerServiceGrpcImpl
  • DH-16463: Update Web UI to v0.63.0
  • DH-16463: Fix false positives when detecting layout changes
  • DH-16471: Added shortcut for copy version info
  • DH-16460: fix poor contrast color of notice message in share modal
  • DH-16455: Fix Download CSV in Web UI
  • DH-16458: Fix Swing ACL Editor requesting ACLs for null NS and TN
  • DH-16452: Disable table name dropdown when * ns is selected
  • DH-14914: Test core+ auto install
  • DH-16127: Fix readme for dnd version
  • DH-16373: ACL write server should enforce system user limitations
  • DH-16446: Legacy Parquet does not interpret LocalDate stored as int in Parquet format
  • DH-16315: ACL Write Server Should Prohibit Namespace=* without Tablename=*
  • DH-16326: io.deephaven.kv.acl.AclJetcdProvider Needs to Escape Data
  • DH-16336: Consistent handling of whitespace typing / pasting
019DH-16430: Drop Centos7, Use Rocky9 as Default
018Merge updates from 1.20231218.130
  • DH-16411: Integration test had duplicated serial number.
  • DH-16440: Fix Kubernetes restartAll script errors
  • DH-16025: Legacy BarrageTableResolver should return a table
017DH-16380: Update Envoy to 1.29, gRPC to 1.61, jetcd to 0.7.7
016Merge updates from 1.20231218.127
  • DH-16437: Make rocky9 require rsync-3.2, same as RHEL 9
015Merge updates from 1.20231218.126
  • DH-16426: Update Web UI to v0.61.0
  • DH-16426: Allow themes to use any srgb color for definitions
014Merge updates from 1.20231218.124
  • DH-16413: Non-superusers should have access to WebClientData tables
  • DH-16416: UserGroupArrayFilterGenerator should escape groups
  • DH-16277: When using Rollup Rows, ungrouped columns become sorted alphabetically and should not
  • DH-16302: Fix Merge/Validate queries adding an extra field to the PQ
  • DH-16371: Fix PQ Start/Stop actions inconsistently enabled/disabled
  • DH-16385: Envoy Does not Have Cluster/Route for Multiple Auth Servers
  • DH-16411: Dispatcher crashes when invalid WorkerKind is requested
  • DH-15771: Create Kubernetes Deephaven install/uninstall/upgrade wrapper script
  • DH-16217: Update buildAllForK8s.sh to use coreplus instead of dnd
  • DH-15955: Official installer support for rocky8/9
  • DH-16327: Fix Java 8 incompatibility
  • DH-16327: Properly URL encode ACL requests
  • DH-16362: allow dises+routing for complete import of routing config
  • DH-16362: revert allow dises+routing for complete import of routing config
  • DH-16362: allow dises+routing for complete import of routing config
013Merge updates from 1.20231218.112
  • DH-16350: fix installer keygen script for controller and acl write server
  • DH-16321: Duplicated values in a cluster.cnf file should cause a validation error
  • DH-15803: Improve error messaging around partitioned user table location overlap
  • DH-16368: Add Support for remote clusters with RemoteTableBuilder
  • DH-16632: Ensure worker to controller notifications (eg table errors) are not lost if controller restarts
  • DH-16369: Make internal installer overwrite versions when using pull_cnf
  • DH-15934: Routing config change for RemoteTableAppender in k8s
  • DH-16288: Hide k8s-related fields in query monitor when not deployed in k8s
  • DH-16251: Allow Core+ workers to load calendars from disk
  • DH-16082: Don't show RunAndDone queries in the Panels menu
012DH-16050: No prompt to confirm when clicking the "X" to close a running Code Studio
011Merge updates from 1.20231218.102
  • DH-16355: Kafka Community Test Fails After .079
  • DH-16219: Disallow namespaces and table names containing spaces at ACL API endpoint
  • DH-16287: Web API Server Reconnections Preserve Code Studio
  • DH-16331: Make PULL_CNF work for jenkins and local vm deploys
  • Revert DH-16251: Allow Core+ workers to load calendars from disk
  • SH-15353: Add client IP address to audit log for authentication events in web_api_server
  • DH-16251: Allow Core+ workers to load calendars from disk
  • DH-16312: ACL Editor - url encoding
  • DH-16333: db.livePartitionedTable error message misspelled
  • DH-15415: Fix jdk8 javadoc task
  • DH-16269: Add support for Core+ queries in irisapi examples
  • DH-16312: ACL Editor - Close selectors on select
  • DH-16314: ACL Editor - Only allow * table name when * ns is selected
  • DH-16139: Add cert expiry times to status dashboard
  • DH-15415: Improve ACL exceptions
  • DH-15521: Add official installer support for ubuntu 22.04
  • DH-16324: Fix DbAclEditorTableQuery canedit logic
  • DH-16313: Fixed NPE on legacy metadata overflow file access
  • DH-16235: Fix QM Summary out of sync with the Queries Grid
  • DH-15665: Remove internal installer workarounds for jackson+rhel9
  • DH-15864: Fix undefined partitions in IrisGridPanel state
  • DH-16318: Make iris_keygen.sh avoid adding to truststore when --skip-* flags are used
  • DH-16305: Fixes to get Deephaven working with IAP in Kubernetes
010Merge updates from 1.20231218.080
  • DH-16121: ACL Editor - Action tooltips
  • DH-16296: MySQL publickey table fails on Jackson to Vermilion+ Upgrade
  • DH-16298: New Installations Should default to etcd ACLs
  • DH-16307: DH_DND_VERSIONS should write "auto" not automatically selected version to cluster.cnf
  • DH-16297: Jackson to Vermilion Upgrade does not Create Python 3.10 Virtual Environment
  • DH-16058: Add Memory Printing to Tailer
  • DH-16286: MultiViewBuilder Test Must not Depend on Static Inheritance
  • DH-16150: Add widget plugins to handle widgets in Web
  • DH-14646: improvements after testing
  • DH-16265: Make LocalMetadataIndexer methods public
  • DH-16278: Automation Should Detect "Stuck" PQ Tests
  • DH-16243: Configure high-cpu integration test box on j17 CI
  • DH-16128: Fix grouping propagation when inputs are filtered
  • DH-16130: Ensure blank line in changelog is handled consistently.
  • DH-16202: QA cluster maintenance script usability
  • DH-15913: Segment parquet tests to an isolated high-CPU box
  • DH-16200: Fix Automation/src/test/resources/testScript/engine/updateby directory duplicity
  • DH-16131: update DH revision name map for QA results analysis query
  • DH-16087: Add HTTP security headers to Envoy configuration
  • DH-16192: Always set DH_ETCD_IMPORT_NODE to a single machine
  • DH-16181: Fixed MapCodec ignoring offset and length params
  • DH-16176: Backport of DH-15469 (Use external SSH executable for git)
  • DH-15157: CART Error Propagation and Reconnect Counting Fixes
  • DH-16128: Fix grouping propagation when inputs are filtered
  • DH-15876: Add Test Automation support for configuring java tests
  • DH-15493: Enable version suffixes for DbInternal tables
  • DH-16114: Test Automation: revert bad test case that was released
  • DH-16055: Fix sed substitution when numbers and wildcards overlap in vm-tools README
  • DH-16103: Remove etcd passwords from log output
  • DH-16014: Test Automation: add test case updates for December
  • DH-16108: Test Automation: fix NPE on template lookup
  • DH-16090: Test Automation: pull back integration logs even on fatal condition
  • DH-16078: Test Automation: run locally via installer
  • DH-15875: Allow disabled tests to run in testAutomation - control by config only
  • DH-15653: Add tagging to Test Automation
  • DH-15157: CART skipping reconnection attempts
  • DH-16098: update to test analysis query to remove duplicate data and add MinorVersion field
  • DH-16096: Better check for anonymous mysql users before we attempt to fix them
  • DH-16039: Reenable rhel9 installer test
  • DH-16096: Fix nightly installer test mysql error (anonymous user problem)
  • DH-14113: Use irisrw instead of root when possible in dbacl_init.sh
  • DH-16074: update controller tool tests to sudo use consistent with client env
  • DH-15988: fix logging error
  • DH-15275: Add release-focused testcases to Jackson July-Dec 2023
  • DH-16061: update controller_tool test for null pointer message
  • DH-16234: Publish PQ details into session scope
  • DH-16247: Update Core+ to Core 0.32.0
  • DH-16280: ACL Editor - Reset tablename selection when namespace changes
  • DH-16281: ACL Editor - Input table ACLs should not have "Columns" column in table view
  • DH-16258: iris-querymanagers should still see special queries in web
  • Update Web UI to v0.59.0
  • DH-16225: Fix TimeInput not triggering onChange
  • DH-16267: Light theme
  • DH-16056: "Query Types" filter doesn't show when it has been modified
  • DH-15857: Handle async due to gRPC internal state after Controller client subscription shutdown
  • DH-16261: Extra DIS routing backups
  • DH-16274: Add DIS routing integration tests
  • DH-16264: Fix unthemed legacy worker plots
  • DH-16258, DH-16259: Frontends display non displayable config types for non-admin users.
  • DH-15794: Add status dashboard helm chart
  • DH-16157: Core+ Cart should maintain a reference and manage the lifecycle of the ManagedChannel
  • DH-16266: Add Javadoc for Protobufs
  • DH-16209: Add dedicated volume for git repo in k8s envs
  • DH-16189: Pass all session objects to the Web UI
  • DH-16248: etcd/admin_init.sh should retry user existence check
  • DH-16250: Add deephaven.ui 0.1.0 to Core+ Workers
  • DH-16023: Allow enable.auto.commit check on Boolean.
  • DH-16211: Fix Controller shutdown held up
  • DH-16155: Support ACLs for non-existing namespaces and table names
  • DH-16246: Make DbAclCorsFeature use standard cors props as backup
  • DH-16003: Make management-shell a deployment, worker label value safeguards
  • DH-15840: Error in CART after controller restart
  • DH-16156: Inaccurate Error Message when using PQ+ resolver
  • DH-16157: Error message being logged when using RemoteTableBuilder
  • DH-16222: Add KafkaTableWriter.disNameWithStorage
  • DH-16212: Add CORS filter to workers / web api server
  • DH-16122: Refresh Query Monitor user and group lists on ACL Editor changes
  • DH-16236: Prevent possible improper labels in k8s metadata
  • DH-16147: Update DHE C++ and R client for DHC 0.31.0/0.32.0
  • DH-16204: Refactor Core+ locations to better support Hive format
  • DH-15857: Controller now allows clients to re subscribe.
  • DH-16201: Fix intellij-only error in buildSrc/build.gradle
  • DH-16141: Sort ProcessEventLog in performance queries
  • DH-16138: Backport relevant csv import fixes from grizzly to vermilion
  • DH-16226: Fix Grid panel state persistence
  • DH-16223: Don't wrap query summary lines if there is enough space
  • DH-15591: Fix QueryMonitor recovery from web api service/controller restart
  • DH-16231: Fix scheduling issues in restored PQs after controller restart
  • DH-16150: Support for loading module plugins from workers, deephaven.ui from Code Studio
  • DH-16227: Fix an attempt to log a null Throwable from PresenceLeaseHandlerEtcd.abort
  • DH-16203: Improper Global State in Core+ Python Client
  • DH-16215: Rename Python Core+ Client Wheel
  • DH-16057: Core+ python client exception when closing session after manager
  • DH-16218: Set WorkerProtocolRegistry host and ports for Core+ workers
  • DH-16100: Fix getObject failing after web api service/controller restart
  • DH-16210: Fix usage of paste command with explicit /dev/stdin
  • DH-10941: Make DH_FORCE_NEW_CERTS work correctly
  • DH-16196: KafkaTableWriter Transformation should take UpdateGraph lock (fix duplicate graph names)
009DH-16155: Support ACLs for non-existing namespaces and table names
008Merge updates from 1.20231218.034
  • DH-16122: Refresh Query Monitor user and group lists on ACL Editor changes
  • DH-16236: Prevent possible improper labels in k8s metadata
  • DH-16147: Update DHE C++ and R client for DHC 0.31.0/0.32.0
  • DH-16204: Refactor Core+ locations to better support Hive format
  • DH-15857: Controller now allows clients to re subscribe.
  • DH-16201: Fix intellij-only error in buildSrc/build.gradle
  • DH-16141: Sort ProcessEventLog in performance queries
  • DH-16138: Backport relevant csv import fixes from grizzly to vermilion
  • DH-16226: Fix Grid panel state persistence
  • DH-16223: Don't wrap query summary lines if there is enough space
  • DH-15591: Fix QueryMonitor recovery from web api service/controller restart
  • DH-16231: Fix scheduling issues in restored PQs after controller restart
  • DH-16150: Support for loading module plugins from workers, deephaven.ui from Code Studio
  • DH-16227: Fix an attempt to log a null Throwable from PresenceLeaseHandlerEtcd.abort
  • DH-16203: Improper Global State in Core+ Python Client
  • DH-16215: Rename Python Core+ Client Wheel
  • DH-16057: Core+ python client exception when closing session after manager
  • DH-16218: Set WorkerProtocolRegistry host and ports for Core+ workers
  • DH-16100: Fix getObject failing after web api service/controller restart
  • DH-16210: Fix usage of paste command with explicit /dev/stdin
  • DH-10941: Make DH_FORCE_NEW_CERTS work correctly
  • DH-16196: KafkaTableWriter Transformation should take UpdateGraph lock (fix duplicate graph names)
007DH-15689: Automate creation of WorkspaceSnapshot query
006Merge updates from 1.20231218.034
  • DH-16085: Add new fields to the Query Summary screen
  • DH-16196: KafkaTableWriter Transformation should take UpdateGraph lock
  • DH-16168: Update routing.yml for kubernetes installations
  • DH-16185: UpdatePerformanceLogCoreV2 is missing UpdateGraph
  • DH-16179: Rename ProcessUniqueId to ProcessInfoId in Core Performance Tables
  • DH-16177: Controller PQ ensureShutdown avoids trying to cancel processing requests never sent
  • DH-16182: Fix Python wrapper bypassing liveness defaults
  • DH-16163: SystemTableLogger Error in V+
  • DH-16169: PerformanceOverview Fails on Core+ Workers without Updates
  • DH-15912: Improve worker startup consistency in Kubernetes when cert-manager is enabled
  • DH-14646: dynamic dis management
  • DH-16158: Fix scheduled jobs loop on scheduled stop spamming the controller log file
  • DH-16154: Web ACL Editor is Failing Over Envoy
  • DH-16151: Fix stopping PQ after controller restart doesn't work.
  • DH-15890: Ensure persistent query pod labels are always populated in k8s environments
  • DH-16137: Fix RemoteQueryDispatcher.workerServerPorts port range conflict with Linux ephemeral ports
  • DH-16111: Allow Flight Put requests for exports to support input tables an
  • DH-16094: Core+ workers survive controller restart step 3
  • DH-15800: Automatic Allocation of Kafka Resources in Kubernetes
  • DH-15695: Automatic Allocation of Kafka (In-Worker DIS) Resources in Kubernetes
  • DH-16135: Core+ workers should report update errors
  • DH-16136: Core+ performanceOverviewByPqName Timestamp Filter is Broken
  • DH-16120: Allow core+ pip to leverage pipcache
  • DH-16009: Fix auto-capitalization of field names in ProtobufDiscovery
  • DH-16123: Allow queryviewonly users to restart queries they have permissions to
  • DH-16054: Fix HierarchicalTables from Core+ workers not opening
  • DH-15890: Add persistent query info to worker pod labels in k8s
  • DH-16132: Upload installer next to tar/rpm in jfrog
  • DH-16110: ServiceRegistry.writers should include iris-dataimporters and iris-datamanagers by default
  • DH-16132: Delete obsolete installer upload task
  • DH-16116: Fix query monitor theme
  • DH-14599: Build launchers externally and download into iris
  • DH-16125: Add deephaven.remote_table to sphinx output
005Add Case for rc/grizzly in CL Check Action
004Fix forward merge Python beta patch version.
003Fix forward merge Python beta patch version.
002Merge updates from 1.20231218.007
  • DH-16105: Fix core+ nightly tests
  • Changelog format fix for deephaven.io
  • Javadoc fix.
  • Fix Python patch versions starting with "0"
  • DH-16037: CART needs to maintain AuthContext for internal Barrage subs
001Initial release creation from 1.20231218

Automatic Server Selection

Automatic server selection is now enabled by default on new non-Kubernetes installations. To turn it off or change the defaults, edit your iris-environment.prop file and remove or edit the relevant properties in the iris_controller stanza, as described in automated server selection.

Changes to tailer process memory requirements

Tailer now uses constrained and pre-allocated buffer pools for both User and System tables

With default configuration, the tailer now uses constrained and pre-allocated buffer pools for sending data to Data Import Servers. This makes the tailer's memory consumption more predictable, and avoids potential out-of-memory conditions. There are separate pools for User and System tables, governed by the following properties:

Property | Default | Description
DataContent.userPoolCapacity | 128 | The maximum number of User table locations that will be processed concurrently. If more locations are created at the same time, the processing is serialized.
DataContent.producerBufferSize.user | 256 * 1024 | The size in bytes of the buffers used to read and send data for User table locations.
DataContent.disableUserPool | false | If true, user table locations are processed without a constrained pool, in which case user actions can consume unbounded tailer resources.
DataContent.systemPoolCapacity | 128 | The maximum number of System table locations that will be processed concurrently. If more locations are created at the same time, the processing is serialized.
DataContent.producerBufferSize | 256 * 1024 | The size in bytes of the buffers used to read and send data for System table locations.
DataContent.disableSystemPool | false | Disable the system pool, which results in an unconstrained number of buffers being used in the tailer.
DataContent.producersUseDirectBuffers | true | Producers (the tailer) use a direct buffer for reading and sending data.

The tailer allocates two pools of buffers, one for user tables and one for system tables. Each item in that pool requires two buffers for concurrency, so the memory required will be double the buffer size times the pool capacity.

Total direct memory required for the tailer is approximately 2 * (DataContent.producerBufferSize * DataContent.systemPoolCapacity + DataContent.producerBufferSize.user * DataContent.userPoolCapacity).
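
As a worked example of the formula above, the following minimal sketch computes the approximate direct memory for the default values in the table. The function and parameter names are illustrative only and are not part of any Deephaven API.

```python
KIB = 1024

def tailer_direct_memory_bytes(
    system_buffer_size=256 * KIB,   # DataContent.producerBufferSize
    system_pool_capacity=128,       # DataContent.systemPoolCapacity
    user_buffer_size=256 * KIB,     # DataContent.producerBufferSize.user
    user_pool_capacity=128,         # DataContent.userPoolCapacity
):
    """Approximate direct memory for both pools; each pool item needs two buffers."""
    system_pool = system_buffer_size * system_pool_capacity
    user_pool = user_buffer_size * user_pool_capacity
    return 2 * (system_pool + user_pool)

default_total = tailer_direct_memory_bytes()
print(default_total // (KIB * KIB), "MiB")  # 128 MiB with the default values
```

With the defaults, each pool accounts for 64 MiB, so halving either pool capacity reduces the total by 32 MiB.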

Tailer now adjusts to DIS buffer size

When the tailer establishes a connection to a DIS, the processes now exchange configuration information. If the tailer is configured with a larger maximum message size than the DIS can accept, it reduces its maximum message size to match the DIS.

Change configuration to handle older tailers sending data to newer Data Import Servers

This section applies only to environments using default configuration values, and where tailers are outside the Deephaven system, either tailing from external systems or from different Deephaven installations. In this scenario, the tailer may send messages that are too large for the DIS to accept.

Older tailers do not adapt to DIS settings, and may have default configuration settings that allow them to send larger messages than a DIS with default configuration can accept. This typically happens only when the tailer falls behind, or otherwise has a file to process with more than 327,710 bytes.

When this happens, the DIS will reject the tailer with a message like:

WARN - DataImportServer-db_dis-TailerConnection-/127.0.0.1:55638-FullTableLocationKey{TableLookupKey.Immutable[Namespace/table/U],TableLocationLookupKey.Immutable[internal_partition/date]}:Rejecting and closing channel: Protocol error while processing stream: Message size 4400021 exceeds the maximum 327710. The client is configured with a larger maximum message size than this server. Check parameters DataContent.producerBufferSize and DataContent.consumerBufferSize. This most likely applies to the DIS and tailer processes.

A similar message might appear in the tailer log.

When an old tailer sends data to a new DIS, the DIS will log a warning like this:

WARN - Received TableIdentifierDataItem with old version: 4. The tailer will not protect against causing a read buffer overflow.

If your environment includes older tailers that send data to current DISes, you should make one of the following configuration changes:

  1. update the tailers to the current Deephaven version
  2. update the tailer configurations to set the max message size, by setting the following properties in the tailer configuration or properties file:
DataContent.producerBufferSize = 262144 
DataContent.producerBufferSize.user = 262144 
  3. update the DIS max message size to match the tailer max message size by setting the following properties in the DIS configuration or properties file:
DataContent.producerBufferSize = 2097160 

Note

The default max message size from the tailer is reduced as of 20240517.262. Tailers respond to DIS configuration settings as of 20240517.403.
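
To check whether an environment is affected, one option is to scan DIS logs for the rejection message quoted above. The following is a minimal sketch that relies only on the "Message size ... exceeds the maximum ..." wording shown in that message. The function name is hypothetical, and how the log lines are read is left to the caller.

```python
import re

# Pattern taken from the rejection message quoted in this section.
OVERSIZE = re.compile(r"Message size (\d+) exceeds the maximum (\d+)")

def find_oversized_messages(log_lines):
    """Yield (message_size, dis_maximum) for each rejection found in the log."""
    for line in log_lines:
        match = OVERSIZE.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2))

sample = [
    "Rejecting and closing channel: Protocol error while processing stream: "
    "Message size 4400021 exceeds the maximum 327710.",
]
for size, maximum in find_oversized_messages(sample):
    print(f"tailer sent {size} bytes; DIS accepts at most {maximum}")
```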

Database.inputTableUpdater improvements

The Database.inputTableUpdater method has been updated to return a new, more general interface io.deephaven.enterprise.database.inputtables.InputTableUpdater, which allows callers to have more control over table lineage with input tables. The InputTableUpdater can also be explicitly managed with a liveness scope, and is also now more efficient in caching intermediate operations.

When sourcing both an InputTableUpdater and an input Table view from db, the resulting objects may not share a common lineage:

def myTableUpdater = db.inputTableUpdater("MyNamespace", "MyTable")
def myTable = db.inputTable("MyNamespace", "MyTable")

myTableUpdater.add(new_data)
// myTable is not guaranteed to have new_data

With the new interface, callers can explicitly derive the corresponding input Table view from the InputTableUpdater:

def myTableUpdater = db.inputTableUpdater("MyNamespace", "MyTable")
def myTable = myTableUpdater.table()

myTableUpdater.add(new_data)
// myTable is guaranteed to have new_data

Note that the Python layer does not currently expose a stand-alone equivalent of InputTableUpdater; an input Table view must be created, and in that context, table lineage is already guaranteed.

my_table = db.input_table("MyNamespace", "MyTable")
my_table.add(new_data)
# my_table is guaranteed to have new_data

Breaking API change for Core+ C++ Client

We have changed the map type used by deephaven_enterprise::controller::Subscription. This map type is defined in the typedef deephaven_enterprise::controller::Subscription::map_type and it is used in an out parameter of deephaven_enterprise::controller::Subscription::Current().

The important user-visible change is that the map's find() operation now returns a const iterator that interoperates with its begin() and end() operations. This makes it behave more like C++'s familiar std::map container. Previously, our find() operation returned a pointer to the mapped type rather than an iterator.

The following example code shows the new usage of find.

Assuming this code:

    std::int64_t version;
    Subscription::map_type map;
    if (!sub.Current(&version, &map)) {
      std::cout << "Subscription closed\n";
      return;
    }

Old API:

    const auto *pq_info_ptr = map.find(pq_serial);
    if (pq_info_ptr == nullptr) {
      std::cout << "pq_serial not found\n";
      return;  // do not dereference a null pointer
    }

    auto pq_info = *pq_info_ptr;

New API: find() now returns an iterator

    auto it = map.find(pq_serial);
    if (it == map.end()) {
      std::cout << "pq_serial not found\n";
      return;  // do not dereference the end iterator
    }
    
    auto pq_info = it->second;

Add dhconfig properties delete command

The dhconfig properties command now supports a delete subcommand, which can be used to delete a properties file from etcd.

usage: dhconfig properties delete [--configfile <arg>] [--diskprops] [--etcd] [-f <arg>] [-h] [-k <arg> | -user <arg>]
       [-pf <arg>]  [-v]
Delete properties files from the system.

Options are as follows:
    --configfile <arg>   use the named properties file instead of the default
    --diskprops          read properties from the classpath, instead of etcd or the configuration service
    --etcd               use etcd directly, instead of configuration service (when combined with --diskprops, property
                         files are read from disk but written to etcd)
 -f,--file <arg>         specify the properties files to delete
 -h,--help               print help for a properties command
 -k,--key <arg>          specify a private key file to use for authentication
 -pf,--pwfile <arg>      specify a file containing the base64 encoded password for the user that is set with --user
 -user,--user <arg>      specify a user for authentication
 -v,--verbose            print additional logging, progress messages, and full exception text

Examples:
Delete my-prop-file.prop:
    dhconfig properties delete --file my-prop-file.prop

Enable Systemic object tracking for Core+ workers

Systemic object tracking is now enabled by default for Core+ workers. It increases the robustness of Persistent Queries by separating systemically important objects from user-created ones, reducing the chance that misbehaving user requests cause outages. This default aligns Core+ worker behavior with Legacy worker behavior.

Core+ updated to 0.37.0

The Core+ integration has been updated to Core version 0.37.0. As part of this update the Core+ Java Client libraries no longer support Java 8. Supported versions are now 11 and 17.

Python 3.8 is the oldest supported Python version

Although Python 3.8 has reached end of life (EOL), it is the newest version of Python that is built and tested for some versions of Deephaven.

As of Bard version 1.20211129.426, Python 3.8 is the only Python version built, and iris-defaults.prop changes the default from Python 3.6 to 3.8.

If you still have virtual environments set up with Python 3.6 or 3.7, you should replace them with Python 3.8 venvs. To use newer versions of Python, upgrade to a newer version of Deephaven.

For legacy systems, you can change the default back to Python 3.6 by updating your iris-environment.prop to set the various jpy.* props to the values found in iris-defaults.prop, inside the jpy.env=python36 stanza:

# Legacy python3.6 locations:
jpy.programName=/db/VEnvs/python36/bin/python3.6
jpy.pythonLib=/usr/lib64/libpython3.6m.so.1.0
jpy.jpyLib=/db/VEnvs/python36/lib/python3.6/site-packages/jpy.cpython-36m-x86_64-linux-gnu.so
jpy.jdlLib=/db/VEnvs/python36/lib/python3.6/site-packages/jdl.cpython-36m-x86_64-linux-gnu.so

The new iris-defaults.prop python props are now:

# New iris-defaults.prop python3.8 locations:
jpy.programName=/db/VEnvs/python38/bin/python3.8
jpy.pythonLib=/usr/lib/libpython3.8.so
jpy.jpyLib=/db/VEnvs/python38/lib/python3.8/site-packages/jpy.cpython-38-x86_64-linux-gnu.so
jpy.jdlLib=/db/VEnvs/python38/lib/python3.8/site-packages/jdl.cpython-38-x86_64-linux-gnu.so

Changes to Barrage subscriptions in Core+ Python workers

The methods subscribe and snapshotTable inside deephaven_enterprise.remote_table have been changed to return a Python deephaven.table.Table object instead of a Java io.deephaven.engine.table.Table object. This allows users to use the Python methods update_view, rename_columns, etc. as expected without wrapping the returned table.

Existing Python code that manually wrapped the table or directly called the wrapped Java methods must be updated.

Example of previous behavior:

from deephaven_enterprise import remote_table as rt

table = rt.in_local_cluster(query_name="SubscribePQ", table_name="my_table").snapshot()
table = table.updateView("NewCol = random()")

Example of new behavior:

from deephaven_enterprise import remote_table as rt

table = rt.in_local_cluster(query_name="SubscribePQ", table_name="my_table").snapshot()
table = table.update_view("NewCol = random()")

Upgrading from releases before Grizzly 1.20240517.273

If this is your first upgrade from a release older than Grizzly 1.20240517.273, you must make some manual changes to your iris-environment.prop file before running helm upgrade or the dh_helm script. These changes are not automated because this file includes changes made by customers that should not be overwritten.

Create a bash session on the management shell. For example:

kubectl exec -it deploy/management-shell -- bash

Export and edit iris-environment.prop:

/usr/illumon/latest/bin/dhconfig properties export -f iris-environment.prop -d /tmp/
vi /tmp/iris-environment.prop

Remove these entries entirely. Note that these entries are not grouped together in the file, and some of them may not be present depending on what version you are upgrading from. If they are present, the first three are likely to be within the first 50 lines of your file, the two final properties are likely to be found within a scoping block about halfway down, and the remaining entries would be found toward the end of the file:

Kubernetes.chart.name={{ .Chart.Name }}
Kubernetes.release.name={{ .Release.Name }}
Kubernetes.release.namespace={{ .Release.Namespace }}

final configuration.reload.userGroups=superuser
final PersistentQueryController.keyPairFile=/etc/sysconfig/deephaven/syskey/priv-controllerConsole.base64.txt

RemoteQueryDispatcher.workerStartupTimeoutMS=240000
RemoteQueryDispatcher.workerControlType=Kubernetes
EmbeddedDbConsole.remoteDatabaseRequestTimeoutMS=60000

Kubernetes.deployment=true
Kubernetes.start-worker-timeout-seconds=240
Kubernetes.query-worker-k8s-template=/configs/k8s-query-worker-template.yaml
Kubernetes.start-worker-dnd-timeout-seconds=240
Kubernetes.query-worker-dnd-k8s-template=/configs/k8s-query-worker-coreplus-template.yaml

Webapi.server.cus.enabled=true
Webapi.server.cus.home=/etc/sysconfig/deephaven/cus
[service.name=web_api_service] {
    BusinessCalendar.storeRawData=true
}

RemoteProcessingRequestProfile.Xms.G1 GC=$RequestedHeap
RemoteQueryDispatcher.JVMParameters=-XX:+AlwaysPreTouch
BinaryStoreWriterV2.allocateDirect=false

authentication.server.localsuperusers.file=/etc/sysconfig/deephaven/superusers.txt
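
Because these entries are scattered through the file, a small script can help locate them before editing. The following is a minimal sketch, not a Deephaven tool: the function name and the shortened key list are illustrative, and it only reports matching lines rather than modifying the file. Multi-line blocks such as the [service.name=web_api_service] stanza still need to be reviewed by hand.

```python
# Subset of the property names listed above; extend with the rest as needed.
KEYS_TO_REMOVE = {
    "Kubernetes.chart.name",
    "Kubernetes.release.name",
    "Kubernetes.release.namespace",
    "configuration.reload.userGroups",
    "PersistentQueryController.keyPairFile",
    "Kubernetes.deployment",
}

def find_entries(prop_text, keys=KEYS_TO_REMOVE):
    """Return (line_number, line) for each line that sets one of the keys."""
    hits = []
    for number, line in enumerate(prop_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("final "):
            stripped = stripped[len("final "):]  # ignore the 'final' modifier
        name = stripped.split("=", 1)[0].strip()
        if name in keys:
            hits.append((number, line))
    return hits

sample = """Kubernetes.chart.name={{ .Chart.Name }}
some.other.property=1
final configuration.reload.userGroups=superuser
"""
for number, line in find_entries(sample):
    print(f"line {number}: {line}")
```

To run it against the exported file, pass the contents of /tmp/iris-environment.prop to find_entries.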

Import the modified file:

/usr/illumon/latest/bin/dhconfig properties import -f /tmp/iris-environment.prop

You are now ready to run the helm upgrade command or the dh_helm script to perform the upgrade.

Managed User Authentication flag is required for local password authentication

When the property iris.enableManagedUserAuthentication is set to true, user passwords may be stored in the ACL store (etcd or MySQL). In previous versions of Deephaven, existing passwords could still be used for authentication even when this property was set to false. Now, when the property is set to false, the authentication server rejects password authentication. If you have changed this property to false from the default value of true, users must have an alternative means of authenticating to the system (e.g., SAML or Active Directory).

Added support for Iceberg tables in Core+

Core+ workers now support reading Iceberg tables as historical tables. To configure Iceberg, you must configure an Iceberg Endpoint, then discover and deploy a schema.

Configuring an IcebergEndpoint

The first step to link Iceberg tables into Deephaven is configuring an IcebergEndpoint, which we refer to as simply an endpoint. The endpoint contains the parameters required to locate and connect to the Iceberg catalog, the data warehouse, and the storage-specific parameters required to read the data. An endpoint is configured using the example below.

import io.deephaven.enterprise.iceberg.IcebergTools
import io.deephaven.extensions.s3.S3Instructions

// Create a new endpoint
endpoint = IcebergTools.newEndpoint()
        .catalogType("rest")                                 // The catalog is a REST catalog
        .catalogUri("http://mydata.com:8181")                // located at this URI. 
        .warehouseUri("s3://warehouse/")                     // The data warehouse is an S3 warehouse at this URI
        .putProperties("s3.access-key-id", "my_access_key",  // These are the properties required by the Iceberg API.
                       "client.region"   , "us-east-1")      // See https://iceberg.apache.org/docs/nightly/configuration/#configuration
        .putSecrets("s3.secret-access-key", "s3.key")        // Include any named secrets
        .dataInstructions(S3Instructions.builder()           // Configure the S3 data parameters
                .regionName("us-east-1")
                .build())
        .build("my_company_iceberg");                        // Explicitly name the endpoint.

from deephaven_enterprise import iceberg
from deephaven.experimental import s3

# Create the data instructions for reading data.
s3i = s3.S3Instructions(region_name="us-east-1")

# Create a new endpoint
endpoint = iceberg.make_endpoint(
    "rest",                              # The catalog is a REST catalog
    "http://mydata.com:8181",            # Located at this URI
    "s3://warehouse/",                   # The data warehouse is an S3 warehouse at this URI
    s3i,                                 # Attach the data instructions
    endpoint_name="my_company_iceberg",  # Explicitly name this endpoint
    # Set Iceberg configuration properties. See https://iceberg.apache.org/docs/nightly/configuration/#configuration
    properties={"s3.access-key-id": "my_access_key", "client.region": "us-east-1"},
    secrets={"s3.secret-access-key": "s3.key"},  # Include any named secrets
)

Properties

The properties component of the endpoint is a key-value map of Iceberg configuration parameters to their values. Valid property keys can be found in the Iceberg documentation at https://iceberg.apache.org/docs/nightly/configuration/#configuration.

Secrets

The secrets component is also a key-value map, where the keys are Iceberg configuration properties and the values are named references to secrets stored within Deephaven, so you do not need to include secrets in script text. When needed, the secrets are retrieved from Deephaven and merged into the properties before being used to access Iceberg. Secrets may be stored either in the Deephaven configuration file as a property or as a JSON map in a protected file on disk. More sophisticated secret stores are possible; contact support for more information. Secrets providers are visited in ascending priority order until one supplies a value or none can be found.

From Properties

Secrets may be stored in Deephaven configuration files as a simple property; for example, s3.access_key=1234secret4321. The default priority of the Properties secrets provider is 100 and can be adjusted using the property PropertiesSecretsProvider.priority.

From Files

Secrets may be stored in files on disk containing a simple JSON map. This format is more secure and better supports more complex secret values. You may configure multiple secrets files and their priorities using these properties:

Property | Description
FileSecretsProvider.name.path | The path to the secrets file for the provider name
FileSecretsProvider.name.priority | The priority of the secrets provider name

You may provide as many of these as you need, ensuring that each name is unique.

An example file:

{
  "s3.key" : "some_secret_key",
  "secret_url" : "https://verysecret.com:9001",
  "complicated" : "<Secret type=\"important\"><Internal>Secrecy</Internal></Secret>"
}
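
The lookup and merge behavior described in this Secrets section can be pictured with the minimal sketch below. It is not Deephaven's implementation: the function names are hypothetical, providers are modeled as plain (priority, mapping) pairs, and the priority of 50 for the file provider is made up for illustration.

```python
def resolve_secret(name, providers):
    """Visit providers in ascending priority order; the first one with a value wins."""
    for _priority, values in sorted(providers, key=lambda p: p[0]):
        if name in values:
            return values[name]
    raise KeyError(f"no secrets provider supplied {name!r}")

def merge_secrets(properties, secrets, providers):
    """Return a copy of properties with each secret reference resolved and merged in."""
    merged = dict(properties)
    for iceberg_key, secret_name in secrets.items():
        merged[iceberg_key] = resolve_secret(secret_name, providers)
    return merged

providers = [
    (100, {"s3.access_key": "1234secret4321"}),  # stand-in for the Properties provider
    (50, {"s3.key": "some_secret_key"}),         # stand-in for a file provider (priority assumed)
]
merged = merge_secrets(
    {"s3.access-key-id": "my_access_key", "client.region": "us-east-1"},
    {"s3.secret-access-key": "s3.key"},
    providers,
)
print(sorted(merged))
```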

Deployment

The endpoint can be deployed to Deephaven configuration as long as a name has been provided. Once deployed, you may reference the endpoint by name in the schema for Iceberg tables to avoid duplication.

// Deploy the endpoint to Deephaven configuration, failing if it already exists.
endpoint.deploy(false)

# Deploy the endpoint to Deephaven configuration, overwriting it if it already exists.
# The overwrite_existing parameter defaults to False, in which case the deployment fails if the endpoint already exists.
endpoint.deploy(overwrite_existing=True)

Discovering an Iceberg table

Once an endpoint has been configured, you can discover an Iceberg table to create and deploy a Deephaven schema. If you have previously configured and deployed an endpoint, you can retrieve it by name as well.

import io.deephaven.enterprise.iceberg.IcebergTools
import io.deephaven.enterprise.iceberg.discovery.DiscoveryConfig

// Load an endpoint that was already configured
endpoint = IcebergTools.getEndpointByName("my_company_iceberg")

// Discover a table, derive the schema, and deploy it, deriving the namespace and table name
// from the table identifier and referencing the endpoint by name.
discovery = IcebergTools.discover(DiscoveryConfig.builder()
                .tableIdentifier("market.trades")
                .endpoint(endpoint)
                .build())

discovery.deployWithEndpointReference()

from deephaven_enterprise import iceberg

# Load an endpoint that was already configured
endpoint = iceberg.get_named_endpoint("my_company_iceberg")

# Discover a table, derive the schema, and deploy it, deriving the namespace and table name
# from the table identifier and referencing the endpoint by name.
result = iceberg.discover("market.trades", endpoint)
result.deploy_named()

In the examples above, the Deephaven namespace and table name are derived directly from the Iceberg table identifier. You may specify your own by setting the namespace and tableName properties during discovery.

discovery = IcebergTools.discover(DiscoveryConfig.builder()
                .tableIdentifier("market.trades")
                .namespace("MarketUS")
                .tableName("EqTrades")
                .endpoint(endpoint)
                .build())

# Positional arguments must come before keyword arguments.
result = iceberg.discover(
    "market.trades", endpoint, namespace="MarketUS", table_name="EqTrades"
)

Complete Examples

Below are complete examples that create an endpoint, discover a table, deploy a schema, and then fetch the table.

import io.deephaven.enterprise.iceberg.IcebergTools
import io.deephaven.enterprise.iceberg.discovery.DiscoveryConfig
import io.deephaven.extensions.s3.S3Instructions

// Create a new endpoint
endpoint = IcebergTools.newEndpoint()
        .catalogType("rest")                                 
        .catalogUri("http://mydata.com:8181")                 
        .warehouseUri("s3://warehouse/")                     
        .putProperties("s3.access-key-id", "access_key",
                       "client.region" , "us-east-1")
        .putSecrets("s3.secret-access-key", "s3.key")
        .dataInstructions(S3Instructions.builder()
                .regionName("us-east-1")
                .build())
        .build("my_company_iceberg");

endpoint.deploy(true)

discovery = IcebergTools.discover(DiscoveryConfig.builder()
                .tableIdentifier("market.trades")
                .endpoint(endpoint)
                .build())

discovery.deployWithEndpointReference()

data = db.historicalTable("market", "trades")

from deephaven_enterprise import iceberg
from deephaven.experimental import s3

s3i = s3.S3Instructions(region_name="us-east-1")

endpoint = iceberg.make_endpoint(
    "rest",
    "http://mydata.com:8181",
    "s3://warehouse/",
    s3i,
    endpoint_name="my_company_iceberg",
    properties={"s3.access-key-id": "access_key", "client.region": "us-east-1"},
    secrets={"s3.secret-access-key": "s3.key"},
)
endpoint.deploy(True)

result = iceberg.discover("market.trades", endpoint)
result.deploy_named()

data = db.historical_table("market", "trades")

Tailer Pool for System Tables

The tailer now uses data content pools for system tables, so system table memory use is controlled in the same way as user table memory. Before this change, a tailer servicing many system table partitions could consume unbounded memory. The configuration properties that control the pool size are:

Property | Default | Meaning
DataContent.systemPoolCapacity | 128 | The number of items in the system pool, which is the number of system locations the tailer concurrently processes.
DataContent.disableSystemPool | false | Disable the system pool, which results in an unconstrained number of buffers being used in the tailer.
DataContent.producersUseDirectBuffers | true | Producers (the tailer) use a direct buffer for reading and sending data.

The data buffers are stored in either direct memory or heap memory, depending on the value of the DataContent.producersUseDirectBuffers property. Direct memory is recommended, because when passed an on-heap buffer the JVM allocates a new direct buffer while reading data from the file. If you adjust the size of the pool, then you should also adjust the tailer's maximum heap size or maximum memory size. The size of each buffer defaults to 256KB, but can be controlled by the DataContent.producerBufferSize property.

Vermilion+ Core+ updated to 0.35.2

Vermilion+ 1.20231218.440 includes version 0.35.2 of the Deephaven Core engine. This is the same version that ships with Grizzly in 1.20240517.189, enabling customers to have one Core engine version of overlap between major Deephaven Enterprise releases. Although the Core engine functionality is the same in 0.35.2, the Grizzly Core+ worker has several enhancements that are not available in the Vermilion+ Core+ worker. This change also updates gRPC to 1.61.0.

For details on the Core changes, see the following release notes:

Changes to vector support for Core+ user tables

Both the Legacy and Core engines have special database types to represent arrays of values. The Legacy engine uses the DbArray class, while the Core system uses the Vector class. While these implementations represent identical data, they pose challenges for interoperability between workers running different engines.

When a user table is written, the schema is inferred from the source table. Previously, Vectors were recorded verbatim in the schema. This change explicitly encodes Vector types as their base Java array types, as follows.

Vector Class | Converted Schema Type
ByteVector | byte[]
CharVector | char[]
ShortVector | short[]
IntVector | int[]
LongVector | long[]
FloatVector | float[]
DoubleVector | double[]
Vector<T> | T[]

This makes it possible for the Legacy engine to read User tables written by the Core engine. Note that no conversion is made when the Legacy engine writes DbArray types because the Core+ engine already supports those types.
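
The conversion table above amounts to a simple lookup. The sketch below only models the mapping of type names as strings, to make the rule concrete. It is not how the engine performs the conversion, and the function name is hypothetical.

```python
import re

# Primitive Vector classes and their schema types, copied from the table above.
PRIMITIVE_VECTORS = {
    "ByteVector": "byte[]",
    "CharVector": "char[]",
    "ShortVector": "short[]",
    "IntVector": "int[]",
    "LongVector": "long[]",
    "FloatVector": "float[]",
    "DoubleVector": "double[]",
}

def schema_type_for(type_name):
    """Return the schema type recorded for a column of the given type name."""
    if type_name in PRIMITIVE_VECTORS:
        return PRIMITIVE_VECTORS[type_name]
    generic = re.fullmatch(r"Vector<(.+)>", type_name)
    if generic:
        return generic.group(1) + "[]"   # Vector<T> becomes T[]
    return type_name                      # assumed: other types are recorded unchanged

print(schema_type_for("LongVector"))      # long[]
print(schema_type_for("Vector<String>"))  # String[]
```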

If you want your User table array columns to be Vector types, use an .update() or .updateView() clause to wrap the native arrays.

staticUserTable = db.historicalTable("MyNamespace", "MyTable")
                    .update("Longs = (io.deephaven.vector.LongVector)io.deephaven.vector.VectorFactory.Long.vectorWrap(Longs)")

Option to close Tailer-DIS connections early, while continuing to monitor files

A new property is available to customize the behavior of the Tailer.

log.tailer.defaultIdlePauseTime

This property is similar to log.tailer.defaultIdleTime, but it allows the Tailer to close connections early while continuing to monitor files. When the idle time specified by log.tailer.defaultIdleTime has passed without any changes to a monitored file, the Tailer closes the corresponding connection to the DIS and does not process any further changes to the file. The default idle time must therefore be at least as long as the default file rollover interval plus some buffer.

The new property enables a new feature. When the time specified by log.tailer.defaultIdlePauseTime has passed without any changes to a monitored file, the Tailer will close the corresponding connection to the DIS, but will continue to monitor the file for changes. If a change is detected, the Tailer will reopen the connection and process the changes. This will reduce or more quickly reclaim resources consumed for certain usage patterns.
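
The interaction between the two properties can be summarized with the minimal decision sketch below. It is an illustration of the behavior described in this section, not Tailer code. The function name is hypothetical, and the use of seconds is an assumption because the properties' units are not stated here.

```python
def tailer_action(idle_seconds, idle_pause_time=None, idle_time=None):
    """Describe what happens to a file that has been unchanged for idle_seconds.

    idle_pause_time models log.tailer.defaultIdlePauseTime.
    idle_time models log.tailer.defaultIdleTime.
    """
    if idle_time is not None and idle_seconds >= idle_time:
        return "close connection, stop processing file"
    if idle_pause_time is not None and idle_seconds >= idle_pause_time:
        return "close connection, keep monitoring file"
    return "keep connection open"

# Example values only: pause after 5 minutes, give up after 90 minutes.
for idle in (60, 600, 6000):
    print(idle, "->", tailer_action(idle, idle_pause_time=300, idle_time=5400))
```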

Helm Chart Tolerations, Node Selectors and Affinity

You can now add tolerations, node selection, and affinity attributes to pods created by the Deephaven Helm chart. By default, no tolerations, selectors or affinity are added. To add tolerations to all created deployments, modify your values.yaml file to include a tolerations block, which is then copied into each pod. For example:

tolerations:
- key: "foo"
  operator: "Exists"
  effect: "NoSchedule"
- key: "bar"
  value: "baz"
  operator: "Equal"
  effect: "NoSchedule"

This adds the following tolerations to each pod (in addition to the default tolerations provided by the Kubernetes system):

Tolerations:                 bar=baz:NoSchedule
                             foo:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s

Similarly, you can add a nodeSelector or affinity block:

nodeSelector:
  key1: "value1"
  key2: "value2"

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: label
          operator: In
          values:
          - value1

These settings result in pods containing node selectors like:

Node-Selectors:              key1=value1
                             key2=value2

And affinity as follows:

  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: label
            operator: In
            values:
            - value1
        weight: 1

Ability to disable password authentication in front-end

A new property, authentication.client.disablePasswordAuth=true, may be used to remove the username/password authentication option from the Swing front-end. The property has no effect if no other login options are available.

This property does not disable username/password authentication at the server level (see Disabling password authentication).

Allow config to override ServiceRegistry hostname

The hostname that the Data Import Server (DIS) registers with the service registry may now be defined in the host tag within the DIS's routing endpoint in the routing configuration, or with the new ServiceRegistry.overrideHostname system property. The service registry host is determined in the following order of precedence:

  • The routing endpoint configuration. Prior to this change, the host value within the endpoint configuration was ignored.
  • ServiceRegistry.overrideHostname property.
  • On Kubernetes, the worker's service's hostname.
  • On bare metal, it is the result of the Java InetAddress.getLocalHost().getHostName() function.
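
The precedence list above can be read as a first-match lookup. The sketch below illustrates that reading, assuming the list is ordered from highest to lowest precedence. The function and parameter names are hypothetical, and socket.gethostname() merely stands in for Java's InetAddress.getLocalHost().getHostName().

```python
import socket

def service_registry_host(endpoint_host=None, override_hostname=None,
                          kubernetes_service_host=None):
    """Return the first configured host, following the precedence list above."""
    for candidate in (endpoint_host, override_hostname, kubernetes_service_host):
        if candidate:
            return candidate
    # Bare-metal fallback: the local host name.
    return socket.gethostname()

print(service_registry_host(endpoint_host="dis.example.com",
                            override_hostname="override.example.com"))
```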

Selection provider marking a query server as down

You can temporarily remove a query server from the list of servers that the controller uses for the Auto_Query and Auto_Merge groups. This allows the administrator to prevent new queries or consoles from being automatically assigned to that query server. The query server can still be selected manually and existing running queries are not evicted. This can be useful, for example, if a query server is malfunctioning, and you would like to remove it from the rotation, but still have access to it for debugging. To mark Query_1 as administratively down, use dhconfig pq selection-provider:

dhconfig pq selection-provider --command down --node Query_1

To mark the server as up:

dhconfig pq selection-provider --command up --node Query_1

The down state of a server is not persisted; on controller restart, all servers are marked as up. To permanently remove a server, you should re-run the Deephaven installer or edit the iris-endpoints.prop file and reload the controller.

Note: these commands are only relevant if you have the default SimpleServerSelectionProvider enabled.

Server Selection Provider Backoff

If an individual server fails to acquire workers, it often becomes the least-loaded server, resulting in the SimpleServerSelectionProvider assigning all new workers to that server. A new backoff policy has been introduced to prevent immediately assigning a worker to a server with an acquisition failure. Before assigning another worker to that server, the selection provider ensures that each of the other servers has had a worker assigned to it. Once a worker is successfully assigned to the query server, the state is cleared.

This new behavior prevents the algorithm from getting "stuck" assigning workers to the failed server but does periodically attempt to assign workers to handle transient failures or misconfigured queries.

To enable the new backoff behavior, set the configuration property:

SimpleServerSelectionProvider.FailureBackoffPolicy=ALL_OTHERS_AFTER_ACQUISITION_FAILURE

The default behavior remains the same as previous versions. To explicitly set this behavior, set the property to NONE:

SimpleServerSelectionProvider.FailureBackoffPolicy=NONE

Active Directory Group Synchronization

As with SAML, you can now synchronize groups from an Active Directory server to your Deephaven installation.

After a user authenticates with the Active Directory server, if group synchronization is enabled, the Authentication Server examines the memberOf attribute of the user's record. If the groups from Active Directory do not match the user's Deephaven groups, the relevant group names are added to or removed from the ACL store using the ACL write server.

The following properties control the behavior of group synchronization:

Property                                                    | Default | Description
------------------------------------------------------------+---------+------------------------------------------------------------------
authentication.server.ldap.synchronizeGroups                | false   | If true, a user's groups are synchronized on user login.
authentication.server.ldap.sync.ignoregroups.directory      |         | Groups to ignore from the directory server.
authentication.server.ldap.sync.ignoregroups.dh             |         | Deephaven groups to ignore when removing excess groups.
authentication.server.ldap.sync.mapgroup.<directory_group>  |         | Each directory group can be mapped to one or more Deephaven groups by adding additional properties.
authentication.server.ldap.sync.aclwriteuser                | iris    | The username for ACL writes.
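The interaction between the ignore lists and the group mapping can be sketched as a set computation. This Python sketch is illustrative only and is not the Authentication Server's code. The function name and parameters are hypothetical, and it assumes that memberOf values have already been reduced to plain group names, that an unmapped directory group keeps its own name, and that a mapping replaces the directory group name rather than adding to it.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def plan_group_sync(
    directory_groups: Iterable[str],
    deephaven_groups: Iterable[str],
    ignore_directory: Iterable[str] = (),
    ignore_dh: Iterable[str] = (),
    group_map: Optional[Dict[str, List[str]]] = None,
) -> Tuple[List[str], List[str]]:
    """Return (groups_to_add, groups_to_remove) for one user (hypothetical helper)."""
    group_map = group_map or {}
    ignored_directory = set(ignore_directory)

    desired = set()
    for group in directory_groups:
        if group in ignored_directory:
            continue  # ignored directory groups never reach Deephaven
        # Assumption: a mapped group is replaced by its Deephaven group names.
        desired.update(group_map.get(group, [group]))

    current = set(deephaven_groups)
    to_add = desired - current
    # Deephaven groups on the ignore list are never removed.
    to_remove = current - desired - set(ignore_dh)
    return sorted(to_add), sorted(to_remove)
```

For example, a user in directory groups traders, vpn-users, and quants, with vpn-users ignored and quants mapped to research and analytics, would gain research and analytics and keep traders. Any Deephaven-only group that is not on the Deephaven ignore list would be removed.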

Column ACL Normalization

In previous versions, column ACLs were not normalized, meaning a system could contain column ACL keys that only differed by column order or duplicates, e.g.:

Group| Namespace|      TableName|   Columns|                                          Filter
----------+----------+---------------+----------+------------------------------------------------
test      |DbInternal|ProcessEventLog|b1        |new UsernameFilterGenerator("EffectiveUser")
test      |DbInternal|ProcessEventLog|b1,b1,b1  |new UsernameFilterGenerator("AuthenticatedUser")
test      |DbInternal|ProcessEventLog|b1,b2,b3  |new UsernameFilterGenerator("EffectiveUser")
test      |DbInternal|ProcessEventLog|b3,b2,b1  |new UsernameFilterGenerator("AuthenticatedUser")

Column specifications "b1" and "b1,b1,b1" refer to the same columns, as do "b1,b2,b3" and "b3,b2,b1". This is problematic, particularly when effectively equivalent column ACL keys have different filter values, as shown above. In this case, behavior is undefined.
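The idea of normalization, and of detecting keys that collide once normalized, can be sketched in Python. This is an illustration under stated assumptions rather than the ACL write server's algorithm: the function names are hypothetical, and the canonical form chosen here (duplicates removed, names sorted) is an assumption, since the exact normalized form is not described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

AclRow = Tuple[str, str, str, str, str]  # group, namespace, table, columns, filter


def normalize_columns(spec: str) -> str:
    """Return an assumed canonical form: duplicates removed, names sorted."""
    names = {name.strip() for name in spec.split(",") if name.strip()}
    return ",".join(sorted(names))


def find_conflicts(acls: Iterable[AclRow]) -> Dict[Tuple[str, str, str, str], List[str]]:
    """Return normalized keys that map to more than one distinct filter."""
    filters = defaultdict(set)
    for group, namespace, table, columns, acl_filter in acls:
        key = (group, namespace, table, normalize_columns(columns))
        filters[key].add(acl_filter)
    return {key: sorted(found) for key, found in filters.items() if len(found) > 1}


# The four rows from the example table above.
ROWS = [
    ("test", "DbInternal", "ProcessEventLog", "b1", 'new UsernameFilterGenerator("EffectiveUser")'),
    ("test", "DbInternal", "ProcessEventLog", "b1,b1,b1", 'new UsernameFilterGenerator("AuthenticatedUser")'),
    ("test", "DbInternal", "ProcessEventLog", "b1,b2,b3", 'new UsernameFilterGenerator("EffectiveUser")'),
    ("test", "DbInternal", "ProcessEventLog", "b3,b2,b1", 'new UsernameFilterGenerator("AuthenticatedUser")'),
]
```

Applied to ROWS, find_conflicts reports two normalized keys ("b1" and "b1,b2,b3"), each with two different filters. These are the kind of entries that must be resolved by hand.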

The ACL write server attempts to normalize column ACLs on startup automatically. However, if there are effectively equivalent column ACLs with different filter values, automatic normalization fails with a report of the problematic column ACLs in the ACL write server logs. At this point, the Deephaven administrator must:

  1. Set property DbAclWriteServer.startup.normalizeColumnAcls=false.
  2. Restart the ACL write server.
  3. Remove problematic column ACLs.
  4. Set property DbAclWriteServer.startup.normalizeColumnAcls=true.
  5. Restart the ACL write server.

Continuing to run with de-normalized column ACLs is not recommended.

As always, before a software upgrade, you should back up your ACL database.

New Persistent Query range scheduling

A new Range scheduling option allows date/time ranges. A single start date/time and a single stop date/time can be specified.

  • If the start date/time is not specified, the Persistent Query is started immediately.
  • If the stop date/time is not specified, the Persistent Query is not stopped once it is started.
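The two rules above amount to treating a missing start or stop as an open-ended range. A minimal Python sketch follows. The function name is hypothetical, and the boundary handling (start inclusive, stop exclusive) is an assumption that the notes above do not specify.

```python
from datetime import datetime
from typing import Optional


def should_be_running(
    now: datetime,
    start: Optional[datetime] = None,
    stop: Optional[datetime] = None,
) -> bool:
    """Return True if a range-scheduled query should be running at 'now'."""
    if start is not None and now < start:
        return False  # the start date/time has not been reached yet
    if stop is not None and now >= stop:
        return False  # the stop date/time has passed
    # No start means "start immediately"; no stop means "never stop".
    return True
```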

Removal of dhconfig base64 password options

The dhconfig tool no longer accepts the --password option for plaintext passwords to help reduce the chance of leaking secrets into shell history or log files. Interactive password authentication is still supported. Private key authentication should be used for non-interactive operations.

Managing users, groups, and ACLs with dhconfig

The dhconfig tool can now manage users, groups, and ACLs using the subcommand acls. For example, you can add users to groups with:

dhconfig acl groups add-member --name user1 user2 --group groupA groupB

Complete help is available by running dhconfig acls --help. This example demonstrates one of the improvements enabled by using dhconfig; the old iris_db_user_mod tool cannot add multiple users to multiple groups in a single command.

All existing operations that were performed by /usr/illumon/latest/bin/iris iris_db_user_mod have been integrated into dhconfig, and the old iris_db_user_mod tool will be removed in a future release.

Each operation provides additional help for its specific arguments; for example, dhconfig acls user --help describes the user management commands.

DataImportServers respond to data routing changes

DataImportServers now respond to data routing changes. If the data routing configuration is changed (via the dhconfig routing or dhconfig dis commands), the DIS receives a notification of the change. The only change it can respond to is a change in the filters, either explicit changes to the filter or changes implied by claims made by this or other DISes. If any other configuration change is detected, the DIS prints error messages indicating that the configuration is out of date until it is restarted.

Disable changes (other than logging) with property:

DataImportServer.ignoreRoutingConfigChanges=true

DIS interface changes may require script updates

The DataImportServer interface changes may require script changes. Please see the sections below.

Deprecated methods

The following deprecated methods in com.illumon.iris.db.tables.dataimport.logtailer.DataImportServer have been removed:

DataImportServer(
    com.fishlib.io.logger.Logger,
    com.illumon.iris.db.v2.routing.DataImportServiceConfig,
    com.fishlib.configuration.Configuration)
DataImportServer.getDataImportServer(
    com.fishlib.io.logger.Logger,
    com.illumon.iris.db.v2.routing.DataImportServiceConfig,
    com.fishlib.configuration.Configuration,
    com.illumon.iris.db.schema.SchemaService)
DataImportServer.start()

Instead, use DataImportServer.startInWorker() or DataImportServer.start(StatsIntradayLogger.NULL).

The following methods are newly deprecated to support this functionality change and will be removed in the next release:

DataImportServer(
    com.illumon.iris.db.util.logging.DataImportServerLogFactory,
    com.illumon.iris.db.v2.routing.DataImportServiceConfig,
    com.fishlib.configuration.Configuration,
    com.illumon.iris.db.schema.SchemaService)
DataImportServer.getDataImportServer(
    com.illumon.iris.db.util.logging.DataImportServerLogFactory,
    com.illumon.iris.db.v2.routing.DataImportServiceConfig,
    com.fishlib.configuration.Configuration,
    com.illumon.iris.db.schema.SchemaService)

The replacement methods for constructing a DataImportServer are:

    /**
     * Construct a {@link DataImportServer}.
     *
     * @param logFactory     optional IrisLogCreator for access to loggers
     * @param disName        the name of the DIS configuration in the data routing service
     * @param configuration  the configuration to use when fetching settings that aren't included in disConfig
     * @param schemaService  the SchemaService to use
     * @param routingService the DataRoutingService to use for configuration
     * @param storageRoot    optional storage root to use if the DIS configuration specifies "private"
     */
    public DataImportServer(@Nullable DataImportServerLogFactory logFactory,
                            @NotNull final String disName,
                            @NotNull final Configuration configuration,
                            @NotNull final SchemaService schemaService,
                            @NotNull final DataRoutingService routingService,
                            @Nullable final String storageRoot);
    /**
     * Create a new DataImportServer instance, according to the configuration passed in.
     *
     * @param logCreatorParam optional IrisLogCreator, will be used to create logger. If null, a global instance will be used, if available.
     * @param disName         the name of the DIS configuration to use
     * @param configuration   the configuration to use when fetching settings that aren't included in disConfig
     * @param schemaService   the SchemaService to use
     * @param routingService  the DataRoutingService to use for configuration
     * @param storageRoot     optional storage root to use when configured storage is "private"
     * @return a new DataImportServer instance
     */
    public static DataImportServer getDataImportServer(@Nullable final DataImportServerLogFactory logCreatorParam,
                                                       @NotNull final String disName,
                                                       @NotNull final Configuration configuration,
                                                       @NotNull final SchemaService schemaService,
                                                       @NotNull final DataRoutingService routingService,
                                                       @Nullable final String storageRoot);

Script changes

The new methods are more explicit about the data storage location and need a DataRoutingService to respond dynamically to data routing configuration changes.

Make the following changes to your scripts:

  • Instead of getting a DataImportServiceConfig from the DataRoutingService, pass the DataRoutingService and configuration name as parameters.
  • Specify a storage location or leave it as null if the storage is specified in the DIS configuration.

This example shows the required changes from a previously documented example.

Before:

routingService = DataRoutingServiceFactory.getDefault()

// if the dis configuration specifies "private" storage, then you must use getDataImportServiceConfigWithStorage
disConfig = routingService.getDataImportServiceConfigWithStorage("Ingester1", "/db/dataImportServers/Ingester1")
// this call assumes named storage configured in the routing file
//disConfig = routingService.getDataImportServiceConfig("Ingester1")
dis = DataImportServer.getDataImportServer(ProcessEnvironment.getDefaultLog(), disConfig, Configuration.getInstance(), SchemaServiceFactory.getDefault())
dis.startInWorker()

After:

routingService = DataRoutingServiceFactory.getDefault()

// if the dis configuration specifies "private" storage, then you must provide a storage root, otherwise set this to null
storage = "/db/dataImportServers/Ingester1"
disName = "Ingester1"

dis = DataImportServer.getDataImportServer(
        null, 
        disName, 
        Configuration.getInstance(),
        SchemaServiceFactory.getDefault(),
        routingService,
        storage)
dis.startInWorker()

Alternate launch option for Deephaven Core+ local workers

Deephaven launches local worker processes by creating a command using various system configurations and user-specified inputs, such as additional classpath or memory settings, and executes that process directly. With this release, it is possible to configure Core+ workers so that the command is supplied to another process that can launch it. As an example, this would allow for configuring the system to run workers that are authenticated with a Kerberos keytab.

The default Core+ worker is launched with this command:

$ /usr/illumon/latest/bin/deephaven-coreplus ..

If your configuration provides a commandPrefix for your Core+ worker kind, then the default command will be provided to the command you specify as the prefix.

As an example, this configuration setting:

WorkerKind.DeephavenCommunity.commandPrefix=k5start -f my-krb5.keytab -U -- /bin/sh -c

Results in the workers being launched like this:

$ k5start -f my-krb5.keytab -U -- /bin/sh -c '/usr/illumon/latest/bin/deephaven-coreplus ..'

By default, the commandPrefix is tokenized by spaces. If you require a different delimiter, you can define it with this property:

# Use a comma instead of the default space
WorkerKind.DeephavenCommunity.commandPrefixDelimiter=,
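The tokenization rule can be sketched in Python. This is an illustration based only on the k5start example above, not the dispatcher's code. The function name is hypothetical, and it assumes that the whole default worker command is appended to the prefix tokens as a single argument, which is what the quoted command in the example suggests.

```python
import shlex
from typing import List, Optional


def build_launch_argv(
    worker_command: str,
    command_prefix: Optional[str] = None,
    delimiter: str = " ",
) -> List[str]:
    """Build an argument vector for launching a worker (hypothetical helper)."""
    if not command_prefix:
        # No prefix configured: the worker command is executed directly.
        return shlex.split(worker_command)
    # Split the prefix on the configured delimiter, dropping empty tokens.
    prefix_tokens = [token for token in command_prefix.split(delimiter) if token]
    # Assumption: the default command is passed to the prefix as one argument.
    return prefix_tokens + [worker_command]
```

A non-space delimiter is useful when a prefix token itself contains a space. With a comma delimiter, the prefix "k5start,-f,my keytab.keytab,-U,--,/bin/sh,-c" keeps "my keytab.keytab" as a single token.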

Optional lenient IOJobImpl to avoid write queue overflow

New behavior is available to avoid write queue overflow errors in the TDCP process. When a write queue overflow condition is detected, the process can be configured to delay briefly, giving the queue a chance to drain.

The following properties govern the feature:

IOJobImpl.lenientWriteQueue
IOJobImpl.lenientWriteQueue.retryDelay
IOJobImpl.lenientWriteQueue.maxDelay

Set IOJobImpl.lenientWriteQueue=true to enable this behavior. By default, the writer will wait up to IOJobImpl.lenientWriteQueue.maxDelay=60_000 ms in increments of IOJobImpl.lenientWriteQueue.retryDelay=100 ms.
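The relationship between retryDelay and maxDelay is a bounded retry loop. The following Python sketch only illustrates that pattern and is not the IOJobImpl code. The function name, the try_enqueue callback, and the choice of OverflowError are assumptions, and the sleep parameter exists only so that the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def write_with_lenient_queue(
    try_enqueue: Callable[[], bool],
    retry_delay_ms: int = 100,
    max_delay_ms: int = 60_000,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry try_enqueue until it succeeds; return the total milliseconds waited.

    Raises OverflowError once the total wait reaches max_delay_ms.
    """
    waited_ms = 0
    while not try_enqueue():
        if waited_ms >= max_delay_ms:
            raise OverflowError("write queue overflow")
        sleep(retry_delay_ms / 1000.0)  # give the queue a chance to drain
        waited_ms += retry_delay_ms
    return waited_ms
```

With the defaults shown above, a writer would retry up to 600 times (60,000 ms in 100 ms steps) before reporting the overflow.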

This should address the following fatal error in the TDCP process:

ERROR - job:1424444133/RemoteTableDataService/10.128.1.75:37440->10.128.1.75:22015 write queue overflow: r=true, w=true, p=false, s=false, u=false, h=0, rcap=69632, rbyt=0, rmax=4259840, wbyt=315407, wspc=1048832, wbuf=4097, wmax=1048576, fc=0, allowFlush=true

Persistent Query Replicas and Spares

Persistent Queries now support redundancy and automatic failover with the replicas and spares feature. Any Live Persistent Query can now specify a number of Replica and Spare workers to start. Replica workers are identical copies of the query that the Controller distributes user load across. Spares are identical copies used by the controller to replace failed replicas. When more than one replica is running, the replica workers are independent from each other. Replica queries are suitable for UI interactions, but care must be taken that the replicas do not perform actions that would be unsafe to concurrently perform without coordination.

Administrators of a query (including members of the iris-querymanagers or iris-superusers group) see the state of all replicas in the Query Monitor. A viewer only sees the replica that they are assigned to in the Query Monitor.

Load Balancing

When a query is configured with more than one replica, the controller uses a configurable policy called an assignment policy to distribute users across all available workers. Each Persistent Query Configuration has a number of "slots" equal to the number of replicas. An assignment policy, which is sticky to prevent a single user's UI-driven actions from negatively impacting more than one replica, assigns a user to a specific slot.

By default, Deephaven provides a single "Round Robin" assignment policy that distributes users evenly across the available workers. Customers may implement the io.deephaven.enterprise.controller.assignment.AssignmentPolicy interface to provide custom assignment policies.

Assignment policies are configured by adding values to your property file as follows:

PersistentQueryController.AssignmentPolicy.PolicyName.class=com.company.MyPolicyClass
PersistentQueryController.AssignmentPolicy.PolicyName.displayName=A Display Name
PersistentQueryController.AssignmentPolicy.PolicyName.description=Describes the policy

The assignment policy receives two callbacks whenever the number of replicas changes. The first callback occurs before any currently running replicas are shut down so that users can be reassigned. The second callback occurs after the changes are applied so that user load can be redistributed.
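The sticky round-robin behavior described in this section can be sketched in Python. This is a conceptual illustration only. It does not implement the io.deephaven.enterprise.controller.assignment.AssignmentPolicy interface, whose methods are not listed in these notes, and the class and method names are hypothetical.

```python
from typing import Dict


class RoundRobinAssigner:
    """Toy sticky round-robin assignment of users to replica slots."""

    def __init__(self, slot_count: int) -> None:
        if slot_count < 1:
            raise ValueError("slot_count must be at least 1")
        self.slot_count = slot_count
        self.assignments: Dict[str, int] = {}
        self._next = 0  # counter used to hand out slots in turn

    def slot_for(self, user: str) -> int:
        # Sticky: a user who already has a slot always gets the same one back.
        if user not in self.assignments:
            self.assignments[user] = self._next % self.slot_count
            self._next += 1
        return self.assignments[user]
```

With two slots, the first three new users are assigned slots 0, 1, and 0, and a returning user keeps the slot that they were given originally.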

Automatic Failover

When a query is configured with spares and a replica crashes for any reason, the controller will immediately replace the crashed worker with a running spare if it is available. Each replica slot independently respects the "Error Restart Attempts" scheduling parameter of the query configuration. This can prevent a single slot from continually claiming a spare query to the exclusion of other replicas.

Additional Changes

A new DbInternal table QueryUserAssignmentLog has been added so that Assignment Policies can log how users are assigned to available replicas.

-add_acl 'new UsernameFilterGenerator("QueryOwner")' -group allusers -namespace DbInternal -table QueryUserAssignmentLog -overwrite_existing
-add_acl * -group iris-superusers -namespace DbInternal -table QueryUserAssignmentLog -overwrite_existing

Server Selection Provider Replica Distribution

The default SimpleServerSelectionProvider attempts to more evenly spread replicas and spares of a single persistent query across the cluster so that a single server failure is less likely to impact all replicas of a running query.

The IServerSelectionProvider interface now requires three additional methods to be implemented:

  • addWorkerForSerial is invoked when a worker for a persistent query is assigned to a server
  • removeWorkerForSerial is invoked when a worker for a persistent query is terminated
  • generateStatusInformation generates internal state suitable for display by dhconfig pq selection-provider

Option to default all user tables to Parquet

Set the configuration property db.LegacyDirectUserTableStorageFormat=Parquet to default all direct user table operations, such as db.addTable, to the Parquet storage format. The default if the property is not set is DeephavenV1.

Default Landing Page

The default landing page for a Deephaven installation now automatically redirects to the web UI. To download the Swing Launcher, go to the /launcher path (e.g., https://deephaven.example.com:8123/launcher/ or https://deephaven.example.com:8000/launcher/).

Disabling Password Authentication

To disable password authentication within the authentication server, set the configuration property authentication.passwordsEnabled to false. When the property is set to false, the authentication server rejects all password logins and you must use SAML or private key authentication to access Deephaven.

Note that even if the UI presents a password prompt, the authentication backend rejects all passwords.

Deephaven processes log their heap usage

The db_dis, web_api_service, log_aggregator_service, iris_controller, db_tdcp, and configuration_server processes now periodically log their heap usage.

PersistentQueryController.log.current:[2024-05-10T15:00:32.365219-0400] - INFO - Jvm Heap: 3,972,537,856 Free / 4,291,624,960 Total (4,291,624,960 Max)
PersistentQueryController.log.current:[2024-05-10T15:01:32.365404-0400] - INFO - Jvm Heap: 3,972,310,192 Free / 4,291,624,960 Total (4,291,624,960 Max)

The logging interval can be configured using the property RuntimeMemory.logIntervalMillis. The default is one minute.

Worker Additional Memory Request

When creating a Code Studio or Persistent Query, Deephaven allows you to configure the heap size of the Java virtual machine running your process. The RemoteQueryDispatcher uses this parameter not only to pass appropriate arguments to the Java process, but also to account for the RAM used by workers and, when running in Kubernetes, to set the resource requests and limits.

A Deephaven worker can use off-heap memory in various circumstances. For example, Java libraries may use direct buffers to reduce the overhead on the garbage collector or to improve I/O performance. When executing a Python worker, Python objects are not part of the Java heap, but are instead allocated and garbage collected by the Python interpreter's memory management subsystem. Similarly, any native libraries also use off-heap memory.

If your worker process uses such off-heap memory, you can request that memory from the dispatcher using the "Additional Memory" field under "Advanced Settings" in the Code Studio start screen and Persistent Query Settings tab. This is especially important on Kubernetes, where the OS kernel forcibly kills a process if its memory usage exceeds the container's limit.

Authorized Keys and Local Users Text Files

Deephaven no longer uses the dsakeys.txt or authusers.txt files by default. The Deephaven 1.20221001 release added support for storing public keys in the ACL store, and password hashes can be stored in the ACL store when iris.enableManagedUserAuthentication is set to true (which has been the default in iris-environment.prop in all supported software versions). Passwords and public keys can be manipulated with dhconfig acl user set-password and dhconfig acl publickey, respectively.

On upgrade, any keys stored in dsakeys.txt are migrated to the ACL store. On installation or upgrade, system keys for the controller, merge server, and TDCP are written to the ACL store.

To re-enable text file based keys, the following properties must be set in your iris-environment.prop file:

    authentication.server.localusers.enabled=true
    authentication.server.localusers.file=<name>
    authentication.server.authorizedkeys.enabled=true
    authentication.server.authorizedkeys.file=<name>

In-Worker Service Persistent Query Type Removed

The previously deprecated In-Worker Service Persistent Query type has been removed. Any existing persistent queries that are an In-Worker Service should be converted to use an appropriate script type before upgrading Deephaven. Most existing In-Worker Service queries are likely Data Import Servers, which can be converted with a script to a type of Live Query - Merge Server.

etcd configuration parameter 'bcrypt-cost: 4' added

When a system is installed, the etcd configuration file config.yaml adds the configuration parameter bcrypt-cost: 4.

This affects creation of new etcd users used internally by the Deephaven software. No changes are needed for Deephaven usernames or passwords stored in etcd. For a newly installed system this change applies immediately.

On a system that is being updated from an existing version, the installer does not modify the etcd configuration files, so the change must be made manually. On each node running etcd, edit the file /etc/etcd/dh/ETCD_CLUSTER_ID/config.yaml (replace ETCD_CLUSTER_ID with your etcd cluster ID, a string that looks something like 'cfa6c474b'). In each config.yaml file, add a line containing:

bcrypt-cost: 4

On a system where the config file has been modified to add the new bcrypt cost value, this new value will only apply for new users created after the change, or for old users if their passwords are updated after the change; for existing users that do not update their password nothing changes.

A script, redo-etcd-passwords, is provided under the bin directory. It updates the passwords for all etcd users except root. The password value itself does not change, but resetting a password to the same value has the side effect that the new bcrypt cost is used when that user's password is validated. Run the script as the irisadmin user while the etcd cluster is up but the Deephaven Enterprise system is down.

Background

etcd uses Go's bcrypt function to hash passwords. Passwords are stored on disk hashed, to avoid compromising them immediately if an attacker is able to observe the data on disk (this is similar to how Unix/Linux systems store passwords hashed in the /etc/passwd file). When a user tries to authenticate with etcd and provides a password, etcd hashes the password again and compares the hash with the value stored on disk. The bcrypt function is computationally intensive; this is intentional, so that an attacker who is trying to guess passwords cannot probe them quickly. This, however, has an impact in a system like Deephaven Enterprise:

  • Deephaven Enterprise creates workers on demand, sometimes starting them as scheduled persistent queries. When many workers are starting at the same time, an etcd server machine can become very busy validating credentials via bcrypt for many new clients.

  • When an etcd server that is leader fails, and a new server takes over as leader, many clients switch to the new server, all of them requiring the server to execute the bcrypt function to validate credentials; this can load a machine at the worst possible moment from a fault tolerance perspective.

For these reasons, it is desirable to reduce the cost of bcrypt to the minimum configurable value. The default value used by Go's bcrypt library (and etcd) is 10. The minimum value is 4. This configuration change sets the value explicitly to 4; previously the value was not configured, so the default was used.

etcd stores the bcrypt value used when a password is created or modified, to know at what value of bcrypt cost to validate the hash. Therefore, when the bcrypt cost value is changed it only applies for new users or after updating an existing user's password.
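To give a sense of scale, the bcrypt cost is a base-2 logarithm: the key-setup loop runs 2^cost times. The short Python calculation below compares the default and the new value. The function name is illustrative, and the ratio describes the amount of hashing work, which is only an approximate guide to wall-clock time on a given machine.

```python
def bcrypt_iterations(cost: int) -> int:
    """Number of key-expansion iterations bcrypt performs for a given cost."""
    return 2 ** cost


DEFAULT_COST = 10  # default used by Go's bcrypt library and etcd
NEW_COST = 4       # value set by bcrypt-cost: 4

# Ratio of hashing work per password validation: 2**10 / 2**4 = 64.
work_ratio = bcrypt_iterations(DEFAULT_COST) // bcrypt_iterations(NEW_COST)
```

Each credential validation at cost 4 therefore performs roughly 1/64 of the hashing work required at the default cost of 10.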

/usr/illumon/dnd changed to /usr/illumon/coreplus

The installer now uses /usr/illumon/coreplus for packages instead of /usr/illumon/dnd to reflect the updated Core+ name.

Kubernetes Heap Overhead Parameters

When running Deephaven installations in Kubernetes, the originally-implemented JVM overhead properties don't prevent some workers being killed with out-of-memory exceptions.

  • Adding the BinaryStoreWriterV2.allocateDirect=false JVM parameter reduces direct memory usage, which is not counted towards dispatcher heap usage and can result in Kubernetes out-of-memory failures.
  • Adding the -Xms JVM parameter allocates all requested heap at worker creation time, reducing the likelihood of after-startup worker out-of-memory failures from later memory requests.
  • Adding the -XX:+AlwaysPreTouch JVM parameter to workers ensures that all worker heap is touched during startup, avoiding later page-faulting.

The following properties are being added to iris-environment.prop for new installations. Deephaven strongly suggests adding them manually to existing installations.

RemoteProcessingRequestProfile.Xms.G1 GC=$RequestedHeap
RemoteQueryDispatcher.JVMParameters=-XX:+AlwaysPreTouch
BinaryStoreWriterV2.allocateDirect=false

In addition, the property RemoteQueryDispatcher.memoryOverheadMB=500 is being updated in iris-defaults.prop, and this will automatically be picked up when the Kubernetes installation is upgraded.

dhconfig replaces etcd_prop_file

The etcd_prop_file tool has been removed. You must use dhconfig properties import instead.

For recovery or bootstrapping purposes, the --diskprops option can be used in conjunction with the --etcd option to read property files from disk and import new files into etcd.

Dispatcher Memory Reservation

The Remote Query Dispatcher (either db_query_server or db_merge_server) has a configurable amount of heap that can be dispatched to workers, which is controlled by setting the RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB property. Setting this property requires accounting for the other processes that may be running on the machine. If set too high, then workers may fail to allocate memory after being dispatched or the kernel OOM killer may terminate processes. If set too low, then the machine may be underutilized.

As an additional safety check, the Remote Query Dispatcher can query the /proc/meminfo file for available memory. If a user requests more heap than the MemAvailable field indicates can be allocated to a new process, then the Remote Query Dispatcher can reject scheduling the worker.

There are two new properties that control this behavior:

  • RemoteQueryDispatcher.adminReservedAvailableMemoryMB: For users that are members of RemoteQueryDispatcher.adminGroups. By default this is set to 1024MiB.
  • RemoteQueryDispatcher.reservedAvailableMemoryMB: For all other users. By default this is set to 2048MiB.

When set to -1, the additional check is disabled. When set to a non-negative value the dispatcher subtracts the property's value from the available memory, and verifies that the worker heap is less than this value before creating the worker.
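The check described above can be sketched in Python. This is an illustration of the arithmetic, not the dispatcher's implementation. The function names are hypothetical, and the sketch takes the contents of /proc/meminfo as a string so that it can be run on any platform.

```python
import re


def mem_available_mb(meminfo_text: str) -> int:
    """Extract the MemAvailable value (reported in kB) and convert it to MiB."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found in meminfo text")
    return int(match.group(1)) // 1024


def can_dispatch(requested_heap_mb: int, meminfo_text: str, reserved_mb: int) -> bool:
    """Return True if a worker with the requested heap passes the memory check."""
    if reserved_mb == -1:
        return True  # -1 disables the additional check
    # Subtract the reservation from available memory, then compare to the heap.
    return requested_heap_mb < mem_available_mb(meminfo_text) - reserved_mb


SAMPLE_MEMINFO = (
    "MemTotal:       16384000 kB\n"
    "MemFree:         1000000 kB\n"
    "MemAvailable:    8388608 kB\n"
)
```

With SAMPLE_MEMINFO (8192 MiB available), a 7000 MiB worker would be rejected under the default 2048 MiB reservation for regular users, but accepted under the 1024 MiB reservation that applies to members of the admin groups.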

You can examine the current status of these properties using the /config endpoint if RemoteQueryDispatcher.webserver.enabled is set to true. For example, navigate to https://query-host.example.com:8084/config. The available memory and the property values are displayed as an HTML table.

These properties do not guarantee that workers or other processes are not terminated by the OOM killer. Running workers and processes may not have allocated their maximum heap size, and therefore can use system memory beyond what is available at dispatch time.

These properties have no effect on Kubernetes deployments. The dispatcher does not have a complete view of the available cluster resources.

ILLUMON_JAVA is deprecated. Use DH_JAVA instead.

In the past, the version of java to use with Deephaven was specified with ILLUMON_JAVA, which was applied inconsistently.

In this release, you can set DH_JAVA=/path/to/java_to_use/bin/java in your cluster.cnf to tell all Deephaven processes where to find the correct java executable regardless of your PATH.

DH_JAVA works correctly whether you point to a java executable or a java installation directory (like "JAVA_HOME"). Both DH_JAVA=/path/to/java_to_use and DH_JAVA=/path/to/java_to_use/bin/java operate identically.
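The equivalence of the two forms can be sketched in Python. This is only an illustration of the rule stated above, not the logic used by the Deephaven launch scripts. The function name is hypothetical, and the is_dir parameter is there so that the rule can be demonstrated without a real Java installation.

```python
import os
import posixpath
from typing import Callable


def resolve_java(dh_java: str, is_dir: Callable[[str], bool] = os.path.isdir) -> str:
    """Return the java executable path for a DH_JAVA value.

    A directory is treated like JAVA_HOME, so bin/java is appended.
    Anything else is assumed to already be the executable.
    """
    if is_dir(dh_java):
        return posixpath.join(dh_java, "bin", "java")
    return dh_java
```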

If different machines in your cluster have java executables located in different locations, it is your responsibility to set DH_JAVA correctly in /etc/sysconfig/deephaven/cluster.cnf on each machine, or (preferably) to use a symlink so you have a consistent DH_JAVA location on all machines.

Managing Persistent Queries with dhconfig

The dhconfig tool can now manage Persistent Queries using the subcommand pq. For example, you can restart a query with:

dhconfig pq restart --name "Persistent Query Name"

Complete help is available by running dhconfig pq --help. This example demonstrates one of the improvements enabled by using dhconfig: the old controller tool does not handle query names with spaces. All existing operations that were performed by /usr/illumon/latest/bin/iris controller_tool have been integrated into dhconfig, and the old controller tool will be removed in a future release. See the help pages for more details: dhctl controller --help.

Legacy C# Open API Client Removed

Deephaven no longer builds or maintains the Legacy C# Open API client. C# Binary Loggers and the C# SBE client are still supported.

The Excel plugin for Legacy workers which depended on this client is also no longer supported.

Core+ Controller Python Imports

From Core+ Python workers, you may now import Python modules from repositories stored in the controller. To evaluate a single Python file:

import deephaven_enterprise.controller_import

deephaven_enterprise.controller_import.exec_script("script_to_execute.py")

To import a script as a module, you must establish a meta-import with a module prefix for the controller. The following example uses the default value of "controller" to load a module of the form "package1/package2.py" or "package1/package2/__init__.py":

import deephaven_enterprise.controller_import

deephaven_enterprise.controller_import.meta_import()

import controller.package1.package2

Refreshing Local Script Repositories

The Persistent Query Controller defines a set of script repositories that can be used from Persistent Queries or Code Studios. The repositories may be configured to use a remote Git repository or just a path on the local file system. The controller scans the repository on startup for the list of scripts that are available. Previously, only Git repositories could have updates enabled (once per minute); and local repositories would never be rescanned.

You can now set the property PersistentQueryController.scriptUpdateEnabled to true to enable script updates. If this property is not set, then the old PersistentQueryController.useLocalGit property is used (the old property has an inverse sense, meaning PersistentQueryController.useLocalGit=true stops updates and PersistentQueryController.useLocalGit=false permits updates).

To mark a repository as local, the "uri" parameter must be set to empty. For example, if the repository is referred to as "irisrepo" in the iris.scripts.repos property, then to mark the repository as local you would include a property like the following in your iris-environment.prop file:
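The precedence between the new and old properties, including the inverted sense of the old one, can be sketched in Python. This is an illustration only: the function name is hypothetical, and the result when neither property is set is passed in as a parameter because the notes above do not state that default.

```python
from typing import Mapping


def script_updates_enabled(props: Mapping[str, str], default: bool = False) -> bool:
    """Decide whether script updates are enabled from a property mapping."""

    def as_bool(value: str) -> bool:
        return value.strip().lower() == "true"

    new_key = "PersistentQueryController.scriptUpdateEnabled"
    old_key = "PersistentQueryController.useLocalGit"

    if new_key in props:
        return as_bool(props[new_key])  # the new property wins when present
    if old_key in props:
        return not as_bool(props[old_key])  # old property has the inverse sense
    # Assumption: the caller supplies the behavior when neither property is set.
    return default
```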

iris.scripts.repo.irisrepo.uri=

Fixing etcd ACLs that broke after upgrading to URL encodings

Note that the following is only applicable to etcd ACLs.

In 1.20231218.116 and 1.20231218.132, Deephaven began URL encoding ACL keys to prevent special characters like '/' in keys from corrupting the ACL database. Although not all special characters corrupted the database, all of them are encoded, causing the unencoded database to be incompatible with the new version. A common occurrence of this pattern is the "@" character in usernames.

These ACL entries can be fixed using the EtcdAclEncodingTool.

First, back up your etcd database by reading our backup and restore instructions.

To rewrite these ACLs with proper encodings, run the following command as irisadmin:

sudo -u irisadmin /usr/illumon/latest/bin/iris_exec com.illumon.iris.db.v2.permissions.EtcdAclEncodingTool

To see what changes would occur without actually modifying the ACLs, run:

sudo -u irisadmin /usr/illumon/latest/bin/iris_exec com.illumon.iris.db.v2.permissions.EtcdAclEncodingTool -a --dry-run
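To see why previously stored keys stop matching after the encoding change described in this section, the sketch below applies standard percent-encoding to a few sample keys. It assumes ordinary URL percent-encoding with no characters exempted; the exact encoding rules used by Deephaven are not spelled out here, and the sample keys are invented.

```python
from urllib.parse import quote


def encode_key(raw: str) -> str:
    # safe="" forces "/" to be encoded too; quote() leaves it alone by default.
    return quote(raw, safe="")


for raw in ["alice", "alice@example.com", "group/subgroup"]:
    encoded = encode_key(raw)
    status = "changed" if encoded != raw else "unchanged"
    print(f"{raw!r} -> {encoded!r} ({status})")
# 'alice' -> 'alice' (unchanged)
# 'alice@example.com' -> 'alice%40example.com' (changed)
# 'group/subgroup' -> 'group%2Fsubgroup' (changed)
```

Keys that contain only letters and digits are unaffected, which is why only some entries need to be rewritten.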

Setting JVM JIT Compiler Options for Workers

The ability to set the maximum number of allowed JVM JIT compiler threads through the -XX:CICompilerCount JVM option has been added to JVM profiles using properties of the form RemoteProcessingRequestProfile.JitCompilerCount. See the remote processing profiles documentation for further information.

Upgrade etcd to 3.5.12

In past releases, we recommended upgrading etcd to 3.5.5.

It was later discovered that 3.5.5 has a known bug which can break your etcd cluster if you perform an etcdctl password reset.

As such, when upgrading etcd, you should prefer the Deephaven-tested 3.5.12 point release, which is the new default as of version 1.20231218.190.

All newly created systems will have 3.5.12 installed, but for existing systems, you must unpack new etcd binaries yourself.

You can find manual etcd installation instructions in the third-party dependencies guide.

Configurable gRPC Retries

The configuration service now supports using a gRPC service configuration file to configure retries, and one is provided by default for the system.

{
  "methodConfig": [
    {
      "name": [
          {
              "service": "io.deephaven.proto.config.grpc.ConfigApi"
          },
          {
              "service": "io.deephaven.proto.registry.grpc.RegistryApi"
          },
          {
              "service": "io.deephaven.proto.routing.grpc.RoutingApi"
          },
          {
              "service": "io.deephaven.proto.schema.grpc.SchemaApi"
          },
          {
              "service": "io.deephaven.proto.processregistry.grpc.ProcessRegistryApi"
          },
          {
              "service": "io.deephaven.proto.unified.grpc.UnifiedApi"
          }
      ],

      "retryPolicy": {
        "maxAttempts": 60,
        "initialBackoff": "0.5s",
        "maxBackoff": "2s",
        "backoffMultiplier": 2,
        "retryableStatusCodes": [
          "UNAVAILABLE"
        ]
      },

      "waitForReady": true,
      "timeout": "120s"
    }
  ]
}

methodConfig has one or more entries. Each entry has a name section with one or more service/method sections that filter whether the retryPolicy section applies.

If the method is empty or not present, then it applies to all methods of the service. If service is empty, then method must be empty, and this is the default policy.
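The matching rule for a single methodConfig entry can be sketched as below. This is a simplified illustration, not gRPC's own code: the function `name_matches` and the method name `GetValue` are hypothetical, and the sketch does not model how gRPC chooses between several entries that match the same call.

```python
from typing import Iterable, Mapping


def name_matches(names: Iterable[Mapping[str, str]], service: str, method: str) -> bool:
    """Return True if any name entry selects the given service/method."""
    for entry in names:
        entry_service = entry.get("service", "")
        entry_method = entry.get("method", "")
        if not entry_service:
            if entry_method:
                raise ValueError("method must be empty when service is empty")
            return True  # empty service and method: the default policy
        if entry_service == service and (not entry_method or entry_method == method):
            return True  # no method means every method of the service
    return False


names = [{"service": "io.deephaven.proto.config.grpc.ConfigApi"}]
print(name_matches(names, "io.deephaven.proto.config.grpc.ConfigApi", "GetValue"))  # True
print(name_matches(names, "io.deephaven.proto.schema.grpc.SchemaApi", "GetValue"))  # False
```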

The retryPolicy section defines how a failing gRPC call will be retried. In this example, gRPC will retry for just over 1 minute while the status code is UNAVAILABLE (e.g., the service is down). Note that this applies only if the server is up but the individual RPCs are being failed as UNAVAILABLE by the server itself. If the server is down, the status returned is also UNAVAILABLE, but the retryPolicy defined here for the method does not apply; gRPC manages reconnection retries for a channel independently, as described here: https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md

There is no way to configure the parameters for reconnection; see https://github.com/grpc/grpc-java/issues/9353

If the service config file specifies waitForReady, then an RPC executed when the channel is not ready (e.g., the server is down) does not fail right away but waits for the channel to connect. Combined with a timeout, this makes the RPC wait for up to the timeout duration, giving the reconnection policy a chance to bring the channel to the ready state.
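The retryPolicy values in the example file can be turned into a rough delay budget. The sketch below assumes the backoff behavior described in the gRPC retry design, where the delay before each retry is drawn uniformly between zero and a ceiling that starts at initialBackoff, grows by backoffMultiplier, and is capped at maxBackoff. The function name is hypothetical, and the totals exclude the time each attempt itself takes.

```python
from typing import List


def backoff_ceilings(max_attempts: int, initial_backoff: float,
                     max_backoff: float, multiplier: float) -> List[float]:
    """Upper bound, in seconds, of the delay before each retry.

    The first attempt is the original call, so there are max_attempts - 1 retries.
    """
    ceilings = []
    current = initial_backoff
    for _ in range(max_attempts - 1):
        ceilings.append(min(current, max_backoff))
        current = min(current * multiplier, max_backoff)
    return ceilings


ceilings = backoff_ceilings(max_attempts=60, initial_backoff=0.5,
                            max_backoff=2.0, multiplier=2)
worst_case = sum(ceilings)   # every delay at its ceiling
expected = worst_case / 2    # mean of a uniform draw between 0 and the ceiling
print(ceilings[:4])          # [0.5, 1.0, 2.0, 2.0]
print(worst_case, expected)  # 115.5 57.75
```

Under these assumptions the worst-case delay total stays below the 120s timeout in the example file, so the timeout bounds the call only when individual attempts are slow or the channel is not ready.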

For Deephaven processes, customization of service config can be done by (a) copying configuration_service_config.json to /etc/sysconfig/illumon.d/resources and modifying it there, or (b) renaming it and setting property configuration.server.service.config.json.

Note that the property must be set as a JVM argument at launch, because it is used in the gRPC connection that retrieves the initial properties.

Note: The relevant service names are:

io.deephaven.proto.routing.grpc.RoutingApi
io.deephaven.proto.config.grpc.ConfigApi
io.deephaven.proto.registry.grpc.RegistryApi
io.deephaven.proto.schema.grpc.SchemaApi
io.deephaven.proto.unified.grpc.UnifiedApi

Add Core+ Calendar support and allow Java ZoneId strings in Legacy Calendars

Core+ workers can use the Calendars.resourcePath property to load customer-provided business calendars from disk. To use calendars in Core+ workers, any custom calendars on your resource path must be updated to use a standard TimeZone value.

Legacy workers also support using ZoneId strings instead of DBTimeZone values.

Dynamic management of Data Import Server configurations

Creating a new Data Import Server configuration and integrating it into the Deephaven system requires several steps, including required adjustments to the data routing configuration. This final step can now be performed with a few simple commands, and no longer requires editing the data routing configuration file.

dhconfig dis

The dhconfig command has a new action, dis, which supports the import, add, export, list, delete, and validate actions. The commands themselves provide help, and more information can be found in the dhconfig documentation.

dhconfig dis import

Import one or more configurations from one or more files. For example:

/usr/illumon/latest/bin/dhconfig dis import /path/to/kafka.yml

kafka.yml

kafka:
  name: kafka
  endpoint:
    serviceRegistry: registry
    tailerPortDisabled: 'false'
    tableDataPortDisabled: 'false'
  claims:
  - {namespace: Kafka}
  storage: private

dhconfig dis add

Define and import a single configuration on the command line. For example (equivalent to the import example above):

/usr/illumon/latest/bin/dhconfig dis add --name kafka --claim Kafka
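When several similar servers are needed, an import file in the same shape as kafka.yml can be generated rather than written by hand. The sketch below is a plain text template, assuming that the endpoint and storage values from the example are appropriate for every generated server; `render_dis_config` is a hypothetical helper, not part of dhconfig.

```python
TEMPLATE = """\
{name}:
  name: {name}
  endpoint:
    serviceRegistry: registry
    tailerPortDisabled: 'false'
    tableDataPortDisabled: 'false'
  claims:
  - {{namespace: {namespace}}}
  storage: private
"""


def render_dis_config(name: str, namespace: str) -> str:
    """Render one DIS configuration in the layout of the kafka.yml example."""
    return TEMPLATE.format(name=name, namespace=namespace)


# Prints the same content as the kafka.yml example above.
print(render_dis_config("kafka", "Kafka"))
```

The resulting text can be saved to a file and checked with dhconfig dis validate before it is imported.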

dhconfig dis export

Export one or more configurations to one or more files. These files are suitable for the import command. For example, to export all configured Data Import Servers:

/usr/illumon/latest/bin/dhconfig dis export --file /tmp/import_servers.yml

dhconfig dis list

List all configured Data Import Servers. For example:

/usr/illumon/latest/bin/dhconfig dis list
Data import server configurations:
    kafka
    kafka3

dhconfig dis delete

Delete one or more configurations. For example:

/usr/illumon/latest/bin/dhconfig dis delete kafka --force

dhconfig dis validate

Validate one or more configurations. This can validate proposed changes before committing them with the import command. This process verifies that the configuration as a whole will be valid after applying the new changes.

Caveats

"Data routing configuration" comprises the "main" configuration file (managed with dhconfig routing) and additional DIS configurations. The main routing configuration may contain DIS configurations in the dataImportServers section. These two sources of DIS configurations are managed separately and are not permitted to contain duplicates. If you want to manage an existing DIS configuration with the new commands, you must remove it from the main routing configuration.

This functionality will only be useful for querying data if the routing configuration includes "all data import servers" using the dataImportServers keyword. This is usually a source under the db_tdcp table data service:

    db_tdcp:
      host: *localhost
      port: *default-tableDataCacheProxyPort
      sources:
        - name: dataImportServers

A DIS configuration requires storage. The special value private indicates that the server will supply its own storage location. Any other value must be present in the storage section of the routing configuration.
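Two of the caveats above (no duplicates across the two sources, and the storage requirement) lend themselves to a quick pre-check. The sketch below is illustrative only and does not replace dhconfig dis validate; the function name, the argument shapes, and the sample storage names are assumptions.

```python
from typing import Iterable, List, Mapping


def check_dis_config(config: Mapping[str, object],
                     main_routing_dis_names: Iterable[str],
                     routing_storage_names: Iterable[str]) -> List[str]:
    """Return a list of problems found in one DIS configuration."""
    problems = []
    name = config.get("name")
    # The same DIS must not be defined in both sources.
    if name in set(main_routing_dis_names):
        problems.append(f"{name}: also defined in the main routing configuration")
    storage = config.get("storage")
    if storage is None:
        problems.append(f"{name}: storage is required")
    elif storage != "private" and storage not in set(routing_storage_names):
        # Any value other than 'private' must name an existing storage entry.
        problems.append(f"{name}: storage '{storage}' is not in the routing storage section")
    return problems


print(check_dis_config({"name": "kafka", "storage": "private"}, [], ["default"]))  # []
print(check_dis_config({"name": "kafka", "storage": "fast"}, ["kafka"], ["default"]))
```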

Update jgit SshSessionFactory to a more modern/supported version

For our Git integration, we have been using the org.eclipse.jgit package. GitHub discontinued support for SHA-1 RSA SSH keys, but jgit's SSH implementation (com.jcraft:jsch) does not support rsa-sha2 signatures and will not be updated. To enable stronger SSH keys and provide GitHub compatibility, we have configured jgit to use an external SSH executable by setting the GIT_SSH environment variable. The /usr/bin/ssh executable must be present for Git updates.
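A quick way to confirm the requirement above on a given host is to check that the executable exists and is runnable. This is a minimal sketch; `ssh_available` is a hypothetical helper, and it checks only the file's presence and execute permission for the current user, not whether SSH authentication to the Git server succeeds.

```python
import os


def ssh_available(path: str = "/usr/bin/ssh") -> bool:
    """Return True if the given path is an existing, executable file."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if not ssh_available():
    print("/usr/bin/ssh is missing or not executable; Git updates will not work")
```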

Restartable Controller

If the iris_controller process restarts quickly enough, Core+ workers that were already initialized and running normally by the time the controller restarted continue running without interruption. Legacy workers still terminate on controller restart.

  • The duration that workers can survive without the controller is defined by the property PersistentQueryController.etcdPresenceLeaseTtlSeconds, which defaults to 60 (seconds).
  • Only workers that completed initialization and were in the Running state before the controller died, and that should still be running according to their query configuration at the time of the controller restart, continue running.

If the iris_controller is stopped normally (e.g., via monit stop or a regular UNIX TERM signal), the value of the property PersistentQueryController.stopWorkersOnShutdown determines the desired behavior for workers.

  • When set to true, all controller-managed workers are stopped alongside the controller. This is consistent with the traditional behavior.
  • When set to false (the new default), workers do not stop alongside the controller, and have the time defined in the property PersistentQueryController.etcdPresenceLeaseTtlSeconds (defaults to 60 seconds) as a grace period where they wait for the controller to come back.

If the controller crashes (i.e., the iris_controller process is stopped unexpectedly by an exception that crashes the process, a machine reboot, or a UNIX KILL signal), then workers are not proactively stopped even if the value of PersistentQueryController.stopWorkersOnShutdown is true. In this case, the dispatcher terminates those workers after the PersistentQueryController.etcdPresenceLeaseTtlSeconds timeout.

Note: irrespective of the value of the PersistentQueryController.stopWorkersOnShutdown property, if the dispatcher associated with a worker is shut down, the worker stops.
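The rules in this section can be condensed into a small decision function. This is a simplified reading of the text, not controller code: the function and its parameters are hypothetical, and it does not model whether a query should still be running according to its configuration.

```python
def worker_outcome(*, worker_kind: str, was_running: bool, controller_exit: str,
                   stop_workers_on_shutdown: bool, controller_back_within_ttl: bool,
                   dispatcher_up: bool = True) -> str:
    """Return 'keeps running' or 'stopped' for one controller-managed worker.

    worker_kind is 'core+' or 'legacy'; controller_exit is 'normal' or 'crash'.
    """
    if not dispatcher_up:
        return "stopped"  # a worker always stops with its dispatcher
    if worker_kind == "legacy" or not was_running:
        return "stopped"  # only initialized, running Core+ workers survive
    if controller_exit == "normal" and stop_workers_on_shutdown:
        return "stopped"  # traditional behavior on a clean shutdown
    # Crash, or clean shutdown with stopWorkersOnShutdown=false: the worker
    # waits up to etcdPresenceLeaseTtlSeconds for the controller to return.
    return "keeps running" if controller_back_within_ttl else "stopped"


print(worker_outcome(worker_kind="core+", was_running=True, controller_exit="crash",
                     stop_workers_on_shutdown=True, controller_back_within_ttl=True))
# keeps running
```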

Renamed Swing Launcher Archives

The downloadable Swing launcher archives have been renamed as follows:

  • DeephavenLauncherSetup_123.exe is now deephaven-launcher-123.exe
  • DeephavenLauncher_123.tar is now deephaven-launcher-123.tgz

Reliable Barrage table connections

We have added a new library to provide reliable Barrage subscriptions within a Deephaven Core+ cluster. The new tables monitor the state of the source query and gracefully handle disconnection and reconnections without user intervention. This can be used to create reliable meshes of Core+ workers that are fault tolerant to the loss of other queries.

When using ResolveTools, PQ URLs (pq://MyQuery/scope/MyTable?columns=MyFirstColumn,SomeOtherColumn) use these new reliable tables.
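The parts of the example PQ URL can be pulled apart with the Python standard library, which may help when building such URLs programmatically. This sketch only illustrates the URL layout shown above; it is not how ResolveTools parses the URL, and `split_pq_url` and its field names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def split_pq_url(url: str) -> dict:
    """Split a pq:// URL into query name, middle path segment, table, and columns."""
    parts = urlsplit(url)
    if parts.scheme != "pq":
        raise ValueError(f"not a pq:// URL: {url!r}")
    # The path has the form /<segment>/<table>; the segment is 'scope' in the example.
    segment, _, table = parts.path.lstrip("/").partition("/")
    columns = parse_qs(parts.query).get("columns", [""])[0]
    return {
        "query": parts.netloc,
        "segment": segment,
        "table": table,
        "columns": [c for c in columns.split(",") if c],
    }


print(split_pq_url("pq://MyQuery/scope/MyTable?columns=MyFirstColumn,SomeOtherColumn"))
```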

To use this library, see the following Groovy and Python examples:

import io.deephaven.enterprise.remote.RemoteTableBuilder
import io.deephaven.enterprise.remote.SubscriptionOptions

// Subscribe to the columns `MyFirstColumn` and `SomeOtherColumn` of the table `MyTable` from the query `MyQuery` 
table = RemoteTableBuilder.forLocalCluster()
    .queryName("MyQuery")
    .tableName("MyTable")
    .subscribe(SubscriptionOptions.builder()
        .addIncludedColumns("MyFirstColumn", "SomeOtherColumn").build())

from deephaven_enterprise import remote_table as rt

# Subscribe to the columns `MyFirstColumn` and `SomeOtherColumn` of the table `MyTable` from the query `MyQuery`
table = rt.in_local_cluster(query_name="MyQuery", table_name="MyTable").subscribe(
    included_columns=["MyFirstColumn", "SomeOtherColumn"]
)

Connecting to remote clusters

It is also possible to connect to queries on a different Deephaven cluster.

import io.deephaven.enterprise.remote.RemoteTableBuilder
import io.deephaven.enterprise.remote.SubscriptionOptions

table = RemoteTableBuilder.forRemoteCluster("https://other-server.mycompany.com:8000/iris/connection.json")
        .password("user", "password")
        .queryName("MyQuery")
        .tableName("MyTable")
        .subscribe(SubscriptionOptions.builder()
                .addIncludedColumns("MyFirstColumn", "SomeOtherColumn").build())

from deephaven_enterprise import remote_table as rt

# Subscribe to the columns `MyFirstColumn` and `SomeOtherColumn` of the table `MyTable` from the query `MyQuery`
table = rt.for_remote_cluster("https://other-server.mycompany.com:8000/iris/connection.json") \
    .password("username", "password") \
    .query_name("MyQuery") \
    .table_name("MyTable") \
    .subscribe(included_columns=["MyFirstColumn", "SomeOtherColumn"])