Backup and restoration
The Backup and Restore process is used to recover an existing cluster or to copy or transfer content to a different Deephaven deployment.
Four areas of configuration should be backed up. They can be used to restore an existing installation after a failure or error, and to deploy a new cluster that maintains the configuration of an old/existing cluster:
- etcd, which contains most of the Deephaven system configuration.
- Table data:
  - The WorkspaceData table, which contains web dashboards, workspaces, and notebooks.
  - Intraday and historical data
- File system configurations, generally under /etc/sysconfig/deephaven:
  - Process startup configuration
  - Authentication certificates and keys
  - Plugins and custom JARs
- MySQL/MariaDB; if used, this contains ACL information such as Deephaven accounts and permissions in the dbacl_iris database. See migrating ACL storage for instructions on converting from MySQL/MariaDB ACLs to etcd ACLs.
These are the areas that cover the internal configuration of the Deephaven cluster. Of course, any NFS, directory creation, firewall, OS, and/or other changes made to configure the cluster's operation in the customer's environment are also included.
General migration process
The basic migration process is as follows:
- Install the Deephaven software on the new cluster.
- Run the Deephaven backup script on the pre-existing cluster.
- Copy backup files to the new system.
- Restore configuration on the new system.
- Migrate table data, if desired.
If IP addresses, machine names, or machine roles in the new cluster differ from those in the cluster where the backups were taken, configuration values need to be modified, mainly in iris-environment.prop and routing.yaml, and possibly in tailer config XML files. After an etcd restore, this can be done by exporting, editing, and reimporting iris-environment.prop and routing.yaml. Alternatively, if config files were backed up individually, export the corresponding files from the newly installed cluster and diff them against the backup files to see which values should be changed.
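The diff-based comparison described above can be sketched with standard tools. This is a minimal sketch, not a Deephaven utility: the function name compare_configs and both directory arguments are hypothetical, and it assumes the backed-up files and the freshly exported files share the same file names.

```shell
# Sketch: compare backed-up config files with files exported from the new cluster.
# compare_configs and its directory arguments are hypothetical names.

# compare_configs BACKUP_DIR EXPORT_DIR [FILE...]
# Prints a unified diff for each named file found in both directories.
# Returns 0 if every compared file matches, 1 if any differ or are missing.
compare_configs() {
  local backup_dir="$1" export_dir="$2"
  shift 2
  # Default to the two files the text names as most likely to need edits.
  if [ "$#" -eq 0 ]; then
    set -- iris-environment.prop routing.yaml
  fi
  local status=0 f
  for f in "$@"; do
    if [ ! -f "$backup_dir/$f" ] || [ ! -f "$export_dir/$f" ]; then
      echo "skipping $f: not present in both directories" >&2
      status=1
      continue
    fi
    # diff exits non-zero when files differ; record that and keep going.
    diff -u "$backup_dir/$f" "$export_dir/$f" || status=1
  done
  return "$status"
}
```

Lines prefixed with `-` in the output come from the backup and lines prefixed with `+` come from the new cluster's export, so host names and addresses that changed between clusters stand out as paired lines.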
After configuration has been restored or migrated to the new environment, restart all Deephaven services on all nodes.
Deephaven backup script
Deephaven provides a script that backs up configuration to a set of files. Before using the script, create a directory to hold the backups.
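As a sketch of the directory-creation step, the helper below creates a backup directory and restricts it to the owning account. The function name make_backup_dir is hypothetical, and the 700 permission mode is an assumption (chosen because backups can contain ACL data), not a documented Deephaven requirement.

```shell
# make_backup_dir DIR
# Creates DIR (including parent directories) and limits access to its owner.
# Hypothetical helper; the 700 mode is an assumption, not a Deephaven requirement.
make_backup_dir() {
  local dir="$1"
  mkdir -p "$dir" && chmod 700 "$dir"
}
```

Run as the account that will perform the backups, `make_backup_dir /etc/sysconfig/deephaven/backups/system` would prepare the directory the backup script uses by default.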
The backup script is /usr/illumon/latest/bin/backup_deephaven.
Run the script daily, and automate its execution, for example with a cron job. Once the script has completed, copy the resulting files to a secure remote location on another server.
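One way to automate the daily run is a crontab entry. The sketch below only prints a candidate crontab line so it can be reviewed before it is installed; the function name cron_entry and the 02:15 schedule used in the note are illustrative assumptions, and the options passed through are the ones described under "Backup script parameters".

```shell
BACKUP_SCRIPT=/usr/illumon/latest/bin/backup_deephaven

# cron_entry MINUTE HOUR [OPTION...]
# Prints a crontab line that runs the backup script once a day at HOUR:MINUTE.
# Hypothetical helper: it prints the line and does not install it.
cron_entry() {
  local minute="$1" hour="$2"
  shift 2
  # "$*" joins the remaining options with single spaces.
  printf '%s %s * * * %s %s\n' "$minute" "$hour" "$BACKUP_SCRIPT" "$*"
}
```

For instance, `cron_entry 15 2 -e -R 7` prints a line that takes an etcd snapshot at 02:15 each day and prunes week-old backups. The line would belong in the crontab of the account that runs the backups on the relevant node.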
In practice, you usually run the backup script twice:
- backup_deephaven -e must be run on an etcd server node so it can create the etcd snapshot.
- backup_deephaven -A -p -w (and related options for ACLs, Persistent Queries, and WorkspaceData) must be run on a client or application server. Running these options on an etcd server typically fails because /etc/sysconfig/deephaven/illumon.d.latest/dhconfig/clients is not populated there.
Note
The user running the script should either be the Deephaven administration account (typically irisadmin) or have privileges to sudo to this account without a password.
Backup script parameters
The following parameters can be passed into the backup script:
- -d, --directory - Specify the directory in which the backup files will be created. If this is not provided, a default directory of /etc/sysconfig/deephaven/backups/system is used.
- -u, --user - Specify the user account under which the backups will be run. If this is not provided, the system's administration account (usually irisadmin) is used. If the script isn't invoked with this account, the account used to run the tool must be able to sudo to this account.
- -a, --all - Back up everything. This is equivalent to specifying all the individual backup options except --retain.
- -e, --etcd - Back up the entire etcd database into a snapshot file.
- -A, --ACLs - Back up the ACLs.
- -p, --persistentQueries - Back up the Persistent Queries.
- -w, --workspaceData - Back up the WorkspaceData table.
- -s, --schema - Back up the schemas. If there are a lot of schemas, this may take a long time, and the etcd backup may be a better option.
- -P, --propertyFiles - Back up the property files.
- -r, --routing - Back up the routing files (both the primary routing file and any configurations loaded with the dhconfig dis commands).
- -R, --retain - Delete backups this number of days old if this backup is successful.
On a host that has both etcd access and client configuration available, a single invocation can back up the etcd database, ACLs, Persistent Queries, schemas, property files, routing, and workspace data to a specified directory using the system's administration account, and delete backups that are a week old. If run every day, this keeps a rotating set of backups for a week.
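A command line consistent with that description, built only from the options listed under "Backup script parameters", might look like the sketch below. The function name full_backup_cmd is hypothetical, and the function prints the command rather than running it. The -u option is omitted because the administration account is the documented default.

```shell
BACKUP_SCRIPT=/usr/illumon/latest/bin/backup_deephaven

# full_backup_cmd DIR DAYS
# Prints a backup_deephaven command line that selects every backup type
# individually, writes to DIR, and passes DAYS to the retention option.
# Hypothetical helper: it prints the command and does not execute it.
full_backup_cmd() {
  local dir="$1" days="$2"
  echo "$BACKUP_SCRIPT -d $dir -e -A -p -s -P -r -w -R $days"
}

# Show the command for the default backup directory with week-long retention.
full_backup_cmd /etc/sysconfig/deephaven/backups/system 7
```

According to the parameter list, -a stands in for the individual backup options, so `-d <directory> -a -R 7` should be an equivalent shorter form.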
In many deployments, you approximate --all by running backup_deephaven -e on an etcd server and backup_deephaven -A -p -w -s -P -r on a client or application server, as described above.
Restoration from backup
The configuration restore section discusses restoration for each configuration type.
Reverting an upgrade
Deephaven does not support downgrading after upgrading. Version upgrades may include changes to etcd data formats, configuration schema, or internal data formats that are not backward-compatible with the previous version.
If a critical issue is discovered after upgrading, the options are:
- Resolve the issue in the upgraded version — This is the recommended path. Contact Deephaven support for assistance.
- Restore from a pre-upgrade backup — The only supported path to reverting an upgrade. This returns the system to its pre-upgrade state, but discards all changes made since the backup was taken.
Caution
Switching back to an older version binary without restoring configuration from a pre-upgrade backup is not supported. Running an older version against state that has been modified by a newer version may result in data corruption, service failures, or a permanently unrecoverable state.
What a restore discards
Restoring from a pre-upgrade backup is a destructive operation. All changes made after the backup was taken are lost, including:
- Persistent Queries created or modified after the upgrade
- Schema updates
- Data routing changes
- Property file changes
- Accounts and permissions changes
- Intraday and historical data written after the upgrade, unless it was separately backed up and restored through a supported procedure
- Web UI dashboards and notebooks created or modified after the upgrade
Note
Intraday and historical data written after the upgrade may or may not be readable by the older version, depending on whether a format-breaking change occurred between versions. Data in Parquet or Iceberg format is more likely to be affected. Data in Deephaven's native format is rarely affected. Regardless of format, any schema changes made after the upgrade will be lost and may need to be recreated or reimported.
Restore process
To revert to a pre-upgrade state:
- Stop all Deephaven services on all nodes.
- Reinstall the previous version of Deephaven.
- Restore etcd, configuration, and ACLs from the pre-upgrade backup. See Configuration properties and etcd and ACLs.
- Restart all Deephaven services on all nodes.
- Validate system functionality — test key workflows and confirm that Persistent Queries and data access are working as expected.
Caution
Reinstalling an older version on top of a newer version may be treated as an upgrade by the installer. If the installer encounters changes it cannot revert, the reinstall may fail, leaving the system in an inconsistent state. If this happens, the options are:
- Resolve the issue in the upgraded version with assistance from Deephaven support.
- Manually remove Deephaven with assistance from Deephaven support, then reinstall.
- Reformat or replace affected nodes and perform fresh installs. Before reformatting, ensure that intraday, user, and historical data files are preserved.
Communicate the scope of data loss to users before proceeding. If intraday or historical data written after the upgrade must be preserved, consult Deephaven support before attempting a revert.