Backup, restoration, and migration
There are four areas of configuration that need to be backed up and restored in order to deploy a new cluster that maintains the configuration of an old/existing cluster:
- MySQL/MariaDB - If used, this contains ACL information such as Deephaven accounts and permissions in the dbacl_iris database.
- etcd - which contains most of the Deephaven system configuration, including:
  - Persistent queries
  - Schemata
  - Data routing YAML
  - Properties files
  - ACLs, if MySQL/MariaDB is not being used for this purpose
- The WorkspaceData table - which contains Web dashboards, workspaces, and notebooks.
- File system files - these are under /etc/sysconfig/deephaven:
  - Host config (if it has been modified to change heap settings for processes, etc.): /etc/sysconfig/deephaven/illumon.iris.hostconfig
  - Tailer config XML (if a custom one has been created)
  - Any custom certificate files needed for authentication with external services such as LDAP: /etc/sysconfig/deephaven/illumon.d.latest/resources/
  - Any custom jars: /etc/sysconfig/deephaven/illumon.d.latest/java_lib/
  - From the auth server node, dsakeys.txt in /etc/sysconfig/deephaven/auth/, to capture any key-based authentication that has been set up for users. In addition, if this will be replaced from backup on the new system, the *.base64.txt key files from /etc/sysconfig/deephaven/auth should also be backed up.
  - It is a good safety measure to back up all of /etc/sysconfig/deephaven/auth to ensure that keys and passphrase files are available if they are later needed for restore operations.
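The file-system items above could be collected into a single archive. The following is a minimal sketch, not part of the Deephaven tooling: the function name backup_dh_files, the archive name, and the use of tar are assumptions, while the paths are the ones listed above. The tailer config XML is not included because its location depends on the installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Deephaven tool): archive the file-system
# configuration listed above. Paths are taken relative to the root in $1.
backup_dh_files() {
  local root="$1" out="$2"
  local candidates=(
    etc/sysconfig/deephaven/illumon.iris.hostconfig
    etc/sysconfig/deephaven/illumon.d.latest/resources
    etc/sysconfig/deephaven/illumon.d.latest/java_lib
    etc/sysconfig/deephaven/auth
  )
  local existing=() p
  for p in "${candidates[@]}"; do
    # Skip items that are absent on this node (for example, no custom jars).
    [ -e "$root/$p" ] && existing+=("$p")
  done
  [ "${#existing[@]}" -gt 0 ] || { echo "nothing to back up" >&2; return 1; }
  # tar records modes and owners; extract with -p as root to restore them.
  tar -czf "$out" -C "$root" "${existing[@]}"
}
# Example (run as root): backup_dh_files / /root/dh-config-backup.tar.gz
```

Running this as root is assumed, since the auth directory is normally not world-readable.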
These areas cover the internal configuration of the Deephaven cluster. Separately, any NFS, directory creation, firewall, OS, or other changes that were made to run the cluster in the customer's environment also need to be accounted for.
Restore versus content migration
Restore is used to recreate an existing cluster on new hardware. Content migration is instead used to transfer content from one installation to another. When migrating content, system configuration entries like property files and routing are usually not transferred, or specific properties may be transferred on a case-by-case basis.
For migration, the items that are needed, in order, are typically:
- The dbacl_iris MySQL/MariaDB database with users and permissions, or ACLs from etcd
- Table schemata
- Persistent queries
- Web Workspaces and Dashboards
General restore process
The basic restore process is to first install a new cluster, and then to restore backed up configuration to this new cluster.
If IP addresses, machine names, or machine roles in the new cluster differ from those in the cluster where the backups were taken, configuration values (mainly in iris-environment.prop, routing.yaml, and possibly tailer config XML files) will need to be modified. After an etcd restore, this can be done by exporting, editing, and reimporting iris-environment.prop and routing.yaml. Alternatively, if config files have been backed up individually, it may be desirable to export the new files after the new cluster has been installed and diff them against the backup files to see which values should be changed.
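The comparison between backed-up files and freshly exported files can be sketched as below. The function name diff_config and both directory arguments are hypothetical; only the two file names come from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show how backed-up configuration differs from the
# files exported from the freshly installed cluster.
diff_config() {
  local backup_dir="$1" new_dir="$2" f status=0
  for f in iris-environment.prop routing.yaml; do
    if [ -f "$backup_dir/$f" ] && [ -f "$new_dir/$f" ]; then
      echo "== $f =="
      # diff exits non-zero when the files differ; record that and continue.
      diff -u "$backup_dir/$f" "$new_dir/$f" || status=1
    else
      echo "== $f == (missing in one directory, skipped)"
    fi
  done
  return "$status"
}
# Example: diff_config /path/to/backup/config /path/to/new/export
```

Lines prefixed with "-" come from the backup and lines prefixed with "+" from the new cluster, so host names and addresses that need attention tend to stand out.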
After configuration has been restored or migrated to the new environment, all Deephaven services on all nodes should be restarted:
/usr/illumon/latest/dh_monit restart all
ACL DB backup and restore
There are two options for storage of ACL data: MySQL/MariaDB or etcd.
In systems that use MySQL/MariaDB, the dbacl_iris database stores Deephaven logins and groups and associated permissions for Deephaven objects. This database can be backed up and restored using MySQL native backup and restore tools.
Backup with:
mysqldump --user [user_name] --password=[password] --databases dbacl_iris > [backup_file]
Restore with:
mysql --user [user_name] --password=[password] -e "drop database if exists dbacl_iris"
mysql --user [user_name] --password=[password] -e "create database dbacl_iris"
mysql --user [user_name] --password=[password] dbacl_iris < [backup_file]
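Because the restore begins by dropping dbacl_iris, it may be worth checking that the backup file is complete first. The sketch below assumes mysqldump ran with comments enabled (its default), in which case the dump ends with a "-- Dump completed" line. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pre-restore check: fail if the dump file looks truncated.
# Assumes mysqldump comments were not disabled (e.g. with --skip-comments).
dump_looks_complete() {
  local backup_file="$1"
  # An empty or missing file is never a usable backup.
  [ -s "$backup_file" ] || return 1
  # The completion marker is written at the very end of a successful dump.
  tail -n 5 "$backup_file" | grep -q '^-- Dump completed'
}
# Example: dump_looks_complete /path/to/backup.sql && echo "safe to restore"
```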
Systems that use etcd for ACL data store this information in key-value pairs in the etcd data store. This entire data store can be backed up or restored with etcd tools. The details of this are covered in the section below.
In addition, the IrisDbUserMod tool, which allows command line manipulation of ACL data, can be used to export from or import to either of the ACL storage back ends. IrisDbUserMod is documented in the editing acls topic.
To export and import ACLs with IrisDbUserMod:
Export:
sudo -u irisadmin /usr/illumon/latest/bin/iris iris_db_user_mod -export_all_acls -acl_file /tmp/my-acl-file.xml
Import:
sudo -u irisadmin /usr/illumon/latest/bin/iris iris_db_user_mod -import_all_acls -acl_file /tmp/my-acl-file.xml
ACL files for export and import can be .xml or .json. The options -overwrite_existing or -replace_existing will often be used with -import_all_acls to direct the system on how to handle the data being imported with respect to any existing data. These options are covered in more detail in the editing acls topic.
etcd / properties files backup and restore
etcd can be backed up and restored using etcd's own tools.
Backup can be done with:
DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root /usr/illumon/latest/bin/etcdctl.sh snapshot save [file_to_save_to]
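For regular snapshots it can help to give each file a timestamped name and to confirm that something was written. This is a sketch only: the function name, the file naming scheme, and the DRY_RUN switch are assumptions layered on the command above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: save an etcd snapshot under a timestamped name.
# Set DRY_RUN=1 to print the command instead of running it.
etcd_snapshot() {
  local out_dir="$1" file
  file="$out_dir/etcd_$(date +%Y%m%d_%H%M%S).snapshot"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root /usr/illumon/latest/bin/etcdctl.sh snapshot save $file"
    return 0
  fi
  DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root \
    /usr/illumon/latest/bin/etcdctl.sh snapshot save "$file" || return 1
  # An empty or missing file means the snapshot cannot be trusted.
  [ -s "$file" ] || { echo "snapshot $file is empty" >&2; return 1; }
  echo "$file"
}
# Example: DRY_RUN=1 etcd_snapshot /db/TempFiles/irisadmin/backup
```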
Restore is covered in the etcd documentation, under the topic: "Restart cluster from majority failure". For Deephaven, the etcd restore process would be:
- Install the new Deephaven cluster.
- Connect to the etcd node where restore will be processed.
- Get a list of etcd nodes in the cluster:
  DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root /usr/illumon/latest/bin/etcdctl.sh endpoint status -w table
- Force the local node to be the leader of the cluster:
  DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root /usr/illumon/latest/bin/etcdctl.sh move-leader
- Remove the other nodes from the cluster until only the local node remains:
  DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root /usr/illumon/latest/bin/etcdctl.sh member remove
- Restore the etcd database:
  DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/root /usr/illumon/latest/bin/etcdctl.sh snapshot restore
- Follow the rest of the steps in the etcd.io article named above for adding the other nodes back to the cluster.
Alternatively, the data contained in etcd can be exported as separate files and imported into the new cluster.
Note
These processes and tools are documented in detail in the core Deephaven documentation:
- Persistent Queries
  - Backup with:
    sudo -u irisadmin /usr/illumon/latest/bin/iris controller_tool --export
  - Restore with:
    sudo -u irisadmin /usr/illumon/latest/bin/iris controller_tool --import
- Schema
  - Backup with:
    sudo -u irisadmin /usr/illumon/latest/bin/dhconfig schema export -d <path to existing directory where schema files should be written>
  - Restore with:
    sudo -u irisadmin /usr/illumon/latest/bin/dhconfig schema import -d <path to directory created by export>
- Routing
  - Backup with:
    sudo -u irisadmin /usr/illumon/latest/bin/dhconfig routing export -f <routing file to write>
  - Restore with:
    sudo -u irisadmin /usr/illumon/latest/bin/dhconfig routing import -f <routing file to read>
- Property files
  - Backup with:
    sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export -d <pre-existing directory where property files should be written>
  - Restore with:
    sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import -f <file to import>
The persistent query export file gets written to: /db/TempFiles/irisadmin/controller_tool/controllerToolExport.xml. See the full controller tool documentation.
iris-defaults.prop should not be imported into the new system, regardless of whether performing a migration or restore. In general, properties files should not be imported during a migration.
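The individual export commands can be run together from one script. The wrapper below is a sketch: the function names, the output directory layout (schema/, properties/, routing.yaml), and the DRY_RUN switch are assumptions, while the commands and flags are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the export commands listed above.
# Set DRY_RUN=1 to print the commands instead of running them.
DH_BIN=/usr/illumon/latest/bin

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

export_dh_config() {
  local out_dir="$1"
  # The schema and properties exports expect pre-existing directories.
  run sudo -u irisadmin mkdir -p "$out_dir/schema" "$out_dir/properties" || return 1
  # Persistent queries are written to the controller tool's own export file.
  run sudo -u irisadmin "$DH_BIN/iris" controller_tool --export || return 1
  run sudo -u irisadmin "$DH_BIN/dhconfig" schema export -d "$out_dir/schema" || return 1
  run sudo -u irisadmin "$DH_BIN/dhconfig" routing export -f "$out_dir/routing.yaml" || return 1
  run sudo -u irisadmin "$DH_BIN/dhconfig" properties export -d "$out_dir/properties" || return 1
}
# Example: DRY_RUN=1 export_dh_config /db/TempFiles/irisadmin/backup
```

Running with DRY_RUN=1 first shows exactly which commands would be executed before anything is touched.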
Dashboard backup and restore
Dashboards and other Web environment data are stored in the WorkspaceData table.
It can be backed up by copying the entire /db/Intraday/DbInternal/WorkspaceData directory structure, or by exporting data to CSV using the WorkspaceDataTool:
sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool -c -t <type> -f <file to write>
The two types that need to be exported are Workspace and Dashboard, each as a separate action.
Restore is accomplished by restoring the entire /db/Intraday/DbInternal/WorkspaceData directory structure, or by importing CSV data with the WorkspaceDataTool:
sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool -rc -f <file to read>
See the full WorkspaceDataTool documentation.
If copying the WorkspaceData directory structure, ensure that ownership and permissions are preserved during the copying process.
Making data available to the new system
Environment-specific data files are stored under /db/Systems, /db/Users, /db/Intraday/<Namespace>, and /db/IntradayUser.
Typically /db/Users and /db/Systems are shared across nodes using NFS or some other similar type of shared storage. Attaching (or reattaching) such shared storage to the new servers will make historical and user data available to the new cluster.
/db/Intraday has subdirectories for each namespace. The DbInternal namespace is reserved for Deephaven internal operations. This directory structure should not be copied between installs, but other namespace subdirectories should be copied to the new ingestion server to make intraday data available to the DIS and LTDS processes in the new cluster. Similarly, /db/IntradayUser should be copied to the ingestion or user intraday server in the new cluster.
Ensure that file and directory ownership and permissions are preserved when copying data directories between servers. Note that there is installation-specific information under /db/TempFiles which must not be copied between installations.
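Copying the namespace directories while leaving out DbInternal can be sketched as follows. The function name is hypothetical, and the sketch assumes both trees are reachable as local paths (for example, the old /db/Intraday mounted on the new server); a copy over the network would need a different transport.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every namespace directory under an intraday root
# except DbInternal, keeping permissions (and ownership when run as root).
copy_intraday_namespaces() {
  local src="$1" dest="$2" ns_dir ns
  mkdir -p "$dest" || return 1
  for ns_dir in "$src"/*/; do
    [ -d "$ns_dir" ] || continue   # nothing matched the glob
    ns=$(basename "$ns_dir")
    # DbInternal is reserved for Deephaven internal operations; skip it.
    [ "$ns" = "DbInternal" ] && continue
    # A tar pipe carries modes and owners; -p applies the modes on extract.
    # Only the extracting tar's exit status is checked here.
    tar -C "$src" -cf - "$ns" | tar -C "$dest" -xpf - || return 1
  done
  return 0
}
# Example (as root): copy_intraday_namespaces /mnt/old/db/Intraday /db/Intraday
```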
Deephaven backup script
Deephaven provides a script that can be used to back up most of the above items to a set of files. We recommend that this script be run every day and the resulting files be copied to a secure remote location.
To use the script, first, create a directory to hold the backups. For example:
sudo -u irisadmin mkdir /db/TempFiles/irisadmin/backup
Automate the daily run, for example with a cron job. Once the script has completed, copy the resulting files to another server.
Note
The user running the script should either be the Deephaven administration account (typically irisadmin), or have privileges to sudo to this account without a password.
Backup script parameters
The following parameters can be passed into the backup script.
-d, --directory - Specify the directory in which the backup files will be created. If this is not provided, a default directory of /etc/sysconfig/deephaven/backups/system will be used.
-u, --user - Specify the user account under which the backups will be run. If this is not provided, the system's iris administration account (usually irisadmin) will be used. If the script isn't invoked with this account, the account used to run the tool must be able to sudo to this account.
-a, --all - Back up everything. This is equivalent to specifying all the individual backup options except --retain.
-e, --etcd - Back up the entire etcd database into a snapshot file.
-A, --ACLs - Back up the ACLs.
-p, --persistentQueries - Back up the persistent queries.
-w, --workspaceData - Back up the WorkspaceData table.
-s, --schema - Back up the schemas. If there are many schemas, this may take a long time, and the etcd backup may be a better option.
-P, --propertyFiles - Back up the property files.
-r, --routing - Back up the routing file.
-R, --retain - Delete backups this number of days old if this backup is successful.
This example will create backups for the etcd database, ACLs, persistent queries, schemas, property files, routing, and workspace data in the /db/backups directory, using the system's iris administration account, and delete backups that are a week old. If run every day, this will keep a rotating set of backups for a week.
/usr/illumon/latest/bin/backup_deephaven --directory /db/backups --all --retain 7
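For a scheduled run, the command above could be wrapped so that failures are logged and reported through the exit status. This is a sketch under assumptions: the wrapper name, the log location, the schedule in the comment, and the environment-variable overrides are all hypothetical. The account running it must meet the requirements in the note above.

```shell
#!/usr/bin/env bash
# Hypothetical nightly wrapper, intended to be called from cron, e.g.:
#   0 2 * * * /usr/local/bin/dh_nightly_backup.sh
# (script path and schedule are assumptions, not Deephaven defaults)
BACKUP_CMD="${BACKUP_CMD:-/usr/illumon/latest/bin/backup_deephaven}"
BACKUP_DIR="${BACKUP_DIR:-/db/backups}"
LOG_FILE="${LOG_FILE:-/var/tmp/dh_backup.log}"

nightly_backup() {
  # Same options as the example above: back up everything, keep 7 days.
  if "$BACKUP_CMD" --directory "$BACKUP_DIR" --all --retain 7 >>"$LOG_FILE" 2>&1; then
    echo "backup OK, log: $LOG_FILE"
  else
    echo "backup FAILED, see $LOG_FILE" >&2
    return 1
  fi
}
# In the cron-invoked script, finish with: nightly_backup || exit 1
```

Copying the finished backup files to a remote location would follow as a separate step, using whatever transfer mechanism the site already relies on.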
Restoration from the script's backups
This section includes examples that can be used to restore to a target system, whether that target is a new server or the one on which the backups were created.
Note
Particularly if restoring on a different server, it is best practice to perform a backup on the target system before running the restoration commands.
ACLs
To restore the ACLs from the backup, use the following command.
Caution
This command will delete all existing ACLs on the target system and replace them with the ones from the file. For further detail on the iris_db_user_mod tool, see the editing acls topic.
sudo -u irisadmin /usr/illumon/latest/bin/iris iris_db_user_mod -import_all_acls \
-replace_existing \
-acl_file <ACL XML file>
Persistent Queries
The simplest way to import the persistent queries to a target server (assuming that you don't want to keep any persistent queries on that server) is to delete all the persistent queries on the target system using the Query Config panel in the console or the Query Monitor tab in the web, and then run the following command:
sudo -u irisadmin /usr/illumon/latest/bin/iris controller_tool -i \
--retainSerial=keep \
--includeNonDisplayable=true \
--xmlFile=<PQ XML file>
See the full controller tool documentation for more details on additional options. In particular:
- If you're restoring to a different Deephaven installation, the --serverOverride option may be useful.
- If you want to include temporary persistent queries, add the --includeTemporary=true parameter. Temporary persistent queries will be re-run when they're added. Since temporary persistent queries are typically used for one-time tasks, they are not imported by default.
- If you don't want to re-import the helper queries, then don't delete them and leave off the --includeNonDisplayable=true parameter.
Workspace Data
The workspace data table is backed up into four separate CSV files (one for each type of data contained in the table), and each requires its own import on the target system.
You may want to delete the existing data so that today's data is completely new. Don't do this if you need to keep existing workspace data on the target system.
sudo -u irisadmin /usr/illumon/latest/bin/dhctl intraday truncate --partitions DbInternal.WorkspaceData.<today's date>
sudo -u irisadmin /usr/illumon/latest/bin/dhctl intraday delete --partitions DbInternal.WorkspaceData.<today's date>
Historical data for the WorkspaceData table is typically stored in /db/Systems/DbInternal/Partitions/*/*/WorkspaceData, which may not be writable. The exact directory layout will depend on the partitioning configured by the system administrator.
To import workspaces:
sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool --readCsv --file <backup file>.workspace.csv
To import dashboards:
sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool --readCsv --file <backup file>.dashboard.csv
To import draft queries:
sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool --readCsv --file <backup file>.draftQuery.csv
To import dashboard overrides:
sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool --readCsv --file <backup file>.dashboardOverride.csv
Once the desired steps are finished, restart the WebClientData persistent query.
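The four imports above differ only in the file suffix, so they can be scripted as a loop. This is a sketch: the function name, the DRY_RUN switch, and the choice to skip missing files are assumptions; the command and the four suffixes are the ones shown above.

```shell
#!/usr/bin/env bash
# Hypothetical loop over the four CSV files written by the backup script.
# $1 is the common prefix, i.e. the "<backup file>" part of each file name.
# Set DRY_RUN=1 to print the commands instead of running them.
import_workspace_csvs() {
  local prefix="$1" kind file
  for kind in workspace dashboard draftQuery dashboardOverride; do
    file="$prefix.$kind.csv"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool --readCsv --file $file"
      continue
    fi
    # Skip a type that was not backed up rather than failing the whole run.
    [ -f "$file" ] || { echo "skipping missing $file" >&2; continue; }
    sudo -u irisadmin /usr/illumon/latest/bin/iris workspace_data_tool \
      --readCsv --file "$file" || return 1
  done
  return 0
}
# Example: DRY_RUN=1 import_workspace_csvs /db/backups/<backup file>
```

The WebClientData persistent query still needs to be restarted afterwards, as described above.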
Schema files
Schema files are placed into a tar file, so the first step is to create a directory to contain them and to un-tar the file on the target system. For example:
TEMP_SCHEMA_DIR=$(sudo -u irisadmin mktemp -d -t schemas.XXXXXX)
sudo -u irisadmin tar -xvf <tar file> -C ${TEMP_SCHEMA_DIR}
Then you can import the schema files. To import all the schemas in the directory:
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig schema import --directory ${TEMP_SCHEMA_DIR}
See the full dhconfig schema documentation for more details on how to import schema files. In particular:
- The --namespace option can restrict the imports to specific namespaces.
- The --update option will overwrite any schemas that already exist on the target system with the versions from this directory.
Property files
Property files are placed into a tar file, so the first step is to create a directory to contain them and to un-tar the file on the target system. For example:
TEMP_PROP_DIR=$(sudo -u irisadmin mktemp -d -t props.XXXXXX)
sudo -u irisadmin tar -xvf <tar file> -C ${TEMP_PROP_DIR}
Then you can import the property files.
Note
If property files are being imported to a different server, you will need to edit them to ensure that host names are updated.
To import all the property files in the directory:
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --directory ${TEMP_PROP_DIR} <list of property files to import>
Routing file
To restore routing from the backup file, use the following command, specifying the routing backup file.
Note
If the routing file is being imported to a different server, it will need to be edited to update host names.
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig routing import \
-f <routing YAML file>
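When the target is a different server, host names in an unpacked property file or in the routing file need updating before the import. One way to do that is sketched below. The function name and the .orig backup suffix are assumptions, and the match is a plain substring match, so a host name that is a prefix of another host name would also be rewritten.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace one host name with another in a config file,
# keeping the original as <file>.orig. Dots in the old name are escaped so
# that they match literally rather than as regex wildcards.
replace_host() {
  local file="$1" old="$2" new="$3" old_re
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  cp -p "$file" "$file.orig" || return 1
  # Write back into the existing file so its ownership and mode are kept.
  sed "s/$old_re/$new/g" "$file.orig" > "$file"
}
# Example: replace_host routing.yaml old-dis.example.com new-dis.example.com
```

Reviewing the result with diff against the .orig copy before importing is a cheap safeguard.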
etcd
Follow the directions in the "etcd / properties files backup and restore" section above.