Kubernetes installation with Helm
To install Deephaven on Kubernetes, we use a Helm chart. Deephaven has tested this Helm chart on GKE, AKS and EKS (see Amazon Load Balancers). The Deephaven Helm installation has three prerequisites:
- An etcd cluster (which can be installed using a bitnami helm chart).
- An NFS server for various shared Deephaven volumes.
- A TLS certificate for the Envoy front proxy that serves web traffic.
And one optional prerequisite:
- Install cert-manager in the cluster to handle issuing certificates to allow the Deephaven services to communicate using TLS.
Each Deephaven cluster should be in its own Kubernetes namespace. The etcd installation must be in that same namespace so that we can read the root passphrase from the secret. The NFS server need not be in the same namespace, or even inside of the Kubernetes cluster, but it needs to be accessible from all the pods and have a defined set of exports.
Although this chart depends on NFS, it can be adapted to any persistent volume that provides an accessMode of ReadWriteMany.
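As a sketch only, a statically provisioned, NFS-backed PersistentVolume offering ReadWriteMany might look like the following; the volume name, capacity, server, and path are illustrative placeholders (the server and path shown echo the defaults used elsewhere in this guide), not values required by the chart.

# Illustrative only: an NFS-backed PersistentVolume that provides the
# ReadWriteMany access mode expected for the shared Deephaven volumes.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: deephaven-shared-example        # hypothetical name
spec:
  capacity:
    storage: 100Gi                      # size as appropriate for your data
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: deephaven-nfs.<your-namespace>.svc.cluster.local
    path: /exports/dhsystem
EOF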
Note
We have chosen not to integrate the etcd installation with the installation of Deephaven at this time. By decoupling the charts, Deephaven can be installed and uninstalled while retaining the configuration.
General directory layout
When you extract the Helm archive provided to you (e.g., tar xzf ./deephaven-helm-1.20231218.432.tar.gz), the root extracted directory will have the following:
- ./docker contains subdirectories for Docker container images, each with a Dockerfile, and container image support scripts.
- ./helm contains the Helm chart and related supporting files.
- ./dh_helm is a wrapper script that automates the steps required to install or uninstall a Deephaven cluster in Kubernetes (see this section).
- ./README.md provides basic identification of these files and information about where to find this detailed documentation.
Within the helm directory there are:
- deephaven contains the Helm chart.
- deephaven/templates contains subdirectories with the Helm chart templates that define the Kubernetes objects (pods, services, etc.) for the Deephaven installation.
- deephaven/values.yaml contains the default values for the Helm chart.
- setupTools contains useful scripts and YAML for manipulating the system outside of the chart. See the table below.

File | Description |
---|---|
nfs-server.yaml | Creates an NFS server for use with your cluster; you can adjust volume sizes as appropriate. |
nfs-service.yaml | Creates a service for the NFS server; you will need to use the name in your cluster's YAML. |
etcdValues.yaml | Suitable Helm values file for an etcd installation. |
scaleAll.sh | Scale deployments up and down. The argument is the number of replicas (0 to shutdown, 1 to start). |
restartAll.sh | Restarts all deployments by scaling them down to 0, then up to 1. |
delete-preinstall.sh | Deletes preinstall hook resources (hooks are not automatically deleted when a release is uninstalled). |
ca-bootstrap-issuer.yaml | Used only if cert-manager is installed in the cluster. Creates a ClusterIssuer and a self-signed root CA certificate, and an Issuer in the namespace that will issue new certificates with the root CA in the certificate chain. This file may be copied and used as a template for defining an issuer that is appropriate for your cluster. |
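For example, the scale script from the table above can be used to stop and restart all Deephaven deployments. This is a sketch; it assumes you are running from the helm directory, and the only argument is the replica count described above.

# Scale every Deephaven deployment down to 0 replicas (shut down) ...
./setupTools/scaleAll.sh 0
# ... and back up to 1 replica each (start).
./setupTools/scaleAll.sh 1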
Set up your own cluster using the dh_helm tool
The dh_helm tool automates the process of installing, upgrading, uninstalling, and reinstalling a Deephaven Kubernetes cluster. This utility allows Deephaven Kubernetes installs to be a one-line command rather than a series of manual steps. By default, for installations and upgrades, the tool checks whether the needed product and Core+ files are already in the needed paths; if they are not, it checks for them in the current directory and copies them to the needed locations.
Typical installation use is to copy the Deephaven product file (deephaven-enterprise-<jdk version>-<version>.tar.gz) and the Deephaven Core+ file (deephaven-coreplus-<version>.tar) to the directory where dh_helm is located, and run dh_helm from there.
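A typical sequence looks like the following sketch; the file names and version numbers are illustrative, so substitute the archives you were provided.

# Copy the Deephaven product and Core+ archives next to the dh_helm script ...
cp ~/Downloads/deephaven-enterprise-jdk17-1.20231218.432.tar.gz .
cp ~/Downloads/deephaven-coreplus-0.33.6-1.20231218.432-jdk17.tgz .
# ... then run dh_helm from this directory; --help lists all options
# (see the example command lines below for full install invocations).
./dh_helm --help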
- For installations, the tool manages such steps as deploying etcd, deploying and configuring an NFS pod, creating the TLS secret, and other steps beyond the helm install step itself. Helm installation of Deephaven requires Docker images for the various container types used in the cluster. The buildAllForK8s.sh and pushAll.sh scripts, described here, can be used to do this manually, or --build-push-images can be passed to dh_helm for the script to do it automatically.
Note
Docker or Docker Desktop is needed to build images. Additional components may be needed to build images on a Mac using Apple Silicon.
- For uninstallation, the tool offers options to completely remove all Deephaven-related artifacts (PVs, secrets, certificates, etc.) as a single step. (Note that this does not remove Docker container images from the local or remote registries.)
- Reinstallation is a complete uninstall followed by a fresh installation as a single command execution. This ensures the new installation is totally fresh, with no reused data or configuration.
- For upgrades, the tool can automate steps such as running upgrade scripts or deleting the management-shell pod so it can be recreated with new standard settings.
If the --values-yaml argument is provided, then the specified values file is passed to Helm. If --values-yaml is not specified, then dh_helm automatically creates the values file using helm/setupTools/my-values.tmpl as the basis for the generated values file. In this case, if there are other customizations to be added that dh_helm does not support, these changes should be made in my-values.tmpl before running dh_helm. An error will be thrown if --values-yaml is used along with explicit arguments for values it can contain (--image-tag, --nfs-server, --pv-prefix, --etcd-release-name, --cluster-url, --container-registry, --storage-class, and whether to use the cert manager).
The dh_helm tool has a fairly large set of argument options. These are also detailed by running dh_helm with no arguments, or with --help as an argument.
Minimum required arguments (for uninstall) are:
Argument | Description |
---|---|
--namespace | Kubernetes namespace to install into / uninstall from. |
--name | Release name for the Helm installation of Deephaven (existing names can be found with helm list ). |
--etcd-release-name | Release name for the Helm installation of etcd (existing names can be found with helm list ). |
Installation, reinstallation, or upgrade also require:
Argument | Description | Notes |
---|---|---|
--dh-version | Deephaven product version such as 1.20231218.160. | |
--jdk-version | Java JDK version to use. | Must match the Java version of the Deephaven product and Core+ packages, and be one of jdk11 or jdk17 (case sensitive). |
--coreplus-version | Core+ tar file version such as 0.28.1. | |
--container-registry | The registry in which to find images, and where built images will be pushed. | |
--tls-cert | The PEM format X509 certificate to use for the Envoy endpoint of the cluster. | Note that, for private PKI certificates, this should include the full chain. |
--tls-key | The key file that corresponds to --tls-cert . | Note that there must be no password on this file. |
Installation values yaml properties can be provided either with arguments here, or by providing a custom values yaml file:
Argument | Description | Notes |
---|---|---|
--values-yaml | Path and name of customized yaml values files. | This allows more flexibility than dh_install discrete arguments. |
--cluster-url | FQDN by which the cluster should be reachable (no https://, etc, just the full name) | Note that this FQDN, or a wildcard matching it, must be in the SAN block of the certificate. Use this or --values-yaml . |
--pv-prefix | Prefix for names of persistent volumes created by the Helm install. | Use this or --values-yaml. |
Optional arguments | Description |
---|---|
--etcd-release-name | Release name for the Helm installation of etcd. This is optional for installations where the default name (my-etcd-release) will be used and an alternative is not provided, but it is required (either here, or from --values-yaml ) for uninstallation. |
--dry-run | Echoes commands that would have been run without actually running anything. |
--verbose | Does not suppress output from commands run; no effect when --dry-run is specified. Only one of --quiet or --verbose can be used. |
--quiet | Suppresses all messages not generated in dh_helm itself - e.g. no errors or warnings from called commands; no effect when --dry-run is specified. Only one of --quiet or --verbose can be used. |
--image-tag | Tag to apply to container images when building and pushing them, or to use for reading them if images already have been built. If not provided, then the value of --dh-version will be used. |
--extra-requirements | A requirements.txt file of extra Python packages to install when building worker images. |
--build-push-images | By default, the script does not build and push needed container images, and will instead attempt to check for them already existing in the container registry. This flag has the script build and push the images, which is a quick verification when the images already exist and caching is enabled (which it is by default). |
--nocache | Disable Docker caching when building images. By default, caching is enabled. |
--skip-image-check | Skips the container registry image checks that normally occur for install and reinstall when --build-push-images is not specified. |
--create-namespace | Normally, the script checks for the specified namespace, and fails if it doesn't exist. With this option, the script attempts to create the namespace if it doesn't exist. |
--remove | When used for an installation or reinstallation, removes all objects, including etcd Helm release, PVs, and PVCs. Required when running --uninstall . |
--force | When used for an installation or reinstallation, bypasses confirmation of uninstallation and, with --remove , deletion of PVs and PVCs. |
--delete-management-shell | When used with --upgrade , deletes the management shell pod so it can be created with possibly changed properties from the new chart. This is not needed with versions after 1.20231218.053, as the pod has been replaced with a deployment. |
--delete-nfs-server | When used with --remove , deletes the nfs server deployment. |
--storage-class | Storage class name for local RWO. If deploying in a non-GKE environment, set this to a value appropriate for your cluster provider; e.g. 'gp2' for EKS. The default, which is suitable for GKE, is 'standard-rwo'. |
--no-cert-manager | Configures to install a cluster that does not use the Kubernetes cluster issuer for TLS between cluster services. |
No more than one of the following can be specified:
Argument | Description |
---|---|
--install | The default operation, but can be explicitly stated. |
--uninstall | By default will helm delete the Deephaven release; with --remove it will additionally uninstall etcd, NFS (if it's a pod), and delete all PVs, PVCs, and jobs. |
--reinstall | Effectively runs --uninstall , and then installs the specified version. Requires --remove , as reinstall cannot reuse existing configuration. |
--upgrade | May run upgrade scripts, if needed, and optionally delete the management pod. Passes through to helm upgrade for the cluster, which maintains existing data and configuration. |
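As a sketch combining the tables above (the release, namespace, and etcd names are illustrative), --dry-run can be added to preview what a full uninstall would do before committing to it:

# Echo the commands a full uninstall would run, without changing anything;
# drop --dry-run to actually perform it.
./dh_helm \
  --uninstall \
  --remove \
  --dry-run \
  --namespace test2 \
  --name test-k8s-324 \
  --etcd-release-name test-etc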
Example dh_helm command lines
Installation with creation of pod-based NFS server and build and push of needed images:
./dh_helm \
--install \
--namespace test2 \
--name test-k8s-324-2 \
--cluster-url test-k8s-cluster-2.int.illumon.com \
--tls-cert cus-tls/tls.crt \
--tls-key cus-tls/tls.key \
--container-registry us.gcr.io/eng/simple-containerization/test \
--dh-version 1.20231218.432 \
--jdk-version jdk17 \
--coreplus-version 0.33.6 \
--pv-prefix test-k8s-2 \
--verbose \
--create-nfs-server \
--build-push-images \
--create-namespace
Full uninstall with no confirmation prompts:
Warning
This removes all Deephaven Kubernetes cluster components from the test2
namespace. No configuration or data is retained.
./dh_helm \
--namespace test2 \
--name test-k8s-324 \
--etcd-release-name test-etc \
--uninstall \
--force \
--remove \
--verbose
An upgrade that also removes the management shell Pod as part of the upgrade process, so it can be recreated with new properties. Some patch versions of Deephaven require that the management shell be deleted prior to the upgrade, because some immutable properties of the Pod have been changed. Versions 1.20231218.053 and later use a management shell Deployment instead of a Pod. For these more recent versions the --delete-management-shell
argument is no longer needed.
./dh_helm \
--upgrade \
--namespace test2 \
--name test-k8s-324-2 \
--tls-cert cus-tls/tls.crt \
--tls-key cus-tls/tls.key \
--container-registry us.gcr.io/illumon-eng-170715/simple-containerization/test \
--cluster-url test-k8s-cluster-2.int.illumon.com \
--image-tag 1.20231218.162 \
--dh-version 1.20231218.162 \
--jdk-version jdk17 \
--coreplus-version 0.32.1 \
--pv-prefix test-k8s-2 \
--verbose \
--delete-management-shell
Set up your own cluster - manual process
- Install kubectl and helm. The machine that you will be installing from must have both of these utilities installed.
  - Verify that kubectl is installed (and that you have connectivity to your cluster) with kubectl get ns.
  - Verify that Helm is installed with helm list.
- Create a namespace.
  - You must create a namespace for your Deephaven installation using kubectl create namespace <your-namespace>.
  - Set your kubectl config to use that namespace with kubectl config set-context --current --namespace=<your-namespace>.
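Together, with a hypothetical namespace named deephaven, that looks like:

# Create the namespace for this Deephaven installation and make it the default
# for subsequent kubectl commands in the current context.
kubectl create namespace deephaven
kubectl config set-context --current --namespace=deephaven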
- Build your containers. This should be done on a host with an architecture that matches the architecture of the host on which the container should be run. Building a Docker container on one platform architecture that targets another architecture is possible but outside the scope of these instructions. The docker directory contains the Dockerfiles, and a script to build and push them.
  - Place a Deephaven installation .tar.gz file into the deephaven_base subdirectory. The Deephaven base image will be built using this version of the software.
  - Place a Core+ worker tar into the db_query_worker_coreplus subdirectory. This will normally be the Core+ tar file that matches the Deephaven version used for the above installation tar.gz.
  - Run ./buildAllForK8s.sh to build the images. The script arguments are described here:

Argument | Description |
---|---|
--version | The Deephaven version (e.g., 1.20231218.432) used to select the tar file. |
--jdk11|jdk17 | Specifies which JDK version should be installed. |
--container-path | Path to Dockerfile directories. Defaults to current working directory. |
--no-cache | Disable caching. Defaults to caching. |
--coreplus-tar | Full name of a specific Core+ tar.gz file used to build a Core+ worker image. |

  - Run ./pushAll.sh <REPOSITORY> <TAG> to push the images to your container registry. The container registry must be accessible from the Kubernetes cluster.

Argument | Description |
---|---|
REPOSITORY | The repository to push the images to. If a .reporoot file exists in the same directory as the script, the repository used is $REPOROOT/REPOSITORY. |
TAG | The tag of the pushed images (e.g. latest or 1.20231218.432). |

  For AKS, you will have to explicitly give your cluster access to your container registry. To do this, run az aks update -n <cluster-name> -g <resource-group> --attach-acr <container-registry-name>. More details can be found in the AKS documentation.
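A worked build-and-push sequence might look like the following sketch. The version numbers, Core+ file name, and registry path are illustrative, and the JDK selection flag is assumed to be spelled --jdk17; check the script's help output for your release.

# Build all Deephaven images for a given product version, selecting JDK 17
# and a specific Core+ worker tar file placed in db_query_worker_coreplus.
./buildAllForK8s.sh --version 1.20231218.432 --jdk17 \
  --coreplus-tar deephaven-coreplus-0.33.6-1.20231218.432-jdk17.tgz

# Push the resulting images to a registry reachable from the Kubernetes cluster,
# tagging them with the product version.
./pushAll.sh us.gcr.io/eng/simple-containerization/test 1.20231218.432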
- Create your NFS server.
  - Run the following commands to set up an nfs-server deployment and service. You may want to edit these files to rename the deployment, service, and persistent volume claim names, and most importantly the storage type, which will be the default for your platform. If you have not edited them, you will get the default configuration, which is a service called deephaven-nfs with a fully qualified domain name of deephaven-nfs.<namespace>.svc.cluster.local. This FQDN, or possibly the actual IP address, will be needed later when setting up your cluster's my-values.yaml (see the step below). Run these commands to set up the new NFS deployment:

# NOTE: Consider changing the persistent volume claim storageClass in the below yaml from 'default'
# to a storage class that meets your performance requirements. If dynamic provisioning for the
# storage class is not configured, you may need to pre-create a persistent volume beforehand.
kubectl apply -f setupTools/nfs-server.yaml
kubectl apply -f setupTools/nfs-service.yaml

  - An existing NFS server can be used if you have one. If you want to use an existing NFS server, it will need some directories exported. See setupTools/setup-nfs-minimal.sh for what is required.
for what is required. - Run the following commands to set up a nfs-server deployment and service. You may want to edit these files to
rename the deployment, service and persistent volume claim names, and most importantly the storage type, which will
be the default for your platform. If you have not edited them you will get the default configuration, which is a
service called
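To confirm the NFS service and find the address you will later set as nfs.server in my-values.yaml, a quick check along these lines helps (this assumes the unedited defaults from nfs-service.yaml):

# List services and note the NFS service's name and ClusterIP; use the FQDN
# (deephaven-nfs.<your-namespace>.svc.cluster.local by default) or this IP as nfs.server.
kubectl get svc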
- Set up your NFS server.
  - Run kubectl get pods to get the name of your NFS server Pod and confirm that it is running.
  - Copy the setup script to the NFS pod by running this command, using your specific NFS pod name:

# Run 'kubectl get pods' to find your specific nfs-server pod name and use that as the copy target host in this command.
kubectl cp setupTools/setup-nfs-minimal.sh <nfs-server-name>:/setup-nfs-minimal.sh

  - Run this command to execute that script, once again substituting the name of your NFS Pod:

kubectl exec <nfs-server-name> -- bash -c "export SETUP_NFS_EXPORTS=y && chmod 755 /setup-nfs-minimal.sh && /setup-nfs-minimal.sh"
- Install the bitnami etcd chart.
  - The following command installs the etcd Helm chart with a Helm release name you must choose (e.g., etcd-deephaven). To customize the etcd installation, copy and update setupTools/etcdValues.yaml to suit your particular server.

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install <release-name> bitnami/etcd --values setupTools/etcdValues.yaml

  - If you uninstall etcd, you must remove the persistent volumes and persistent volume claims before reinstalling. Alternatively, you can use a different etcd release name and update your my-values.yaml for Deephaven accordingly.

kubectl delete pv,pvc -l app.kubernetes.io/instance=my-etcd-release

  - It may take a minute or two for etcd to become ready, particularly if you have replicas that need to synchronize.

$ kubectl get pods -w -l app.kubernetes.io/name=etcd
NAME                READY   STATUS    RESTARTS   AGE
my-etcd-release-0   1/1     Running   0          3h5m
my-etcd-release-1   1/1     Running   0          3h5m
my-etcd-release-2   1/1     Running   0          3h5m

  You should wait until all replicas in the stateful set report 1/1 in the READY column before proceeding with the Deephaven installation. You can also verify the etcd installation using the instructions from the Helm notes for that release.
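If you prefer to block until etcd is ready rather than watching, something like the following can be used; it is a sketch that assumes the default bitnami chart labels shown above.

# Wait up to five minutes for all etcd pods to report Ready before installing Deephaven.
kubectl wait --for=condition=Ready pod -l app.kubernetes.io/name=etcd --timeout=300s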
- Install cert-manager (optional).
  If you wish to run Deephaven services using TLS within the cluster, then you will need to install cert-manager. To see if cert-manager is installed on your cluster already, run kubectl get clusterissuer. If you see a message saying error: the server doesn't have a resource type "clusterissuer", then it is not installed.
  There are several ways to install cert-manager, and full instructions are provided at the cert-manager installation page. The most straightforward way is to use the default static install listed there. They also provide a Helm chart that may be used to install cert-manager.
  The setupTools/ca-bootstrap-issuer.yaml file will create a ClusterIssuer for the entire Kubernetes cluster that creates a self-signed root CA certificate, and an Issuer in your target Kubernetes namespace that will issue certificates that have the root CA in the certificate chain. You may create a new yaml file defining an Issuer configuration that is not self-signed if there is infrastructure to support it in your organization. For example, you may define an issuer that is configured to use HashiCorp Vault, or an external provider. Details for these configurations may be found in the cert-manager issuer configuration docs.
  If a ClusterIssuer was already present in your cluster, you can copy the second and third sections from ca-bootstrap-issuer.yaml (Certificate and Issuer definitions) to a new file, and update them with the names of your ClusterIssuer and namespace. Apply the new file using kubectl apply -f.
  To create the default self-signed cluster issuer, first edit setupTools/ca-bootstrap-issuer.yaml and replace occurrences of <your-namespace> with your target Kubernetes namespace, then run the following command:

# First edit ca-bootstrap-issuer.yaml and replace occurrences of <your-namespace> with your Kubernetes namespace
kubectl apply -f setupTools/ca-bootstrap-issuer.yaml

  If not using cert-manager, several self-signed certificates (without a common root CA) for Deephaven services will be generated and kept in a keystore for use by the system.
Note
You must set certmgr.enabled to true in your my-values.yaml file for the cert-manager installation to be used.
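After applying the issuer yaml, you can confirm that the resources were created; this is a sketch, and the exact resource names depend on what is defined in ca-bootstrap-issuer.yaml.

# Confirm a ClusterIssuer exists cluster-wide and an Issuer exists in your namespace.
kubectl get clusterissuer
kubectl get issuer --namespace <your-namespace>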
- Create a TLS secret for the Kubernetes cluster. The secret must be named deephaven-tls and must be in the same namespace as your Deephaven installation. You must provide the tls.crt and tls.key files for the Web server certificate that meets the requirements specified in the Install and Upgrade Guide.

kubectl create secret tls deephaven-tls --cert=tls.crt --key=tls.key
- Create your my-values.yaml values override file.
  - This file will contain values that override the defaults defined in the Deephaven chart's values.yaml file. This is a new file referred to in this document as my-values.yaml, but the name is not significant and you may name it anything you like. Note that no changes should be made to the chart's values.yaml file that already exists in the deephaven directory. This new file is used by Helm to set properties required for, and specific to, your installation. The first five of these are required. Additionally, there are some default settings that are only applicable in a Google GKE environment and may need to be overridden if deploying in another provider:
Value | Comment | Description |
---|---|---|
nfs.pvPrefix | Required | Prefix for persistent volume names stored on the NFS server. This disambiguates releases as PVs are global and not namespaced. |
nfs.server | Required | Hostname or IP address of your NFS server. Note that the IP address may be required in EKS and AKS; to find it, run kubectl get svc nfs-server . |
etcd.release | Required | This will be the name you choose for the etcd release with '-etcd' appended to it. Run kubectl get secret my-etcd-release and confirm there is a secret with this name. |
envoyFrontProxyUrl | Required | User facing URL of the envoy front proxy. |
image.tag | Required | This would typically be redefined to the specific Deephaven version you are installing, or perhaps 'latest'. |
global.storageClass | Recommended | Use this value as the default storage class, e.g., standard-rwo for GKE, gp2 for EKS, or managed-csi for AKS. |
dis.intradaySize | Recommended | Defaults to 10G. Set to an appropriate value for your target environment, or consult with Deephaven to evaluate your use case and make a recommendation. |
dis.intradayUserSize | Recommended | Defaults to 10G. Set to an appropriate value for your target environment, or consult with Deephaven to evaluate your use case and make a recommendation. |
dis.storageClass | Not required | May be set to a specific storage class if desired, otherwise will be set to global.storageClass. |
binlogs.storageClass | Not required | May be set to a specific storage class if desired, otherwise will be set to global.storageClass. |
management.storageClass | Not required | May be set to a specific storage class if desired, otherwise will be set to global.storageClass. |
envoy.externalDns | Not Required | If your cluster is configured with an ExternalDNS provider, set to true to create a DNS record that points the envoyFrontProxyUrl to the envoy Kubernetes service. |
envoy.enableAdminPort | Not Required | If true, enables the envoy admin interface on port 8001. |
image.repositoryUrl | Not Required | The image repository url and path where the Deephaven container images are hosted. This does have a default value but would typically be redefined to something else. |
certmgr.enabled | Required with cert-manager | If using cert-manager, this must be set to true . Otherwise this may be omitted as it defaults to false . |
An example my-values.yaml is as follows:
image:
repositoryUrl: us-central1-docker.pkg.dev/project/path # The repository and path where container images are stored.
tag: '1.20231218.432'
nfs:
pvPrefix: dhtest1
server: 'deephaven-nfs.<your-namespace>.svc.cluster.local' # Some non-GKE K8S providers will require an IP address here
root: '/exports/dhsystem/'
etcd:
release: my-etcd-release
global:
# The name of the default storage class to use as dedicated local persistent storage for pods.
# This will vary between different Kubernetes providers; standard-rwo is applicable for Google GKE.
storageClass: 'standard-rwo'
envoyFrontProxyUrl: 'deephaven.kubernetes.internal.company.com'
envoy:
externalDns: true
# Set to true if cert-manager has been installed and will be used to provide secure intra-service communications.
certmgr:
enabled: true
# If deploying in a non-GKE environment set these to values appropriate for your cluster provider, e.g. 'gp2'
# for EKS. These are not required, and will take the global.storageClass value if not present.
dis:
storageClassName: 'standard-rwo'
binlogs:
storageClassName: 'standard-rwo'
management:
storageClassName: 'standard-rwo'
- Install the Deephaven Helm chart. You are now ready to install.
  - Use the following helm install command, substituting the names of your release and my-values.yaml if they are different.

helm install deephaven-helm-release-name ./deephaven/ -f my-values.yaml --debug

  In this example the Helm release name for Deephaven is deephaven-helm-release-name, but you may select a release name of your own. The chart installation executes the preinstall hook, which configures etcd, routing, properties, initializes ACLs and more. --debug will show progress as Helm configures the installation.
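One way to confirm the release was recorded before watching the pods, assuming the release name used above:

# List Helm releases in the namespace and show the status of the Deephaven release.
helm list --namespace <your-namespace>
helm status deephaven-helm-release-name --namespace <your-namespace>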
- Wait for all Pods to come online. After installing the chart, it will take a moment for all Pods to start up and initialize.
  - Use kubectl get pods -w to watch the status of the Pods as they come up. There are dependencies between Deephaven services, and you will see Pods appear in an Init state until that pod's dependent service is available. The configuration service will start first, then the auth service, then the remaining ones. The final Pods you will see are the query-server workers and a merge-server worker for the built-in queries.
- Create a password for user iris.
  - Set a password for the iris user from the management shell Pod.

kubectl exec deploy/management-shell -- /usr/illumon/latest/bin/iris iris_db_user_mod -set_password -user iris -password $(echo -n <your-password> | base64)
- Set or update a DNS record.
  If your cluster is already configured with an external DNS provider, then the external DNS controller will provide a correctly configured hostname. If you have not configured external DNS, then the following commands will be helpful to find the IP address needed to create a DNS entry.

# Find the IP address of the Envoy service:
kubectl get --namespace=<yournamespace> svc envoy -o 'jsonpath={.status.loadBalancer.ingress[0].ip}'
# Find the hostname of the Envoy service, which may be needed if your Kubernetes cluster ingress uses a hostname:
kubectl get --namespace=<yournamespace> svc envoy -o 'jsonpath={.status.loadBalancer.ingress[0].hostname}'

  Links to setup information for common providers are provided here for reference. [ GKE | EKS | AKS ]
- Log in and start using Deephaven.
  You should now be able to navigate to the host defined as the envoyFrontProxyUrl in your values override file; e.g.:

https://deephaven.kubernetes.internal.company.com:8000/iriside/
Upgrading a Kubernetes cluster with Helm
The upgrade process is similar to the installation process, in that you will create an installation directory structure based on the Helm tar.gz, and you will build images and configure the chart with a values.yaml file. The main differences are that the prerequisites will already be in place, and helm upgrade is used instead of helm install. Note that the dh_helm tool can also be used to upgrade a Deephaven Kubernetes cluster.
- In a new directory for the new version, extract the new version's Helm archive; e.g., tar xzf ./deephaven-helm-1.20231218.432.tar.gz.
- Copy the Deephaven product archive into the deephaven_base container directory; e.g.:

cp ~/Downloads/deephaven-enterprise-jdk*-1.20231218.432.tar.gz deephaven-helm-1.20231218.432/docker/deephaven_base/

- Copy the Deephaven Core+ product archive into the db_query_worker_coreplus container directory; e.g.:

cp ~/Downloads/deephaven-coreplus-0.33.6-1.20231218.432-jdk17.tgz deephaven-helm-1.20231218.432/docker/db_query_worker_coreplus/

- Copy the previously used values.yaml file to the helm directory; e.g.:

cp ~/Downloads/values.yaml deephaven-helm-1.20231218.432/helm/

- From the docker directory, build and push the new container images:
  - Run ./buildAllForK8s.sh to build the images.
Argument | Description |
---|---|
--version | The Deephaven version (e.g., 1.20231218.432 ) used to select the tar file. |
--jdk11|jdk17 | Specifies which JDK version should be installed. |
--container-path | Path to Dockerfile directories. Defaults to current working directory. |
--[no-]cache | Disable or enable Docker caching. Defaults to caching. |
--coreplus-tar | Full name of a specific Core+ tar.gz file used to build a Core+ worker image. |
  - Run ./pushAll.sh to push the images to your container registry. The container registry must be accessible from the Kubernetes cluster.
Argument | Description |
---|---|
REPOSITORY | The repository to push the images to. If a .reporoot file exists in the same directory as the script, the repository used is $REPOROOT/REPOSITORY . |
TAG | The tag of the pushed images (e.g. latest or 1.20231218.432 ). |
For AKS, you will have to explicitly give your cluster access to your container registry. To do this, run az aks update -n <cluster-name> -g <resource-group> --attach-acr <container-registry-name>. More details can be found in the AKS documentation.
Warning
Using latest
for the image tag, or not updating the image tag, will result in pods not restarting automatically after an upgrade and continuing to run with the older versions of the images, because the system uses the image tag value to detect whether it is already running the correct versions of images.
- Update the values.yaml file in the helm directory. At the least, you will likely need to update the tag value for the newly built container images (unless they are the new "latest"). If new functionality was added in the new release of Deephaven that is being installed, you may also need to add values entries to configure the new features.
  Ensure that the values.yaml includes a definition for the standard storage class. This became a requirement in build 1.20230511.248. For example:
global:
storageClass: 'standard-rwo'
- If upgrading from a build prior to 1.20230511.248, delete the management-shell pod, since it will need to be recreated by the upgrade process:

kubectl delete pod management-shell --grace-period 1

  Refer to the Version Log for more details on this change.
- From the helm directory, upgrade the Helm chart; e.g.:

helm upgrade deephaven-helm-release-name ./deephaven/ -f my-values.yaml --debug
Amazon Load Balancers
The default Amazon EKS setup uses the in-tree Kubernetes load balancer controller, which provisions Classic Load Balancers. The default classic load balancer settings terminate connections after 60 seconds of inactivity. This results in Deephaven workers being killed when their controlling connection is closed. It is possible to manually configure the timeout, but Deephaven recommends installing the AWS Load Balancer Controller add-on, which uses a Network Load Balancer. Additionally, the AWS Load Balancer Controller supports annotations for configuring the service. The complete set of annotations that are suitable for your network is beyond the scope of this document (e.g., subnet and IP allocation), but the following annotations (specified in your Deephaven values.yaml file) instruct the controller to create a suitable Network Load Balancer:
envoy:
serviceAnnotations:
service.beta.kubernetes.io/aws-load-balancer-type: 'nlb'
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: 'instance'
Manually configuring a Classic Load Balancer Timeout
When using a classic load balancer, a manual work-around is to identify the AWS load balancer that the Kubernetes system allocated and increase the connection timeout using the AWS command line tool.
To identify the load balancer, first run kubectl
to find the external name of the load balancer.
$ kubectl get svc envoy
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
envoy LoadBalancer 172.20.209.132 a89229d6c7c3a43fbba5728fb8216c64-882093713.us-east-1.elb.amazonaws.com 8000:31111/TCP,8001:30504/TCP 7d21h
In this example, the load balancer is identified by a89229d6c7c3a43fbba5728fb8216c64
. The load balancer attributes can be queried with:
$ aws elb describe-load-balancer-attributes --load-balancer-name a89229d6c7c3a43fbba5728fb8216c64
{
"LoadBalancerAttributes": {
"CrossZoneLoadBalancing": {
"Enabled": false
},
"AccessLog": {
"Enabled": false
},
"ConnectionDraining": {
"Enabled": false,
"Timeout": 300
},
"ConnectionSettings": {
"IdleTimeout": 60
},
"AdditionalAttributes": [
{
"Key": "elb.http.desyncmitigationmode",
"Value": "defensive"
}
]
}
}
To adjust the connection idle setting to 900 seconds, next run:
$ aws elb modify-load-balancer-attributes --load-balancer-name a89229d6c7c3a43fbba5728fb8216c64 --load-balancer-attributes="{\"ConnectionSettings\":{\"IdleTimeout\":900}}"
{
"LoadBalancerName": "a89229d6c7c3a43fbba5728fb8216c64",
"LoadBalancerAttributes": {
"ConnectionSettings": {
"IdleTimeout": 900
}
}
}