Use Deephaven in a GCP Linux instance
This guide will show you how to use the Google Cloud Platform (GCP) to run deephaven-core from Docker. It will show you how to launch a single VM instance, deploy the Deephaven server container to it, and then connect via your local machine.
GCP is one of the most popular cloud computing services currently available. It isn't only for external users: it also powers Google's own services such as Gmail, Google Drive, and YouTube. So why not take advantage of it yourself?
Get started
GCP has its own command line interface that is recommended for interacting with instances. For this guide, it is necessary to install and configure the CLI. Follow the instructions for your OS in Google's installation guide to get started.
Once you've installed the gcloud CLI, run the following command to give it authorized access to manage cloud services:
gcloud auth login
This will bring up your browser and have you log in to your Google account.
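After logging in, it can help to confirm which account is active and to set a default project and zone so later commands need fewer flags. The following is a minimal sketch, not an official setup script. The project ID my-project-id is a hypothetical placeholder, and the run helper is an illustrative wrapper that only prints each command unless RUN=1 is set.

```shell
# Hypothetical project ID; replace it with your own.
PROJECT_ID="my-project-id"
# Zone used later in this guide.
ZONE="us-central1-a"

# Illustrative helper: print the command by default, execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Show the credentialed accounts and which one is active.
run gcloud auth list
# Set the default project and Compute Engine zone for later commands.
run gcloud config set project "$PROJECT_ID"
run gcloud config set compute/zone "$ZONE"
```

Run it once as-is to review the commands, then again with RUN=1 to apply them.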
Create a VM instance
There are two ways you can create a GCP VM instance. The first is via your web browser. The second is via the gcloud CLI. Both are described in Google's documentation. We will use the web interface.
Name, region, and zone
We're going to name our VM dhc-how-to, set the region to us-central1 (Iowa), and set the zone to us-central1-a. More information on regions and zones can be found here.
Machine configuration
The Machine configuration options specify the hardware your cloud instance will use. Choices include processor cores, amount of memory, GPU, etc. Consider your needs carefully, since the hardware determines both the performance and the cost of your work in the cloud.
For this demo, we will be using:
- The e2-standard-2 machine type (2 vCPUs, 8 GB memory)
- The default CPU type
- No GPU
- No display device
Another option is confidential computing, which encrypts your data while it's being processed. If you are working with sensitive data, this option is likely important to enable. For this guide, we will not be enabling confidential computing.
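If you want to double-check what a machine type provides before committing to it, the gcloud CLI can describe it. This is a small sketch that assumes the machine type and zone chosen above. The run helper is an illustrative wrapper that prints the command unless RUN=1 is set, and the --format expression simply selects the vCPU count and memory (in MB) fields.

```shell
MACHINE_TYPE="e2-standard-2"
ZONE="us-central1-a"

# Illustrative helper: print the command by default, execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Ask Compute Engine for the vCPU count and memory of the machine type.
run gcloud compute machine-types describe "$MACHINE_TYPE" \
  --zone "$ZONE" \
  --format "value(guestCpus,memoryMb)"
```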
Docker Container
A Google Cloud VM can be configured to run Docker containers on startup. Since Deephaven can be launched from pre-built Docker images, that will make running Deephaven in the cloud a breeze.
Click on DEPLOY CONTAINER in the Containers option. It's here that the VM can be configured to build and run containers from pre-built Docker images without having to manually do anything on the VM itself. Deephaven has several pre-built Docker images to choose from. Your choice should depend on your needs.
In the list below, {VERSION} is the Deephaven Core version number. Version numbers can be found here. Additionally, {VERSION} can be latest, which always pulls the most recent version.
- Basic Python: ghcr.io/deephaven/server:{VERSION}
- Python with NLTK: ghcr.io/deephaven/server-nltk:{VERSION}
- Python with PyTorch: ghcr.io/deephaven/server-pytorch:{VERSION}
- Python with SciKit-Learn: ghcr.io/deephaven/server-sklearn:{VERSION}
- Python with TensorFlow: ghcr.io/deephaven/server-tensorflow:{VERSION}
- Python with all of the above: ghcr.io/deephaven/server-all-ai:{VERSION}
- Basic Groovy: ghcr.io/deephaven/server-slim:{VERSION}
Choose your preferred image. For this guide, we'll use the basic Python image ghcr.io/deephaven/server:latest.
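The image names above all follow the same pattern, so the reference can be assembled from a variant and a version. The sketch below illustrates that pattern. The function name deephaven_image is hypothetical and introduced only for this example.

```shell
# Build a Deephaven image reference from a variant and a version.
# Both arguments are optional and default to the basic Python image at "latest".
deephaven_image() {
  variant="${1:-server}"
  version="${2:-latest}"
  printf 'ghcr.io/deephaven/%s:%s\n' "$variant" "$version"
}

IMAGE="$(deephaven_image server latest)"
echo "$IMAGE"   # prints ghcr.io/deephaven/server:latest

# On a machine with Docker installed, the image could then be fetched with:
#   docker pull "$IMAGE"
```

Swapping the first argument for another variant from the list, such as server-pytorch, yields the corresponding image reference.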
We also want to set two other options: Restart policy and Environment variables.
- The restart policy is defined by Docker's restart policies. We will use the On failure option so that the container restarts if it crashes.
- In the Environment variables section, add one: START_OPTS. Set it to -Xmx4g to give Deephaven a maximum Java heap of 4 GB. If you choose a VM with more memory, increase this to whatever amount suits your needs.
The image below shows the container configuration.
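For readers more used to the Docker CLI, these container settings correspond roughly to a plain docker run invocation. This is a sketch for comparison only, not what Google Cloud runs internally. The -p 10000:10000 port mapping is an assumption added for local testing, and the run helper is an illustrative wrapper that prints the command unless RUN=1 is set.

```shell
IMAGE="ghcr.io/deephaven/server:latest"
# Maximum JVM heap, matching the environment variable set in the console.
START_OPTS="-Xmx4g"

# Illustrative helper: print the command by default, execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Start the container detached, restart it if it exits with an error,
# pass START_OPTS through, and publish port 10000 (assumed for local use).
run docker run -d \
  --restart on-failure \
  -e "START_OPTS=$START_OPTS" \
  -p 10000:10000 \
  "$IMAGE"
```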
Remaining options
For the remaining options, we will use the defaults. These include Allow default access, Firewall, and Advanced options. Take some time to review each option and make sure the defaults (or your changes) are right for your needs.
This guide will not cover persistent storage in the cloud. There are several options, including Docker data volumes and gcloud storage, for workflows that require storage of large datasets.
Create the VM
Once that's done, there are two ways to create the VM:
- Click the CREATE button at the bottom of the page.
- Click the EQUIVALENT CODE button at the bottom of the page, then either click COPY and run it in your local terminal, or click RUN IN CLOUD SHELL.
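The command generated by EQUIVALENT CODE contains many more flags than you strictly need to read. As a rough idea of its core, the sketch below uses gcloud compute instances create-with-container with only the settings chosen in this guide. It is a trimmed illustration, not a copy of what the console produces, and it assumes a default project is already configured. The run helper is an illustrative wrapper that prints the command unless RUN=1 is set.

```shell
INSTANCE="dhc-how-to"
ZONE="us-central1-a"
MACHINE_TYPE="e2-standard-2"
IMAGE="ghcr.io/deephaven/server:latest"

# Illustrative helper: print the command by default, execute it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Create a VM that starts the Deephaven container on boot.
run gcloud compute instances create-with-container "$INSTANCE" \
  --zone "$ZONE" \
  --machine-type "$MACHINE_TYPE" \
  --container-image "$IMAGE" \
  --container-restart-policy on-failure \
  --container-env START_OPTS=-Xmx4g
```

Prefer the console's generated command when in doubt, since it reflects every option selected on the page.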
With all that done, you've got a Google VM running Deephaven!
SSH into your VM
Google Cloud takes a bit of time to create the VM. Once that's done, you can SSH into it via gcloud compute ssh. It uses very similar syntax to ssh. One of the nice things about the gcloud CLI is that you can find relevant commands in the Google Cloud web interface.
First, head to your list of VMs.
Then, find the VM you created in the list of VMs. Click the downward-facing arrow on the right-hand side.
From the drop-down menu, click View gcloud command.
This brings up a pop-up window where you can copy the command to SSH into your VM.
Copy that into your terminal, and you'll be there!
Once connected, it's a good idea to confirm that your container is up and running with docker container ls.
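To narrow the listing down to the Deephaven container, docker container ls accepts a filter and an output format. The following is a minimal sketch meant to be run on the VM. It assumes the basic Python image used in this guide, and the function name deephaven_containers is hypothetical.

```shell
IMAGE="ghcr.io/deephaven/server:latest"

# List running containers started from IMAGE, one "name: status" per line.
deephaven_containers() {
  docker container ls \
    --filter "ancestor=$IMAGE" \
    --format '{{.Names}}: {{.Status}}'
}

if command -v docker >/dev/null 2>&1; then
  deephaven_containers || echo "docker is installed, but the daemon could not be reached"
else
  echo "docker not found; run this on the VM rather than on your local machine"
fi
```

An empty result means no container from that image is currently running, in which case the container may still be starting or may have failed.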
Connect to Deephaven
In the previous section, you connected to your VM to check that Deephaven is running, but didn't actually do anything with it. Deephaven is used through a browser-based GUI, so you need to reach the GUI served by the cloud VM from your local machine. The gcloud CLI has you covered.
This time around, we will create an SSH tunnel with port forwarding. Deephaven runs on a specific port, so if we forward that port on the VM to the local machine, we'll have access to the GUI. The command looks like this:
gcloud compute ssh --zone <ZONE> <INSTANCE> --project <PROJECT> -- -NL <LOCAL_PORT>:localhost:<HOST_PORT> &
- <ZONE> is the zone you created your VM in. For this guide, we created it in us-central1-a.
- <INSTANCE> is the name of the VM. For this guide, we called it dhc-how-to.
- <PROJECT> is the name of your gcloud project. It appears at the end of the URL in the web interface, and looks like ?project=<PROJECT_NAME>.
- -NL does two things.
  - The first, -N, tells ssh not to run a remote command, which is what you want when the connection is only used to forward ports. It behaves the same way as in ssh.
  - The second, -L, specifies the port forwarding between the local and remote hosts.
- <LOCAL_PORT> is whatever port you'd like to connect to Deephaven on locally. Deephaven typically runs on 10000, although it may be useful to specify a different port here if you're already running a local Deephaven server on that port. In this case, we use 10000, since we're not running any local Deephaven servers on that port.
- <HOST_PORT> is the port on the VM on which Deephaven is listening. In this case, it's 10000.
- & runs the command in the background so that it doesn't block the terminal and you can keep using it. It's only needed if you'd rather not open a new terminal window or tab for your work. You should manually kill this process when you're done with it.
With all of the configuration options we chose, the command looks like this:
gcloud compute ssh --zone us-central1-a dhc-how-to --project deephaven-oss -- -NL 10000:localhost:10000 &
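Because the tunnel runs in the background, it helps to keep track of its process ID so it can be closed later. The sketch below assembles the same command from variables and, only when RUN=1 is set, opens the tunnel, waits for Enter, and then kills it. It assumes that terminating the gcloud process also ends the underlying ssh session, and the function name tunnel_command is hypothetical.

```shell
ZONE="us-central1-a"
INSTANCE="dhc-how-to"
PROJECT="deephaven-oss"
LOCAL_PORT=10000
HOST_PORT=10000

# Print the tunnel command for the values above.
tunnel_command() {
  printf 'gcloud compute ssh --zone %s %s --project %s -- -NL %s:localhost:%s\n' \
    "$ZONE" "$INSTANCE" "$PROJECT" "$LOCAL_PORT" "$HOST_PORT"
}

tunnel_command

# Set RUN=1 to open the tunnel for real and close it again on request.
if [ "${RUN:-0}" = "1" ]; then
  gcloud compute ssh --zone "$ZONE" "$INSTANCE" --project "$PROJECT" \
    -- -NL "$LOCAL_PORT:localhost:$HOST_PORT" &
  TUNNEL_PID=$!   # PID of the background gcloud process
  echo "Tunnel running as PID $TUNNEL_PID; press Enter to close it."
  read -r _
  kill "$TUNNEL_PID"
fi
```

If you started the tunnel by hand with &, running kill %1 or kill followed by the PID in the same shell has a similar effect.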
With that run, head to your web browser of choice and go to localhost:10000/ide/. The Deephaven instance you see isn't running locally, but on a Google Cloud VM! Pretty cool.
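If the page doesn't load, a quick request from the terminal can tell you whether anything is answering on the forwarded port. This is a minimal sketch that assumes curl is installed and that the server is served over plain HTTP on the local port used above. The function name ide_url is hypothetical.

```shell
LOCAL_PORT=10000

# Build the IDE URL for a given local port.
ide_url() {
  printf 'http://localhost:%s/ide/\n' "$1"
}

URL="$(ide_url "$LOCAL_PORT")"

# -s hides progress output, -f treats HTTP errors as failures,
# and --max-time keeps the check from hanging.
if curl -sf --max-time 5 -o /dev/null "$URL"; then
  echo "Deephaven IDE is reachable at $URL"
else
  echo "Nothing answered at $URL; check that the tunnel is still open"
fi
```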