Connect Deephaven to RStudio
R is a powerful language for statistical computing, but it can be slow and can struggle with large datasets. Deephaven and R can interoperate, allowing you to leverage both the statistical power of R and the data prowess of Deephaven. Let’s dive in.
Installation
Connecting Deephaven and R requires four steps: install R, install rJava, install the Deephaven Launcher, and run the Deephaven Launcher. Deephaven’s Launcher connects to a Deephaven instance and downloads the latest deployed software. Each time your Deephaven instance is upgraded, the Launcher needs to be run to download the latest files. If this is not done, the software on the Deephaven instance may be incompatible with the Deephaven software used by R.
- Install R for your platform. Installation files are available from CRAN, the Comprehensive R Archive Network.
- Install RStudio (optional). Installation files are available from the RStudio website.
- Install rJava.
  - [Mac OS X] Install Homebrew.
  - [Mac OS X] Install the pcre2 library by running brew install pcre2 on the command line.
  - Start R and run install.packages('rJava').
- Install the Deephaven Launcher. If Java is not installed with the Deephaven Launcher, an appropriate JDK must also be installed.
- Run the Deephaven Launcher to connect to a Deephaven instance and download the latest files for the deployment.
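Once the steps above are complete, it can be useful to confirm that rJava loads and can start a JVM. The following is a minimal sketch, not part of the Deephaven tooling: check_rjava is a hypothetical helper name, and it relies only on the rJava functions .jinit and .jcall.

```r
# Hypothetical helper (not part of Deephaven): report whether rJava is
# installed and, if so, which Java version the embedded JVM reports.
check_rjava <- function() {
  if (!requireNamespace("rJava", quietly = TRUE)) {
    message("rJava could not be loaded; run install.packages('rJava') and check JAVA_HOME.")
    return(invisible(NA_character_))
  }
  rJava::.jinit()  # start (or attach to) the JVM
  version <- rJava::.jcall("java/lang/System", "S", "getProperty", "java.version")
  message("rJava is using Java ", version)
  invisible(version)
}

java_version <- check_rjava()
```

Because this starts a JVM, it is probably best run in a throwaway R session rather than the one you will later use to connect to Deephaven, since JVM settings such as heap size are fixed once the JVM has started.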
Keyfile
To gain access to a Deephaven instance, R must authenticate. This is done using a key file. If you have access to multiple Deephaven instances, you may have a different key file for each instance.
To obtain a key file, speak to your Deephaven system administrator. System administrators can find instructions for generating key files here.
Connecting R To Deephaven
First, to allow R to use Java, JAVA_HOME must point to your JDK. From the command prompt, run R CMD javareconf to determine what JAVA_HOME should be set to. Then set the JAVA_HOME value. On Windows systems with the JDK installed by the Deephaven Launcher, JAVA_HOME should be set to C:/Program Files (x86)/Illumon/jdk.
Sys.setenv(JAVA_HOME = '<path/to/java/home/>')
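A wrong JAVA_HOME is easy to miss until rJava fails to load, so a quick sanity check may help. The sketch below is an assumption-laden illustration: java_home_looks_valid is a hypothetical helper, and it only checks that the directory contains a bin/java (or bin/java.exe) executable, which is a common JDK layout but not a guarantee that the JDK is the right version.

```r
# Hypothetical helper (not part of Deephaven or rJava): TRUE when the given
# directory exists and contains a bin/java (or bin/java.exe) executable.
java_home_looks_valid <- function(java_home = Sys.getenv("JAVA_HOME")) {
  if (!nzchar(java_home) || !dir.exists(java_home)) {
    return(FALSE)
  }
  java_exe <- if (.Platform$OS.type == "windows") "java.exe" else "java"
  file.exists(file.path(java_home, "bin", java_exe))
}

# Check whatever JAVA_HOME is currently set to in this R session.
if (!java_home_looks_valid()) {
  message("JAVA_HOME does not appear to point at a JDK: '", Sys.getenv("JAVA_HOME"), "'")
}
```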
Second, R must be configured to use the Deephaven files downloaded by the Deephaven Launcher. This can be done by creating two variables. home is your home directory. system is the name of the Deephaven instance to connect to. This is the name you gave to the Deephaven instance in the Deephaven Launcher.
home <- "/Users/aeinstein"
system <- "dh-prod-demo"
Next, R must be configured to use your keyfile to authenticate with the Deephaven instance. If your keyfile is located in your home directory and is named .priv.dh-prod-demo.base64.txt, you would set your keyfile to:
keyfile <- sprintf("%s/.priv.dh-prod-demo.base64.txt",home)
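If you connect to several instances, you may prefer to derive the keyfile path from the instance name and confirm the file exists before connecting. This sketch assumes your keyfiles follow the .priv.<system>.base64.txt pattern seen in the example; that naming is an assumption drawn from this one example, so adjust it to match the file your administrator provides.

```r
home <- "/Users/aeinstein"
system <- "dh-prod-demo"

# Assumption: the keyfile is named .priv.<system>.base64.txt, as in the
# example above. Your administrator may use a different name.
keyfile <- file.path(home, sprintf(".priv.%s.base64.txt", system))

# Report a missing keyfile early instead of failing during authentication.
if (!file.exists(keyfile)) {
  message("No keyfile found at ", keyfile)
}
```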
For your query, you will set up:
- workerHeapGB - Number of gigabytes of RAM for the Deephaven query worker.
- jvmHeapGB - Number of gigabytes of RAM for the local Java instance connected to R.
- workerHost - Deephaven host to run your query on.
For example:
workerHeapGB <- 4
jvmHeapGB <- 2
workerHost <- "dh-prod-demo-query4.int.illumon.com"
All that is left is loading the Deephaven R library and initializing the Deephaven database connection.
source(sprintf("%s/iris/.programfiles/%s/integrations/r/irisdb.R",home,system))
idb.init(devroot = sprintf("%s/iris/.programfiles/%s/",home,system),
         workspace = sprintf("%s/iris/workspaces/%s/workspaces/r/",home,system),
         propfile = "iris-common.prop",
         userHome = home,
         keyfile = keyfile,
         librarypath = sprintf("%s/iris/.programfiles/%s/java_lib",home,system),
         log4jconffile = NULL,
         workerHeapGB = workerHeapGB,
         jvmHeapGB = jvmHeapGB,
         workerHost = workerHost,
         verbose = TRUE,
         jvmArgs = c("-Dservice.name=iris_console",sprintf("-Ddh.config.client.bootstrap=%s/iris/.programfiles/%s/dh-config/clients",home,system)),
         classpathAdditions = c(sprintf("%s/iris/.programfiles/%s/resources",home,system),sprintf("%s/iris/.programfiles/%s/java_lib",home,system)),
         jvmForceInit = FALSE)
Windows users should use the following instead:
source(sprintf("%s/Appdata/Local/illumon/%s/integrations/r/irisdb.R",home,system))
idb.init(devroot = sprintf("%s/Appdata/Local/illumon/%s/",home,system),
         workspace = sprintf("%s/Documents/Iris/%s/workspaces/r/",home,system),
         propfile = "iris-common.prop",
         userHome = home,
         keyfile = keyfile,
         librarypath = sprintf("%s/Appdata/Local/illumon/%s/java_lib",home,system),
         log4jconffile = NULL,
         workerHeapGB = workerHeapGB,
         jvmHeapGB = jvmHeapGB,
         workerHost = workerHost,
         verbose = TRUE,
         jvmArgs = c("-Dservice.name=iris_console",sprintf("-Ddh.config.client.bootstrap=%s/Appdata/Local/illumon/%s/dh-config/clients",home,system)),
         classpathAdditions = c(sprintf("%s/Appdata/Local/illumon/%s/resources",home,system),sprintf("%s/Appdata/Local/illumon/%s/java_lib",home,system)),
         jvmForceInit = FALSE)
Configuring R to connect to Deephaven can be tricky. If you need help, please contact Deephaven support.
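One plausible source of trouble is a path that does not exist, for example because the Launcher has not yet downloaded files for the instance. The sketch below lists the locations referenced by the idb.init calls above and reports any that are missing. deephaven_paths is a hypothetical helper written for this illustration; it is not part of the Deephaven R integration.

```r
# Hypothetical helper: collect the files and directories referenced in the
# idb.init calls above, for either the default or the Windows layout.
deephaven_paths <- function(home, system, windows = FALSE) {
  root <- if (windows) {
    sprintf("%s/Appdata/Local/illumon/%s", home, system)
  } else {
    sprintf("%s/iris/.programfiles/%s", home, system)
  }
  c(
    script    = file.path(root, "integrations/r/irisdb.R"),
    java_lib  = file.path(root, "java_lib"),
    resources = file.path(root, "resources"),
    bootstrap = file.path(root, "dh-config/clients")
  )
}

paths <- deephaven_paths("/Users/aeinstein", "dh-prod-demo")

# Report any location that does not exist on this machine.
missing_paths <- paths[!file.exists(paths)]
if (length(missing_paths) > 0) {
  message("Missing paths:\n", paste(" ", missing_paths, collapse = "\n"))
}
```

If anything is reported missing, rerunning the Deephaven Launcher for that instance is a reasonable first step before contacting support.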
Example Query
Now that you have a Deephaven connection, you can execute Deephaven queries, pull results from Deephaven into R, and push results from R into Deephaven.
# Compute x=1+2 on the Deephaven server, then pull the value back to the R client.
idb.execute("x=1+2")
x <- idb.get("x")
print(x)
# Perform a basic DB operation on the Deephaven server and pull the table back to the R client as a dataframe.
idb.execute('t = db.t("LearnDeephaven","StockTrades"); t1=t.countBy("Count","Date")')
t1 <- idb.get.df("t1")
print(t1)
# Manipulate the dataframe and then push it to the Deephaven server with name "t2"
t2 <- t1[2:3,]
print(t2)
idb.push.df("t2",t2)
# Perform a more complex DB calculation using the dataframe, and pull the result to the R client as a dataframe.
idb.execute('t3 = t.whereIn(t2,"Date").view("USym","Dollars=Last*Size").sumBy("USym").sortDescending("Dollars")')
t3 <- idb.get.df("t3")
print(t3)
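To make the logic of the query above easier to follow, here is a rough base-R analogue that runs entirely on the R client. It is only a sketch: the trades data frame is made up for illustration and does not reflect the contents of the LearnDeephaven StockTrades table, and the base-R calls approximate what countBy, whereIn, view, sumBy, and sortDescending do rather than reproduce how Deephaven executes them.

```r
# Made-up stand-in for the StockTrades table; values are illustrative only.
trades <- data.frame(
  Date = c("2017-08-21", "2017-08-22", "2017-08-22",
           "2017-08-23", "2017-08-23", "2017-08-24"),
  USym = c("AAPL", "AAPL", "GOOG", "AAPL", "GOOG", "GOOG"),
  Last = c(10, 11, 20, 12, 21, 22),
  Size = c(100, 200, 50, 100, 10, 30),
  stringsAsFactors = FALSE
)

# Analogue of countBy("Count","Date"): number of rows per Date.
t1 <- as.data.frame(table(Date = trades$Date),
                    responseName = "Count", stringsAsFactors = FALSE)

# Same row selection as in the example: keep the second and third dates.
t2 <- t1[2:3, ]

# Analogue of whereIn(t2,"Date"): keep trades whose Date appears in t2.
selected <- trades[trades$Date %in% t2$Date, ]

# Analogue of view(...).sumBy("USym").sortDescending("Dollars").
selected$Dollars <- selected$Last * selected$Size
t3 <- aggregate(Dollars ~ USym, data = selected, FUN = sum)
t3 <- t3[order(-t3$Dollars), ]
print(t3)
```

The point of pushing t2 to the server in the real example is that the filtering and aggregation then run on the Deephaven worker, so only the small result table travels back to R.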