
Nvidia RAPIDS for crypto price prediction

DALL·E prompt: GPU graphics card floating in a river with rapids, digital art cyberpunk
Jeremiah Cheng
JJ Brosnan
A GPU-centric platform for machine learning

The biggest players in the crypto space all use AI to predict prices and manage investments. You can take advantage of this strategy as well, and it's surprisingly easy.

A GPU is a simple and effective way to accelerate your machine learning workflows.

This is the third of a six-part blog series on real-time crypto price predictions with AI. In this blog, I'll build a linear regression model using Nvidia RAPIDS to predict crypto prices. The model is basic, but is built with the best tool for leveraging CUDA-enabled GPUs. Keep up with the series:

  1. Acquire up-to-date crypto data with Apache Airflow
  2. Implement real-time AI with TensorFlow
  3. Implement real-time AI with Nvidia RAPIDS
  4. Test the models on simulated real-time data
  5. Implement the models on real-time crypto data from Coinbase
  6. Share AI predictions with URIs

We've seen how an LSTM model trains using TensorFlow. But I've got a CUDA-compatible GPU, so why not use Nvidia's platform to do the work? This time around, I'll use Nvidia RAPIDS to build a linear regression model.

Nvidia RAPIDS

Instructions for installing Nvidia RAPIDS on Windows with WSL can be found here. RAPIDS runs the entire data science pipeline on GPUs, which can accelerate your workflows and reduce execution time.
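
Once installed, it's worth confirming that RAPIDS can see your GPU before going further. The snippet below is a minimal sanity check, assuming cuML and CuPy are both available in the active environment.

# Minimal check that RAPIDS is installed and a GPU is visible
import cuml
import cupy

print(cuml.__version__)                    # Installed cuML version
print(cupy.cuda.runtime.getDeviceCount())  # Number of CUDA-capable GPUs visible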

Install and import required packages

We'll be using cuml for this application. cuml is a suite of GPU-accelerated machine learning algorithms with a scikit-learn-like API, so you can use it without a deep understanding of CUDA programming.

General instructions for getting started with RAPIDS can be found here.
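
To illustrate how familiar cuml feels, here's a small standalone sketch of a GPU-backed linear regression. The data is made up purely for demonstration; it isn't part of the crypto workflow below.

# Standalone sketch of cuml's scikit-learn-like interface; the data is made up
import numpy as np
from cuml.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

model = LinearRegression()
model.fit(X, y)                           # Training runs on the GPU
print(model.predict(np.array([[5.0]])))   # Should be close to 10.0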

Package imports
# Start a Deephaven server with a 4GB heap
from deephaven_server import Server
s = Server(port=10_000, jvm_args=["-Xmx4g"])
s.start()

# Deephaven imports for table operations and the learn API
from deephaven import ugp
ugp.auto_locking = True
from deephaven.learn import gather
from deephaven.parquet import read
from deephaven import learn

# RAPIDS cuml for GPU-accelerated machine learning, plus supporting packages
from cuml.linear_model import LinearRegression
import numpy as np
import cuml
import glob
import os

Data manipulation

We have to prepare our data before we can send it to our model. In this case, we need three extra columns, each holding the price from one, two, and three rows back. To do so, we use Deephaven's special row index variable ii in an update query.

# Replace the next line with the path to your data
list_of_files = glob.glob('/mnt/c/Users/yuche/all_data/*')
# Use the most recently created data file
latest_file = max(list_of_files, key=os.path.getctime)
# Read the Parquet file into a Deephaven table and reverse the row order
btc_table = read(latest_file).reverse()

# Total number of rows in the table
n_rows = btc_table.size

# Add three columns holding the price from 1, 2, and 3 rows back,
# then keep only the last 2% of rows
btc_table = btc_table\
    .update(["Price1 = Price_[ii - 1]", "Price2 = Price_[ii - 2]", "Price3 = Price_[ii - 3]"])\
    .tail_pct(0.02)

# Split the data: first 70% of rows for training, last 30% for testing
train_table = btc_table.head_pct(0.7)
test_table = btc_table.tail_pct(0.3)

Construct a model

We will construct a linear regression model. The model itself is simple, and constructing it takes just one line. With Nvidia RAPIDS, it will use the CUDA-enabled GPU during training, testing, and deployment.

linear_regression_gpu = LinearRegression()

Train the model

# Fit the linear regression model in a function
def fit_linear_model(features, target):
    linear_regression_gpu.fit(features, target)

# Gather table data into a 2D NumPy array
def table_to_numpy(rows, cols):
    return gather.table_to_numpy_2d(rows, cols, np_type=np.double)

# Scatter model outputs back into a Deephaven table
def scatter(data, idx):
    return data[idx]

# Train the linear regression GPU model
learn.learn(
    table = train_table,
    model_func = fit_linear_model,
    inputs = [learn.Input(["Price1", "Price2", "Price3"], table_to_numpy), learn.Input("Price", table_to_numpy)],
    outputs = None,
    batch_size = train_table.size
)
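
The scatter function defined above isn't needed during training; it comes into play when model outputs get written back into a Deephaven table. As a preview of how that fits together, here's a sketch of running predictions on the test table with learn.Output. The predict_linear_model helper and the "Predicted" column name are illustrative, not the exact code used later in the series.

# Sketch: write GPU model predictions back into a Deephaven table.
# predict_linear_model and the "Predicted" column name are illustrative.
def predict_linear_model(features):
    return linear_regression_gpu.predict(features)

test_predictions = learn.learn(
    table = test_table,
    model_func = predict_linear_model,
    inputs = [learn.Input(["Price1", "Price2", "Price3"], table_to_numpy)],
    outputs = [learn.Output("Predicted", scatter, "double")],
    batch_size = test_table.size
)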

Sneak peek

In this blog, we built and trained a linear regression model using the GPU with Nvidia RAPIDS. In the next blog, we'll test the models on a simulated real-time data feed. Here's a sneak peek at what the output of this model looks like in real time.
