How do I extract data from a Deephaven table?
How can I extract data from a Deephaven table?
Deephaven's tables are a powerful tool for data analysis - but sometimes queries require data to be analyzed outside of a table. There are several ways to extract arrays and values from Deephaven tables and convert them into more traditional Python objects.
NumPy
There are two ways to extract data from Deephaven into NumPy: deephaven.numpy
and deephaven.learn
.
When converting from tables to NumPy arrays, it's recommended to use table operations to only copy the data necessary. Copying substantially large arrays to and from Deephaven and NumPy can be memory intensive and slow.
deephaven.numpy
The Deephaven Python module has a NumPy submodule that allows users to transfer table data to and from NumPy arrays with ease. The example below creates a table, extracts a column to NumPy, applies some arithmetic, and then converts that back to a second table.
from deephaven import numpy as dhnp
from deephaven import empty_table
import numpy as np
source = empty_table(20).update(formulas=["Y = randomInt(1, 100)"])
np_y = dhnp.to_numpy(source.view(formulas=["Y"]))
np_y2 = np.sort(a=np_y, axis=0)
result = dhnp.to_table(np_array=np_y2, cols=["Y2"])
- source
- result
deephaven.learn
deephaven.learn
facilitates the transfer of data to and from Deephaven tables and NumPy arrays in both static and real-time workflows. Its main application is for machine learning, but can be used to perform analysis that is difficult to implement with standard table operations. The example below smooths a noisy signal using NumPy and deephaven.learn
.
from deephaven.plot.figure import Figure
from deephaven.learn import gather
from deephaven import empty_table
from deephaven import learn
import numpy as np
source = empty_table(25).update(["X = 0.2 * i", "Y = 2 * sin(X) + randomDouble(-1, 1)"])
source_plot = Figure().plot_xy(series_name="Noisy data", t=source, x="X", y="Y").show()
def table_to_numpy_double(rows, columns):
return gather.table_to_numpy_2d(rows, columns, np_type=np.double)
def numpy_to_table(data, index):
return data[index]
def polynomial_fit(data):
x = data[:, 0]
y = data[:, 1]
poly_order = 3
z = np.polyfit(x, y, poly_order)
p = np.poly1d(z)
return p(x)
result = learn.learn(
table=source,
model_func=polynomial_fit,
inputs=[learn.Input(col_names=["X", "Y"], gather_func=table_to_numpy_double)],
outputs=[
learn.Output(col_name="Z", scatter_func=numpy_to_table, col_type="double")
],
batch_size=source.size,
)
result_plot = (
Figure().plot_xy(series_name="Smoothed data", t=result, x="X", y="Z").show()
)
- source
- result
pandas
A pandas DataFrame is similar to a Deephaven table - a column-oriented tabular data structure. So, of course, there are ways to convert to and from pandas and Deephaven. Deephaven's pandas module makes it easy.
from deephaven import pandas as dhpd
from deephaven import empty_table
import pandas as pd
source = empty_table(20).update(["X = i", "Y = randomInt(1, 100)"])
df_source = dhpd.to_pandas(table=source)
df_source = df_source[::-1]
result = dhpd.to_table(df=df_source)
- source
- result
These FAQ pages contain answers to questions about Deephaven Community Core that our users have asked in our Community Slack. If you have a question that is not in our documentation, join our Community and we'll be happy to help!