Casting
Casting is performed in formulas to convert one data type to another. It is used in Deephaven's query language to ensure the correctness of column data types.
Usage
Scalars
For scalar values, a (type) cast converts a value from one type handled by Java to another. The numeric types are:
- byte
- short
- int
- long
- float
- double
Deephaven column expressions are Java expressions with some extra features. The rules of casting and numbers are consistent with Java.
Each number type uses a fixed number of bytes to store a value, which limits the range and precision of the values it can hold. Choose the data type that fits the data in your application.
Type | Bytes | Description | Example (negative) | Example (positive) |
---|---|---|---|---|
byte | 1 | signed whole numbers | -123 | 123 |
short | 2 | signed whole numbers | -30,000 | 30,000 |
int | 4 | signed whole numbers | -2,634,123 | 2,634,123 |
long | 8 | signed whole numbers | -8,293,193,496 | 8,293,193,496 |
float | 4 | signed floating point numbers | -8,293,193,496.2948293 | 8,293,193,496.2948293 |
double | 8 | signed floating point numbers | -64,123,542,927,328,293,193,496.2948293231 | 64,123,542,927,328,293,193,496.2948293231 |
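To make the link between byte count and range concrete, here is a minimal plain-Python sketch (not Deephaven code). It derives the range of each signed integer type from its width, assuming two's-complement storage as Java uses, and round-trips the float example value from the table through 4-byte and 8-byte IEEE 754 encodings with the standard struct module. The helper names signed_range and round_trip are made up for illustration.

```python
import struct

# Byte widths of the integer types, taken from the table above.
INTEGER_BYTES = {"byte": 1, "short": 2, "int": 4, "long": 8}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of this width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def round_trip(value, fmt):
    """Encode a Python float with struct format fmt and decode it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


for name, width in INTEGER_BYTES.items():
    low, high = signed_range(width)
    print(f"{name:>5}: {low} to {high}")

# A 4-byte float cannot hold this value exactly, so it comes back rounded.
# An 8-byte double reproduces the Python float, which is itself a double.
value = 8293193496.2948293
as_float = round_trip(value, "<f")
as_double = round_trip(value, "<d")
print("float :", as_float)
print("double:", as_double)
```

The floating-point types trade exactness for range: the float result differs from the input, which is worth keeping in mind when choosing between float and double.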
Strings
Any column that contains scalar or string values can be cast to a String. The java.lang prefix is not necessary: write (String) rather than (java.lang.String).
Lists and Arrays
Python and Java arrays share similarities, but they are not interchangeable. The query language has a much easier time dealing with Java arrays than it does with Python sequences (lists, NumPy arrays, etc.). Thus, when a Python function returns one of these types, it's typically worthwhile to convert the output to an array type that Deephaven's query engine handles better.
Deephaven's dtypes module contains many utilities for converting Python arrays to Java arrays.
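As a loose analogy in plain Python (this is not the Deephaven API), the standard-library array module shows the difference between a list, which can hold arbitrary objects, and a typed array that holds values of a single primitive type. The variable names are made up for illustration.

```python
from array import array

# A Python list can hold objects of any type.
mixed = [0.5, "text", None]

# A typed array holds values of one primitive type only;
# type code "d" means double-precision floats.
doubles = array("d", [0.25, 0.5, 0.75])
print(doubles.typecode, doubles.itemsize, list(doubles))

# Adding a non-numeric element is rejected.
try:
    doubles.append("text")
except TypeError as err:
    print("rejected:", err)
```

A query engine that knows every element is a double can work with the data directly, which is the motivation for converting Python sequences before they reach a query string.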
Example
Widening conversion
When an operation on a number produces a result that needs a wider type, the result is widened automatically.
In the following example, column A in the source table holds the row index i cast to a long. Columns B and C apply operations to A whose results need more precision than an integer type provides, so both new columns are automatically cast to doubles.
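from deephaven import empty_table

# A is a long; B and C come from operations that yield doubles
source = empty_table(10).update(
    formulas=["A = (long)i", "B = A * sqrt(2)", "C = A / 2"]
)

# The meta table lists each column's data type
result = source.meta_table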
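The widening in the example above resembles Java's binary numeric promotion, in which the wider of the two operand types determines the result type. The following plain-Python sketch models that ordering. It is an illustration rather than Deephaven code, and the names RANK and promoted_type are hypothetical. It covers only the standard Java rule: the example above reports C = A / 2 as a double, so division in the query language appears to be handled differently.

```python
# Java's numeric primitive types, ordered from narrowest to widest.
RANK = ["byte", "short", "int", "long", "float", "double"]


def promoted_type(left, right):
    """Result type of a Java binary arithmetic operation on two numeric types.

    Java promotes both operands to the wider of the two types, and never
    to anything narrower than int.
    """
    widest = max(RANK.index(left), RANK.index(right), RANK.index("int"))
    return RANK[widest]


print(promoted_type("long", "double"))  # like A * sqrt(2): long times double
print(promoted_type("byte", "short"))   # small integer types are computed as int
print(promoted_type("int", "long"))
```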
Manually casting
When writing queries, you may need to narrow a number to a smaller type. The following example casts numbers to types that use fewer bytes. Because narrowing discards the bytes that do not fit in the target type, the result is spurious if the value needs more bytes than the target type provides.
The tables produced by the example below show the minimum and maximum values for each data type.
Values at the boundary of a number type's range might be interpreted unexpectedly, for example as null or infinity. If your data comes close to these boundaries, use a type that allows for more storage.
from deephaven import new_table
from deephaven.column import double_col

# Largest value of each type, stored as doubles and then cast to each type
numbers_max = new_table(
    [
        double_col(
            "MaxNumbers",
            [
                (2 - 1 / (2**52)) * (2**1023),
                (2 - 1 / (2**23)) * (2**127),
                (2**63) - 1,
                (2**31) - 1,
                (2**15) - 1,
                (2**7) - 1,
            ],
        )
    ]
).view(
    formulas=[
        "DoubleMax = (double)MaxNumbers",
        "FloatMax = (float)MaxNumbers",
        "LongMax = (long)MaxNumbers",
        "IntMax = (int)MaxNumbers",
        "ShortMax = (short)MaxNumbers",
        "ByteMax = (byte)MaxNumbers",
    ]
)

# Smallest value of each type, stored as doubles and then cast to each type
numbers_min = new_table(
    [
        double_col(
            "MinNumbers",
            [
                1 / (2**1074),
                1 / (2**149),
                -(2**63) + 513,
                -(2**31) + 2,
                -1 * (2**15) + 1,
                -(2**7) + 1,
            ],
        )
    ]
).view(
    formulas=[
        "DoubleMin = (double)MinNumbers",
        "FloatMin = (float)MinNumbers",
        "LongMin = (long)MinNumbers",
        "IntMin = (int)MinNumbers",
        "ShortMin = (short)MinNumbers",
        "ByteMin = (byte)MinNumbers",
    ]
)

numbers_min_meta = numbers_min.meta_table.view(formulas=["Name", "DataType"])
numbers_max_meta = numbers_max.meta_table.view(formulas=["Name", "DataType"])
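The effect of discarding bytes can be reproduced in plain Python. The sketch below emulates Java's integer narrowing rule, which keeps only the low-order bits of the value and reads them as a signed two's-complement number. It is not Deephaven code, the function name narrow is made up for illustration, and it covers integer sources only: casts from floating-point values, as in the example above, follow different rules in Java.

```python
def narrow(value, bits):
    """Keep the low `bits` bits of an integer and read them as signed."""
    mask = (1 << bits) - 1
    low = value & mask                 # discard the high-order bits
    sign_bit = 1 << (bits - 1)
    return low - (1 << bits) if low & sign_bit else low


# A value that fits in the target type is unchanged.
print(narrow(127, 8))        # 127
# One past the maximum wraps around to the minimum.
print(narrow(128, 8))        # -128
print(narrow(32768, 16))     # -32768
print(narrow(2**31, 32))     # -2147483648
# A larger value keeps only its low byte: 300 is 0x12C, and 0x2C is 44.
print(narrow(300, 8))        # 44
```

The wrapped results look like valid numbers, which is why out-of-range narrowing is easy to miss in query output.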
Casting strings
Sometimes you must explicitly cast objects to a String so that update operations interpret the query correctly. In the following example, the elements of the Python list colors are cast with (String).
from deephaven import empty_table
from deephaven import agg

colors = ["Red", "Blue", "Green"]

formulas = [
    "X = 0.1 * i",
    "Y1 = Math.pow(X, 2)",
    "Y2 = Math.sin(X)",
    "Y3 = Math.cos(X)",
]

# (String) casts each element taken from the Python list colors
grouping_cols = ["Letter = (i % 2 == 0) ? `A` : `B`", "Color = (String)colors[i % 3]"]

source = empty_table(40).update(formulas + grouping_cols)

myagg = [
    agg.formula(
        formula="avg(k)",
        formula_param="k",
        cols=[f"AvgY{idx} = Y{idx}" for idx in range(1, 4)],
    )
]

result = source.agg_by(aggs=myagg, by=["Letter", "Color"])
Casting arrays
The code below uses a Python function to create a table with 5 rows and 1 column named X. X contains arrays of randomly generated double-precision numbers between 0 and 1. Without any casting, the data type of the X column is org.jpy.PyObject, because the Deephaven engine is told nothing about the Python values returned. This is a safe choice for the engine, but the data type is not usable for many table operations, including ungrouping.
from deephaven import empty_table
import random


def create_list(length):
    return_arr = [None] * length
    for idx in range(length):
        return_arr[idx] = random.random()
    return return_arr


source = empty_table(5).update(["X = create_list(3)"])
source_meta = source.meta_table
Trying to cast the output of create_list to a Java double array directly will fail.
from deephaven import empty_table
import random


def create_list(length):
    return_arr = [None] * length
    for idx in range(length):
        return_arr[idx] = random.random()
    return return_arr


# Fails: the function returns a Python list, not a Java double array
source = empty_table(5).update(["X = (double[])create_list(3)"])
deephaven.dtypes can be used to convert the output of the function to a Java double array. This enables the double array cast (double[]) in the query string.
from deephaven import dtypes as dht
from deephaven import empty_table
import random


def create_list(length):
    return_arr = [None] * length
    for idx in range(length):
        return_arr[idx] = random.random()
    # Convert the Python list to a Java double array before returning it
    return dht.array(dtype=dht.double, seq=return_arr)


source = empty_table(5).update(["X = (double[])create_list(3)"])
source_meta = source.meta_table
The source table can now be ungrouped.
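Conceptually, ungrouping expands each array cell into one row per element. The plain-Python sketch below illustrates the idea on a list of lists standing in for column X. It is not Deephaven code, and the variable names are made up for illustration.

```python
import random

random.seed(0)  # fixed seed so repeated runs produce the same values

# Stand-in for column X: 5 rows, each holding an array of 3 doubles.
grouped = [[random.random() for _ in range(3)] for _ in range(5)]

# Ungrouping: one output row per array element, in row order.
ungrouped = [value for row in grouped for value in row]

print(len(grouped), "rows before,", len(ungrouped), "rows after")
```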