# How to handle PyObjects in tables

This guide covers how to deal with `org.jpy.PyObject` columns in tables. For brevity, these columns are called `PyObject`s for the remainder of this guide.

A `PyObject` is an artifact of jpy, the bi-directional Python-Java bridge that connects Deephaven's Python API to its Java backend. For more background on Deephaven data types (Python, Java, and jpy), see the related Deephaven documentation.

`PyObject` columns should typically be avoided, as they almost always cause downstream errors in queries and degrade performance. This guide presents strategies to avoid creating `PyObject` columns.

## What is a PyObject?

A `PyObject` is a generic Java object that holds a Python object of some kind. The engine uses it when it hasn't been told enough about the type of data returned by a Python function or other Python process. It is the safe fallback: a `PyObject` can hold any arbitrary Python object, such as a list, dictionary, int, or float. Unfortunately, that flexibility comes at the cost of compatibility and speed.
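What "told enough about the type" means can be seen from plain Python, without a Deephaven server. The sketch below uses the standard library's `typing.get_type_hints` to show what type information a function exposes to a caller. The function names are hypothetical, and this is only an analogy for the information available to the engine, not a description of how the engine is implemented.

```python
from typing import get_type_hints


def double_untyped(val):
    # No return annotation: a caller cannot know the result type in advance.
    return val * 2.0


def double_typed(val) -> float:
    # The return annotation declares the result type up front.
    return val * 2.0


def declared_return_type(func):
    """Return the annotated return type of func, or None if it has none."""
    return get_type_hints(func).get("return")


print(declared_return_type(double_untyped))  # None
print(declared_return_type(double_typed))    # <class 'float'>
```

Both functions compute the same value; only the second one says what it will return before it is called.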

### Limitations

The code below uses `empty_table` to produce a table with three columns:

- `X` is 0.1 times the row index, `i`.
- `SinX` is created using the built-in `sin` function.
- `NumpySinX` is created using NumPy's `sin` function.

As a result, `X` and `SinX` are `double` columns, and `NumpySinX` is a `PyObject` column.

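```python
from deephaven import empty_table
import numpy as np

# sin is the query language built-in; np.sin is called through Python
source = empty_table(10).update(
    ["X = 0.1 * i", "SinX = sin(X)", "NumpySinX = np.sin(X)"]
)
# The meta table lists each column's data type
source_meta = source.meta_table
```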

This seems fine at first. But what happens if we try to calculate the difference between the `SinX` and `NumpySinX` columns?

```python
result = source.update(["Difference = SinX - NumpySinX"])
```

The code raises an exception with a message like `Cannot find method plus(int, org.jpy.PyObject)`; the exact method name and argument types depend on the operator and operand types in the expression. Java has no arithmetic operator that can combine a primitive value with a `PyObject`, so it makes sense that this doesn't work. As stated before, a `PyObject` is so generic that it can hold any Python data type. If it holds a dictionary, what is the correct way to subtract a dictionary from a number? There isn't one. This limitation extends to far more than arithmetic on numeric values: `PyObject` columns are incompatible with a wide range of operations.
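Python itself has the same ambiguity, which can be checked without Deephaven. The sketch below is a plain-Python analogy: `try_subtract` is a hypothetical helper that attempts a subtraction and reports the failure instead of raising.

```python
def try_subtract(left, right):
    """Return left - right, or a description of the TypeError if it fails."""
    try:
        return left - right
    except TypeError as exc:
        return f"TypeError: {exc}"


# Two numbers subtract as expected.
print(try_subtract(0.5, 0.25))

# A number and a dictionary have no defined difference.
print(try_subtract(0.5, {"a": 1}))
```

The second call fails for the same reason the query does: once one operand can be anything, there is no single operation to dispatch to.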

Thankfully, the built-in `sin` function is always available. For operations where no built-in method exists, a typecast or a type hint can do the trick. In the example below, the `TypecastSinX` and `TypehintSinX` columns use those, respectively:

```python
from deephaven import empty_table
import numpy as np


# The return type hint tells the engine the result is a double
def np_sin_typehint(val) -> np.double:
    return np.sin(val)


source = empty_table(10).update(
    [
        "X = 0.1 * i",
        "SinX = sin(X)",
        "TypecastSinX = (double)np.sin(X)",  # explicit cast in the query string
        "TypehintSinX = np_sin_typehint(X)",
    ]
)
source_meta = source.meta_table
```

`source_meta` shows that all four columns in `source` are now `double` columns.

The rest of this guide shows how to avoid creating `PyObject` columns in your queries.

## Scalar columns

The previous example showed how a column of `PyObject` scalar values (e.g., integers and decimal numbers) can affect queries. There are three ways to avoid creating `PyObject` columns full of scalar values.

### Built-in query language methods

Deephaven's query language has a large number of built-in methods that can be used in place of Python functions.

```python
from deephaven import empty_table
import numpy as np

# np.sin yields a PyObject column; the built-in sin yields a double column
source = empty_table(10).update(
    ["X = 0.2 * i", "Y_PyObject = np.sin(X)", "Y_Double = sin(X)"]
)
source_meta = source.meta_table
```

### Python type hints

If the query language doesn't have a function to perform a specific operation, a Python type hint will cast the result to the proper type. We recommend using NumPy data types over Python built-in types for type hints, as they have a one-to-one translation to the Java primitives Deephaven tables typically use.

```python
from deephaven import empty_table
import numpy as np


# Modified Bessel function of the first kind, order 0, hinted as a double
def bessel(value) -> np.double:
    return np.i0(value)


source = empty_table(10).update(["X = i", "Y = bessel(X)"])
source_meta = source.meta_table
```

As of Deephaven Community Core v0.32.0, type hints in functions must match the data type they receive, or an error will result. See Community Questions for more information.
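One way to see why a fixed-width NumPy hint is less ambiguous than Python's built-in `int` is to compare value ranges. Python's `int` is arbitrary precision, while Java's `long` is a 64-bit signed integer, so no single Java primitive can represent every Python `int`. The sketch below checks this in plain Python; `fits_in_java_long` is a hypothetical helper written for illustration and is not part of Deephaven.

```python
# Bounds of Java's 64-bit signed long (Long.MIN_VALUE and Long.MAX_VALUE).
JAVA_LONG_MIN = -(2**63)
JAVA_LONG_MAX = 2**63 - 1


def fits_in_java_long(value: int) -> bool:
    """Return True if value is representable as a Java long."""
    return JAVA_LONG_MIN <= value <= JAVA_LONG_MAX


print(fits_in_java_long(10**18))  # True
print(fits_in_java_long(10**19))  # False
```

A hint such as `np.double` names a fixed width up front, which is what allows it to line up with exactly one Java primitive.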

### Type casts

If all else fails, an explicit typecast can be performed in the query string.

```python
from deephaven import empty_table
import numpy as np


# No return type hint, so calling this in a query yields a PyObject column
def bessel(value):
    return np.i0(value)


source = empty_table(10).update(
    ["X = i", "Y_PyObject = bessel(X)", "Y_TypeCast = (double)Y_PyObject"]
)
source_meta = source.meta_table
```

## String columns

Python functions that return string values can lead to `PyObject` columns.

### Python type hints

Type hints work the same way for strings as they do for scalar columns. Python's built-in `str` type works well as the type hint.

```python
from deephaven import empty_table


# The str return hint produces a String column
def str_from_num(value) -> str:
    if value == 1:
        return "one"
    elif value == 2:
        return "two"
    elif value == 3:
        return "three"
    else:
        return "Out Of Range"


source = empty_table(10).update(["X = i", "Y = str_from_num(X)"])
source_meta = source.meta_table
```

As of Deephaven Community Core v0.32.0, type hints in functions must match the data type they receive, or an error will result. See Community Questions for more information.
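Python does not enforce return annotations at runtime, so a hint is only trustworthy if the function really returns that type on every code path. The sketch below is a plain-Python check that could be run before using a function in a query. `returns_declared_type` is a hypothetical helper, not a Deephaven API, and it assumes the annotation is a plain class such as `str`.

```python
from typing import get_type_hints


def str_from_num(value) -> str:
    # Dictionary lookup with a default covers every input with a string.
    names = {1: "one", 2: "two", 3: "three"}
    return names.get(value, "Out Of Range")


def returns_declared_type(func, samples) -> bool:
    """Check that func returns its annotated type for every sample input.

    Assumes the return annotation is a plain class usable with isinstance.
    """
    declared = get_type_hints(func)["return"]
    return all(isinstance(func(sample), declared) for sample in samples)


print(returns_declared_type(str_from_num, range(10)))  # True
```

A function that falls through without a `return` on some branch would return `None` there and fail this check.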

### Type casts

An explicit type cast in the query string works as well. You can use the abbreviated `String` or the full name `java.lang.String` to the same effect.

```python
from deephaven import empty_table


# No return type hint; the query string casts the result instead
def str_from_num(value):
    if value == 1:
        return "one"
    elif value == 2:
        return "two"
    elif value == 3:
        return "three"
    else:
        return "Out Of Range"


source = empty_table(10).update(["X = i", "Y = (String)str_from_num(X)"])
source_meta = source.meta_table
```

## Array columns

Type hints using `typing` and `numpy.typing` are the best and most flexible way to handle arrays of data. Alternatively, Python functions can use jpy directly to return a Java array, but the query string must cast the result to the appropriate array type.

`PyObject` columns that store arrays of data can be a bit trickier to deal with than scalar and string columns. Thankfully, Python modules like `numpy.typing` and `typing` allow type hints to be used to return array columns of the desired type. When jpy is invoked directly to return a Java array, the query string casts the result, as with `(int[])` in the example below.

```python
from deephaven import empty_table
from numpy import typing as npt
import numpy as np
import typing
import jpy


# No type hint: results in a PyObject column
def return_py_array(idx):
    return [idx, idx + 1]


# Returns a Java int array built with jpy; cast to int[] in the query string
def return_j_array(idx):
    return jpy.array("int", [idx, idx + 1])


# Type hint from the typing module
def array_typing(idx) -> typing.List[np.intc]:
    return [idx, idx + 1]


# Type hint from numpy.typing
def numpy_arr_typing(idx) -> npt.NDArray[np.intc]:
    return np.array([idx, idx + 1])


source = empty_table(10).update(
    [
        "PyObj = return_py_array(i)",
        "IntArrFromJpy = (int[])return_j_array(i)",
        "IntArrFromTyping = array_typing(i)",
        "IntArrFromNumPy = numpy_arr_typing(i)",
    ]
)
source_meta = source.meta_table
```
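An array type hint carries two pieces of information: the container and the element type. The sketch below uses only the standard library's `typing` helpers (`get_type_hints`, `get_origin`, `get_args`, available in Python 3.8 and later) to pull both out of a hint. It uses the built-in `int` as the element type so that it runs without NumPy or a Deephaven server, and it is an illustration of what a hint encodes, not of how the engine reads it.

```python
from typing import List, get_args, get_origin, get_type_hints


def return_py_array(idx):
    # No annotation: nothing is known about the result ahead of time.
    return [idx, idx + 1]


def array_typing(idx) -> List[int]:
    # The annotation names both the container (list) and the element type (int).
    return [idx, idx + 1]


hint = get_type_hints(array_typing)["return"]
container = get_origin(hint)     # the container class
element_types = get_args(hint)   # a tuple of element types

print(container, element_types)
print(get_type_hints(return_py_array))  # {} because there are no annotations
```

Both functions return the same list; only the annotated one states its element type in advance.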