Deephaven's column types
Deephaven tables store data in strongly-typed columns. Each column in a table must have a specific type that determines what values it can hold, how much memory it uses, and how operations behave on that column's data. Understanding column types is essential for designing efficient table schemas, writing correct queries, and optimizing performance. This guide covers the column types available in Deephaven tables, how to choose the right type for your data, and how types behave in table operations.
Column type overview
Deephaven table columns support a rich type system built on Java's type system. The main column type categories are:
| Type category | Examples | Nullable | Memory efficient | Common use cases |
|---|---|---|---|---|
| Primitive numeric | byte, short, int, long, float, double | ✅ (special null values) | ✅ | Numeric calculations, counters, measurements |
| Primitive boolean/char | boolean, char | ✅ (special null values) | ✅ | Flags, single characters |
| Temporal | Instant, ZonedDateTime, LocalDate, LocalTime | ✅ | ✅ | Timestamps, dates, times, durations |
| String | String | ✅ | ⚠️ (depends on cardinality) | Text data, identifiers, categories |
| Object | BigDecimal, BigInteger, custom classes | ✅ | ❌ | High-precision math, complex data structures |
| Array | int[], double[], String[] | ✅ (array itself or elements) | ⚠️ | Lists of values, vectors, time series |
The rest of this guide explores each of these column types, their properties, and how to use them effectively in your tables.
Type fundamentals
Understanding the fundamental types — numeric primitives, booleans, characters, and strings — is essential for working effectively with Deephaven tables.
Primitive numeric types
Primitive numeric types are the most memory-efficient and performant types in Deephaven. They map directly to Java primitives and are stored unboxed in memory.
Numeric type reference
| Type | Size | Range | Null value | Example use case |
|---|---|---|---|---|
byte | 8-bit | -128 to 127 | NULL_BYTE | Small integers, status codes |
short | 16-bit | -32,768 to 32,767 | NULL_SHORT | Medium integers, year values |
int | 32-bit | -2³¹ to 2³¹-1 | NULL_INT | Standard integers, counters |
long | 64-bit | -2⁶³ to 2⁶³-1 | NULL_LONG | Large integers, IDs, nanosecond timestamps |
float | 32-bit | ~±3.4×10³⁸ (7 digits precision) | NULL_FLOAT | Approximate decimals, measurements |
double | 64-bit | ~±1.7×10³⁰⁸ (15 digits precision) | NULL_DOUBLE | High-precision decimals, financial data |
Creating numeric columns
from deephaven import empty_table
# Create columns of various numeric types
t1 = empty_table(5).update(
[
"ByteCol = (byte)(i % 100)",
"ShortCol = (short)(i * 1000)",
"IntCol = (int)(i * 1000000)",
"LongCol = (long)(i * 1000000000)",
"FloatCol = (float)(i * 1.5)",
"DoubleCol = i * 3.14159",
]
)
Null values for primitives
Unlike Java primitives, Deephaven's primitive columns can represent null values using special sentinel values:
from deephaven import empty_table
from deephaven.constants import NULL_INT, NULL_DOUBLE
# Create columns with null values
t2 = empty_table(5).update(
[
"Value = i",
"WithNulls = i % 2 == 0 ? i : NULL_INT",
"Price = i % 2 == 0 ? i * 10.5 : NULL_DOUBLE",
]
)
You can check for nulls using isNull() or comparison with the null constant:
# Filter for non-null values
t3 = t2.where("!isNull(WithNulls)")
# Alternatively, compare with null constant
t4 = t2.where("WithNulls != NULL_INT")
Type promotion and arithmetic
When performing arithmetic with mixed numeric types, Deephaven follows Java's type promotion rules:
from deephaven import empty_table
t5 = empty_table(3).update(
[
"IntValue = (int)i + 1",
"LongValue = (long)i * 1000",
# int + long promotes to long
"MixedIntLong = IntValue + LongValue",
# int * double promotes to double
"MixedIntDouble = IntValue * 3.14",
# Division with integers performs integer division
"IntDiv = 7 / 2", # Result: 3
# Use double to get floating-point division
"DoubleDiv = 7.0 / 2.0", # Result: 3.5
]
)
Primitive boolean and char
Boolean type
The boolean type represents true/false values:
from deephaven import empty_table
t6 = empty_table(5).update(["Value = i", "IsEven = i % 2 == 0", "IsPositive = i > 2"])
Char type
The char type represents a single 16-bit Unicode character:
from deephaven import empty_table
t7 = empty_table(5).update(["Index = i", "Letter = (char)('A' + i)"])
Note
char is distinct from String. A char column holds single characters, while a String column holds character sequences.
Time and dates
Deephaven provides robust support for date and time types, all based on Java 8's java.time package. These types are optimized for efficient storage and time-based operations, making them ideal for time-series data and temporal analysis.
Temporal type reference
| Type | Description | Null support | Example value |
|---|---|---|---|
Instant | Point in time (UTC) | ✅ | 2025-01-15T10:30:45.123456789Z |
ZonedDateTime | Date-time with time zone | ✅ | 2025-01-15T10:30:45-05:00[America/New_York] |
LocalDate | Date without time or zone | ✅ | 2025-01-15 |
LocalTime | Time without date or zone | ✅ | 10:30:45.123 |
LocalDateTime | Date-time without zone | ✅ | 2025-01-15T10:30:45 |
Duration | Time-based duration | ✅ | PT5H30M (5 hours, 30 minutes) |
Period | Date-based period | ✅ | P1Y2M3D (1 year, 2 months, 3 days) |
Working with Instant
Instant is the most commonly used temporal type, representing an instantaneous point on the timeline in UTC:
from deephaven import empty_table
t8 = empty_table(5).update(
[
"Timestamp = now() + i * SECOND",
"HoursLater = Timestamp + 5 * HOUR",
"DaysBefore = Timestamp - 2 * DAY",
]
)
Working with ZonedDateTime
Use ZonedDateTime when time zone information is important:
from deephaven import empty_table
t9 = empty_table(3).update(
[
"UTC = now()",
"NewYork = toZonedDateTime(UTC, timeZone(`America/New_York`))",
"Tokyo = toZonedDateTime(UTC, timeZone(`Asia/Tokyo`))",
"HourOfDay = hourOfDay(NewYork, false)",
]
)
Working with LocalDate and LocalTime
LocalDate and LocalTime are useful for date-only or time-only operations:
from deephaven import empty_table
t10 = empty_table(5).update(
[
"Timestamp = now() + i * DAY",
"DateOnly = toLocalDate(Timestamp, timeZone(`America/New_York`))",
"TimeOnly = toLocalTime(Timestamp, timeZone(`America/New_York`))",
"DayOfWeekValue = dayOfWeekValue(DateOnly)",
"IsWeekend = DayOfWeekValue == 6 || DayOfWeekValue == 7",
]
)
Temporal arithmetic
Deephaven provides constants and functions for temporal calculations:
from deephaven import empty_table
t11 = empty_table(3).update(
[
"Now = now()",
# Duration constants: SECOND, MINUTE, HOUR, DAY, WEEK
"OneMinuteLater = Now + 1 * MINUTE",
"TwoHoursEarlier = Now - 2 * HOUR",
"ThreeDaysLater = Now + 3 * DAY",
# Calculate differences
"TimeDiff = Now - (Now - 5 * HOUR)",
]
)
Time zone considerations
Important
Always be explicit about time zones when converting between Instant and zone-aware types. Implicit conversions can lead to subtle bugs, especially around daylight saving time transitions.
from deephaven import empty_table
t12 = empty_table(2).update(
[
"UTC_Time = now()",
# Explicit time zone conversion
"NY_Time = toZonedDateTime(UTC_Time, timeZone(`America/New_York`))",
"London_Time = toZonedDateTime(UTC_Time, timeZone(`Europe/London`))",
# Extract components with time zone awareness
"NY_Hour = hourOfDay(UTC_Time, timeZone(`America/New_York`), false)",
"London_Hour = hourOfDay(UTC_Time, timeZone(`Europe/London`), false)",
]
)
String type
The String type stores text data. Deephaven automatically interns strings to optimize memory usage for low-cardinality string columns.
Creating string columns
from deephaven import empty_table
t13 = empty_table(5).update(
[
"Index = i",
"Letter = `A` + i", # String concatenation
"Formatted = `` + i + ` items`",
"Conditional = i % 2 == 0 ? `Even` : `Odd`",
]
)
String operations
String columns in Deephaven are java.lang.String objects, which means you can use standard Java String methods directly:
from deephaven import empty_table
t14 = empty_table(3).update(
[
"Text = i == 0 ? `Hello World` : (i == 1 ? `DEEPHAVEN` : ` spaces `)",
"Length = Text.length()",
"Upper = Text.toUpperCase()",
"Lower = Text.toLowerCase()",
"Trimmed = Text.trim()",
"Substring = Text.substring(0, 5)",
"Contains = Text.contains(`e`)",
]
)
String interning and memory
Deephaven automatically interns strings, which means identical string values share the same memory location. This is very efficient for low-cardinality columns (like categories or symbols) but less beneficial for high-cardinality data (like unique IDs or free-form text).
from deephaven import empty_table
# Low cardinality: memory efficient (only 3 unique strings stored)
t15 = empty_table(1000).update(
["Status = i % 3 == 0 ? `Active` : (i % 3 == 1 ? `Pending` : `Closed`)"]
)
# High cardinality: less efficient (many unique strings)
t16 = empty_table(1000).update(
[
"UniqueId = `ID-` + i" # 1000 unique strings
]
)
Null strings
Strings can be null, which is different from empty strings:
from deephaven import empty_table
t17 = empty_table(5).update(
[
"Index = i",
"WithNull = i % 2 == 0 ? `Valid` : (String)null",
"EmptyString = i % 2 == 0 ? `Valid` : ``",
"IsNull = isNull(WithNull)",
"IsEmpty = WithNull == ``", # This checks for empty, not null
]
)
Advanced types
For specialized use cases, Deephaven supports object types and arrays that provide flexibility beyond primitive types.
Object types
Object types can store any Java object. Common examples include BigDecimal, BigInteger, and custom classes.
Object column considerations
- Performance: Object columns are slower than primitive columns due to boxing/unboxing overhead.
- Memory: Objects require more memory (object header + data).
- Null handling: Objects can be
null(Java null, not a sentinel value). - Use cases: High-precision arithmetic (
BigDecimal), custom data structures, complex types.
Note
For high-precision decimal arithmetic, use java.math.BigDecimal instead of double to avoid floating-point errors. However, BigDecimal operations are slower than primitive operations.
Array types
Array columns store arrays as values, allowing each cell to contain a list of items.
Creating array columns
from deephaven import empty_table
t20 = empty_table(3).update(
[
"Index = i",
# Create arrays of primitives
"IntArray = new int[]{i, i+1, i+2}",
"DoubleArray = new double[]{i * 1.5, i * 2.5, i * 3.5}",
"StringArray = new String[]{`A` + i, `B` + i, `C` + i}",
]
)
Working with array columns
Access array elements and properties:
t21 = t20.update(
[
"ArrayLength = IntArray.length",
"FirstElement = IntArray[0]",
"LastElement = IntArray[IntArray.length - 1]",
"SecondDoubleValue = DoubleArray[1]",
]
)
Array null handling
Arrays themselves can be null, and array elements can also be null (for object arrays):
from deephaven import empty_table
t22 = empty_table(5).update(
[
"Index = i",
# Null array
"NullArray = i % 2 == 0 ? new int[]{i, i+1} : (int[])null",
"IsArrayNull = isNull(NullArray)",
# Array with null elements (only for object arrays)
"StringArrayWithNulls = new String[]{i == 0 ? (String)null : `Value` + i}",
]
)
Array operations
Deephaven provides functions for array manipulation:
from deephaven import empty_table
t23 = empty_table(3).update(
[
"Values = new int[]{i, i * 2, i * 3, i * 4, i * 5}",
"Sum = sum(Values)",
"Average = avg(Values)",
"Min = min(Values)",
"Max = max(Values)",
"Count = Values.length",
]
)
Type conversions and casting
Explicit casting
Use explicit casts when you need to convert between numeric types:
from deephaven import empty_table
t24 = empty_table(5).update(
[
"DoubleValue = i * 3.7",
# Explicit casts to narrow types
"AsInt = (int)DoubleValue",
"AsLong = (long)DoubleValue",
"AsByte = (byte)DoubleValue",
# Widening (no cast needed, but allowed)
"BackToDouble = (double)AsInt",
]
)
Parsing strings to numbers
Convert strings to numeric types:
from deephaven import empty_table
t25 = empty_table(3).update(
[
"StringNum = `` + (i * 100)",
"AsInt = parseInt(StringNum)",
"AsLong = parseLong(StringNum)",
"AsDouble = parseDouble(StringNum + `.5`)",
]
)
Formatting numbers as strings
Convert numbers to strings:
from deephaven import empty_table
t26 = empty_table(3).update(
[
"Value = i * 1234.567",
# Simple string conversion
"AsString = `` + Value",
# Formatted string (using Java's String.format)
"Formatted = String.format(`%.2f`, Value)",
]
)
Handling conversion errors
Always validate input before converting to avoid runtime errors. Use conditional logic to handle cases where conversion might fail:
from deephaven import empty_table
from deephaven.constants import NULL_INT
t27 = empty_table(5).update(
[
"StringValue = i % 2 == 0 ? `` + i : `invalid`",
# Use conditional logic to handle values that can't be parsed
"SafeConversion = i % 2 == 0 ? parseInt(StringValue) : NULL_INT",
]
)
Using types effectively
Understanding how types behave in table operations and following best practices ensures optimal performance and maintainability.
Aggregations and types
Different aggregation operations have type-specific behavior:
from deephaven import empty_table
source = empty_table(100).update(
[
"Group = i % 3 == 0 ? `A` : (i % 3 == 1 ? `B` : `C`)",
"IntValue = (int)i",
"DoubleValue = i * 1.5",
]
)
# Aggregations preserve or promote types
from deephaven import agg
t28 = source.agg_by(
[
agg.sum_(["SumInt = IntValue"]), # Returns long
agg.sum_(["SumDouble = DoubleValue"]), # Returns double
agg.avg(["AvgInt = IntValue"]), # Returns double
agg.count_("CountRows"), # Returns long
],
by="Group",
)
Joins and type matching
Joins require matching types in key columns:
from deephaven import empty_table
left = empty_table(5).update(["IntKey = (int)i", "LeftValue = i * 10"])
right = empty_table(5).update(["IntKey = (int)i", "RightValue = i * 100"])
# This works - types match
t29 = left.natural_join(right, on="IntKey")
Type mismatches will cause errors:
from deephaven import empty_table
left = empty_table(5).update("Key = (int)i")
right = empty_table(5).update("Key = (long)i")
# This would fail - int vs long mismatch
# t_join = left.natural_join(right, on="Key")
Cast to matching types when necessary:
from deephaven import empty_table
left = empty_table(5).update(["Key = (int)i", "LeftValue = i * 10"])
right = empty_table(5).update(["LongKey = (long)i", "RightValue = i * 100"])
# Cast to matching type
t30 = left.update("LongKey = (long)Key").natural_join(right, on="LongKey")
Sorting and type behavior
Sorting behavior varies by type:
from deephaven import empty_table
from deephaven.constants import NULL_INT
t31 = empty_table(10).update(
[
"Number = (int)(10 - i)",
"Text = `` + (char)('Z' - i)",
"Timestamp = now() - i * HOUR",
"NullableInt = i % 3 == 0 ? NULL_INT : (int)i",
]
)
# Numeric: ascending by value
sorted_numeric = t31.sort("Number")
# String: lexicographic order
sorted_string = t31.sort("Text")
# Temporal: chronological order
sorted_time = t31.sort("Timestamp")
# Nulls: appear first by default in ascending sort
sorted_with_nulls = t31.sort("NullableInt")
Choose the right type
- Integers: Use the smallest type that fits your range (
byte<short<int<long). - Decimals: Use
doublefor most cases,BigDecimalonly when exact precision is required. - Timestamps: Use
Instantfor UTC timestamps,ZonedDateTimewhen time zones matter. - Text: Use
Stringfor text data. Consider enum patterns for low-cardinality categories. - Arrays: Use when each row needs multiple related values. Consider separate columns if querying individual elements frequently.
Optimize for memory
- Prefer primitive types over objects (e.g.,
intoverInteger,doubleoverBigDecimal). - Use appropriate numeric precision (don't use
longwhenintsuffices). - Be cautious with high-cardinality strings and object columns.
- Consider string interning benefits for categorical data.
Ensure type safety
- Always validate and handle nulls explicitly.
- Use explicit casts when converting between types to make intent clear.
- Match types in join keys and comparisons.
- Test edge cases (nulls, extreme values, type boundaries).
Performance considerations
| Operation | Fast | Slow |
|---|---|---|
| Primitive arithmetic | ✅ int, long, double | ❌ BigDecimal |
| String operations | ✅ Low-cardinality strings | ❌ High-cardinality strings |
| Null checks | ✅ Primitive nulls (NULL_INT) | ❌ Complex null checking logic |
| Aggregations | ✅ Numeric primitives | ❌ Complex objects |
| Memory usage | ✅ Primitives, interned strings | ❌ Objects, large arrays |