Deephaven's column types
Deephaven tables store data in strongly-typed columns. Each column in a table must have a specific type that determines what values it can hold, how much memory it uses, and how operations behave on that column's data. Understanding column types is essential for designing efficient table schemas, writing correct queries, and optimizing performance. This guide covers the column types available in Deephaven tables, how to choose the right type for your data, and how types behave in table operations.
Column type overview
Deephaven table columns support a rich type system built on Java's type system. The main column type categories are:
| Type category | Examples | Nullable | Memory efficient | Common use cases |
|---|---|---|---|---|
| Primitive numeric | byte, short, int, long, float, double | ✅ (special null values) | ✅ | Numeric calculations, counters, measurements |
| Primitive boolean/char | boolean, char | ✅ (special null values) | ✅ | Flags, single characters |
| Temporal | Instant, ZonedDateTime, LocalDate, LocalTime | ✅ | ✅ | Timestamps, dates, times, durations |
| String | String | ✅ | ⚠️ (depends on cardinality) | Text data, identifiers, categories |
| Object | BigDecimal, BigInteger, custom classes | ✅ | ❌ | High-precision math, complex data structures |
| Array | int[], double[], String[] | ✅ (array itself or elements) | ⚠️ | Lists of values, vectors, time series |
The rest of this guide explores each of these column types, their properties, and how to use them effectively in your tables.
Type fundamentals
The fundamental types (numeric primitives, booleans, characters, and strings) appear in almost every Deephaven table, so it pays to know how each one stores values and represents nulls.
Primitive numeric types
Primitive numeric types are the most memory-efficient and performant types in Deephaven. They map directly to Java primitives and are stored unboxed in memory.
Numeric type reference
| Type | Size | Range | Null value | Example use case |
|---|---|---|---|---|
| byte | 8-bit | -128 to 127 | NULL_BYTE | Small integers, status codes |
| short | 16-bit | -32,768 to 32,767 | NULL_SHORT | Medium integers, year values |
| int | 32-bit | -2³¹ to 2³¹-1 | NULL_INT | Standard integers, counters |
| long | 64-bit | -2⁶³ to 2⁶³-1 | NULL_LONG | Large integers, IDs, nanosecond timestamps |
| float | 32-bit | ~±3.4×10³⁸ (7 digits precision) | NULL_FLOAT | Approximate decimals, measurements |
| double | 64-bit | ~±1.7×10³⁰⁸ (15 digits precision) | NULL_DOUBLE | High-precision decimals, financial data |
Creating numeric columns
Null values for primitives
Unlike Java primitives, Deephaven's primitive columns can represent null values using special sentinel values:
You can check for nulls using isNull() or comparison with the null constant:
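As a rough illustration of the sentinel idea, here is a plain-Java sketch. The constant values are an assumption, believed to mirror io.deephaven.util.QueryConstants (for example, NULL_INT as Integer.MIN_VALUE), and should be checked against your Deephaven version. The class SentinelNulls and its helper methods are hypothetical names used only for this example.

```java
// Illustrative sketch of sentinel-based nulls; not Deephaven source code.
final class SentinelNulls {
    // ASSUMPTION: these values mirror io.deephaven.util.QueryConstants.
    static final int NULL_INT = Integer.MIN_VALUE;
    static final long NULL_LONG = Long.MIN_VALUE;
    static final double NULL_DOUBLE = -Double.MAX_VALUE;

    static boolean isNullInt(int value) {
        return value == NULL_INT;
    }

    static boolean isNullDouble(double value) {
        return value == NULL_DOUBLE;
    }

    // Sums a column while skipping null sentinels rather than adding them.
    static long sumIgnoringNulls(int[] column) {
        long total = 0;
        for (int value : column) {
            if (!isNullInt(value)) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[] column = {10, NULL_INT, 32};
        System.out.println(isNullInt(column[1]));     // prints true
        System.out.println(sumIgnoringNulls(column)); // prints 42
    }
}
```

One consequence of this design is that the sentinel occupies one value of each type's range, so that value cannot be stored as ordinary data.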
Type promotion and arithmetic
When performing arithmetic with mixed numeric types, Deephaven follows Java's type promotion rules:
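Java's promotion rules can be observed directly in plain Java. The sketch below boxes each result so that the chosen result type becomes visible; TypePromotion and typeOf are hypothetical names for this example, and no Deephaven API is involved.

```java
final class TypePromotion {
    // Boxing the result exposes the static type Java chose for the expression.
    static String typeOf(Object boxed) {
        return boxed.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        byte b = 100;
        int i = 7;
        long l = 2L;
        float f = 1.5f;
        double d = 2.0;

        System.out.println(typeOf(b + b)); // prints Integer: byte operands are promoted to int
        System.out.println(typeOf(i + l)); // prints Long
        System.out.println(typeOf(i + f)); // prints Float
        System.out.println(typeOf(l * d)); // prints Double
        System.out.println(i / 2);         // prints 3: int / int truncates toward zero
        System.out.println(i / d);         // prints 3.5: the int is widened to double
    }
}
```

The integer-division case is the one most likely to surprise: dividing two integer columns truncates unless one operand is cast to a floating-point type first.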
Primitive boolean and char
Boolean type
The boolean type represents true/false values:
Char type
The char type represents a single 16-bit Unicode character:
Note
char is distinct from String. A char column holds single characters, while a String column holds character sequences.
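The difference is easy to see in plain Java. In this sketch, CharVersusString and its methods are hypothetical helpers; it only illustrates Java semantics and does not call any Deephaven API.

```java
final class CharVersusString {
    // A char is a 16-bit numeric code unit, so arithmetic on it yields an int
    // that must be cast back to char.
    static char nextChar(char c) {
        return (char) (c + 1);
    }

    // Extracts the first character of a String, with a fallback for null or empty input.
    static char firstCharOrDefault(String s, char fallback) {
        return (s == null || s.isEmpty()) ? fallback : s.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(nextChar('A'));                    // prints B
        System.out.println(firstCharOrDefault("NYSE", '?'));  // prints N
        System.out.println(firstCharOrDefault("", '?'));      // prints ?
        System.out.println("A".equals(String.valueOf('A')));  // prints true
    }
}
```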
Time and dates
Deephaven provides robust support for date and time types, all based on Java 8's java.time package. These types are optimized for efficient storage and time-based operations, making them ideal for time-series data and temporal analysis.
Temporal type reference
| Type | Description | Null support | Example value |
|---|---|---|---|
| Instant | Point in time (UTC) | ✅ | 2025-01-15T10:30:45.123456789Z |
| ZonedDateTime | Date-time with time zone | ✅ | 2025-01-15T10:30:45-05:00[America/New_York] |
| LocalDate | Date without time or zone | ✅ | 2025-01-15 |
| LocalTime | Time without date or zone | ✅ | 10:30:45.123 |
| LocalDateTime | Date-time without zone | ✅ | 2025-01-15T10:30:45 |
| Duration | Time-based duration | ✅ | PT5H30M (5 hours, 30 minutes) |
| Period | Date-based period | ✅ | P1Y2M3D (1 year, 2 months, 3 days) |
Working with Instant
Instant is the most commonly used temporal type, representing an instantaneous point on the timeline in UTC:
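Because these are java.time types, their behavior can be explored in plain Java. The following sketch uses only the JDK; InstantBasics and epochNanos are hypothetical names for this example, and the conversion to a nanosecond count is shown as one possible way to relate an Instant to a long timestamp.

```java
import java.time.Duration;
import java.time.Instant;

final class InstantBasics {
    // Converts an Instant to nanoseconds since the epoch, failing loudly on overflow.
    static long epochNanos(Instant t) {
        return Math.addExact(
                Math.multiplyExact(t.getEpochSecond(), 1_000_000_000L),
                (long) t.getNano());
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2025-01-15T10:30:45.123456789Z");
        Instant later = t.plus(Duration.ofMinutes(90));

        System.out.println(t.getNano());       // prints 123456789
        System.out.println(later);             // prints 2025-01-15T12:00:45.123456789Z
        System.out.println(t.isBefore(later)); // prints true
        System.out.println(epochNanos(Instant.ofEpochSecond(1, 5))); // prints 1000000005
    }
}
```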
Working with ZonedDateTime
Use ZonedDateTime when time zone information is important:
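A short plain-Java sketch shows how one Instant maps to different wall-clock times in different zones. ZonedView and inZone are hypothetical names; only JDK classes are used.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class ZonedView {
    // Views the same point in time through the given time zone.
    static ZonedDateTime inZone(Instant t, String zoneId) {
        return t.atZone(ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2025-01-15T10:30:45Z");
        ZonedDateTime newYork = inZone(t, "America/New_York");
        ZonedDateTime tokyo = inZone(t, "Asia/Tokyo");

        System.out.println(newYork.getHour()); // prints 5 (UTC-5 in January)
        System.out.println(tokyo.getHour());   // prints 19 (UTC+9)
        // Both values still describe the same instant on the timeline.
        System.out.println(newYork.toInstant().equals(tokyo.toInstant())); // prints true
    }
}
```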
Working with LocalDate and LocalTime
LocalDate and LocalTime are useful for date-only or time-only operations:
Temporal arithmetic
Deephaven provides constants and functions for temporal calculations:
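Deephaven's own constants and functions are not reproduced here. As a hedged stand-in, the sketch below shows the equivalent calculations with the underlying java.time classes; TemporalArithmetic and its methods are hypothetical names for this example.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.Period;

final class TemporalArithmetic {
    // Elapsed whole minutes between two points in time.
    static long minutesBetween(Instant start, Instant end) {
        return Duration.between(start, end).toMinutes();
    }

    // Shifts a calendar date by a date-based period (years, then months, then days).
    static LocalDate shift(LocalDate date, Period period) {
        return date.plus(period);
    }

    public static void main(String[] args) {
        System.out.println(Duration.parse("PT5H30M").toMinutes()); // prints 330
        System.out.println(minutesBetween(
                Instant.parse("2025-01-15T10:00:00Z"),
                Instant.parse("2025-01-15T12:30:00Z")));           // prints 150
        System.out.println(shift(
                LocalDate.parse("2025-01-15"),
                Period.parse("P1Y2M3D")));                         // prints 2026-03-18
    }
}
```

Duration measures exact elapsed time, while Period follows the calendar, which is why the two are kept as separate types.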
Time zone considerations
Important
Always be explicit about time zones when converting between Instant and zone-aware types. Implicit conversions can lead to subtle bugs, especially around daylight saving time transitions.
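The daylight saving pitfall can be reproduced in plain Java. This sketch assumes the JDK's time zone data reflects the 2025 US spring-forward transition on March 9; DstPitfall and its method are hypothetical names.

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class DstPitfall {
    static final ZoneId NEW_YORK = ZoneId.of("America/New_York");

    // Real elapsed hours when one calendar day is added in the value's own zone.
    static long elapsedHoursForOneCalendarDay(ZonedDateTime start) {
        return Duration.between(start, start.plusDays(1)).toHours();
    }

    public static void main(String[] args) {
        ZonedDateTime ordinary = ZonedDateTime.of(2025, 1, 15, 12, 0, 0, 0, NEW_YORK);
        ZonedDateTime beforeSpringForward = ZonedDateTime.of(2025, 3, 8, 12, 0, 0, 0, NEW_YORK);

        System.out.println(elapsedHoursForOneCalendarDay(ordinary));            // prints 24
        System.out.println(elapsedHoursForOneCalendarDay(beforeSpringForward)); // prints 23
        // Adding 24 elapsed hours instead lands one hour later on the wall clock.
        System.out.println(beforeSpringForward.plus(Duration.ofHours(24)).getHour()); // prints 13
    }
}
```

"One day later" and "24 hours later" are different questions near a transition, so the zone and the unit both need to be stated explicitly.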
String type
The String type stores text data. Deephaven automatically interns strings to optimize memory usage for low-cardinality string columns.
Creating string columns
String operations
String columns in Deephaven are java.lang.String objects, which means you can use standard Java String methods directly:
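The methods in question are the ordinary java.lang.String methods. The sketch below exercises a few of them in plain Java; StringOps and normalizeSymbol are hypothetical names, and the sample values are made up for illustration.

```java
import java.util.Locale;

final class StringOps {
    // Trims whitespace and upper-cases in a locale-independent way.
    static String normalizeSymbol(String raw) {
        return raw.trim().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String symbol = normalizeSymbol("  aapl ");

        System.out.println(symbol);                       // prints AAPL
        System.out.println(symbol.length());              // prints 4
        System.out.println(symbol.startsWith("AA"));      // prints true
        System.out.println("AAPL.US".substring(0, 4));    // prints AAPL
        System.out.println("AAPL.US".contains("."));      // prints true
    }
}
```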
String interning and memory
Because Deephaven interns strings automatically, identical string values share a single object in memory. This is very efficient for low-cardinality columns (like categories or symbols) but offers little benefit for high-cardinality data (like unique IDs or free-form text), where few values repeat.
Null strings
Strings can be null, which is different from empty strings:
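A minimal plain-Java sketch of the distinction follows. NullVersusEmpty and its helpers are hypothetical names; the point is that calling a method on a null String fails, so null has to be checked first.

```java
final class NullVersusEmpty {
    // True for both a missing value (null) and a present but empty value ("").
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Length of the string, or the fallback when the value is missing.
    static int lengthOrDefault(String s, int fallback) {
        return s == null ? fallback : s.length();
    }

    public static void main(String[] args) {
        String missing = null;
        String empty = "";

        System.out.println(missing == null);              // prints true
        System.out.println(empty == null);                // prints false
        System.out.println(isNullOrEmpty(missing));       // prints true
        System.out.println(isNullOrEmpty(empty));         // prints true
        System.out.println(lengthOrDefault(missing, -1)); // prints -1
        System.out.println(lengthOrDefault(empty, -1));   // prints 0
    }
}
```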
Advanced types
For specialized use cases, Deephaven supports object types and arrays that provide flexibility beyond primitive types.
Object types
Object types can store any Java object. Common examples include BigDecimal, BigInteger, and custom classes.
Object column considerations
- Performance: Object columns are slower than primitive columns due to boxing/unboxing overhead.
- Memory: Objects require more memory (object header + data).
- Null handling: Objects can be null (Java null, not a sentinel value).
- Use cases: High-precision arithmetic (BigDecimal), custom data structures, complex types.
Note
For high-precision decimal arithmetic, use java.math.BigDecimal instead of double to avoid floating-point errors. However, BigDecimal operations are slower than primitive operations.
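The floating-point error mentioned in the note is easy to reproduce in plain Java. In this sketch, ExactDecimals and exactSum are hypothetical names; the BigDecimal values are built from strings so that no binary rounding happens before the arithmetic.

```java
import java.math.BigDecimal;

final class ExactDecimals {
    // Adds two decimal literals exactly by parsing them as BigDecimal.
    static BigDecimal exactSum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);        // prints 0.30000000000000004
        System.out.println(0.1 + 0.2 == 0.3); // prints false

        BigDecimal sum = exactSum("0.1", "0.2");
        System.out.println(sum);                                   // prints 0.3
        System.out.println(sum.equals(new BigDecimal("0.3")));     // prints true
        // equals() also compares scale, so use compareTo() for numeric equality.
        System.out.println(sum.equals(new BigDecimal("0.30")));         // prints false
        System.out.println(sum.compareTo(new BigDecimal("0.30")) == 0); // prints true
    }
}
```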
Array types
Array columns store arrays as values, allowing each cell to contain a list of items.
Creating array columns
Working with array columns
Access array elements and properties:
Array null handling
Arrays themselves can be null, and array elements can also be null (for object arrays):
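The two kinds of null can be sketched with ordinary Java arrays. ArrayCells and its methods are hypothetical names, and the sample data is invented; the sketch only illustrates the Java array semantics that array columns build on.

```java
final class ArrayCells {
    // Length of an array cell, or -1 when the cell itself is null.
    static int lengthOrMinusOne(double[] cell) {
        return cell == null ? -1 : cell.length;
    }

    // Counts null elements; only object arrays can hold Java nulls.
    static int countNullElements(String[] cell) {
        if (cell == null) {
            return 0;
        }
        int count = 0;
        for (String element : cell) {
            if (element == null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[] prices = {101.5, 102.0, 99.75};
        String[] tags = {"open", null, "close"};

        System.out.println(lengthOrMinusOne(prices));   // prints 3
        System.out.println(prices[prices.length - 1]);  // prints 99.75
        System.out.println(lengthOrMinusOne(null));     // prints -1
        System.out.println(countNullElements(tags));    // prints 1
    }
}
```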
Array operations
Deephaven provides functions for array manipulation:
Type conversions and casting
Explicit casting
Use explicit casts when you need to convert between numeric types:
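Java casts silently truncate or wrap when the target type is too narrow, which is worth seeing once. The plain-Java sketch below uses hypothetical names (ExplicitCasts, truncate, fitsInInt) and only JDK calls.

```java
final class ExplicitCasts {
    // Casting double to int drops the fractional part (rounds toward zero).
    static int truncate(double value) {
        return (int) value;
    }

    // Reports whether a long can be narrowed to int without losing information.
    static boolean fitsInInt(long value) {
        try {
            Math.toIntExact(value);
            return true;
        } catch (ArithmeticException overflow) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.99));          // prints 3
        System.out.println(truncate(-3.99));         // prints -3
        System.out.println((byte) 200);              // prints -56: narrowing wraps around
        System.out.println((int) 3_000_000_000L);    // prints -1294967296
        System.out.println(fitsInInt(3_000_000_000L)); // prints false
        System.out.println(Math.round(3.99));        // prints 4: round instead of truncate
    }
}
```

Checking the range before narrowing, as fitsInInt does, turns a silent wrap-around into an explicit decision.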
Parsing strings to numbers
Convert strings to numeric types:
Formatting numbers as strings
Convert numbers to strings:
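A few standard JDK calls cover most formatting needs. This is a plain-Java sketch with hypothetical names (NumberFormatting, fixed2, grouped); the locale is passed explicitly so the output does not depend on the machine's settings.

```java
import java.util.Locale;

final class NumberFormatting {
    // Fixed two decimal places, independent of the default locale.
    static String fixed2(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    // Thousands separators in US style.
    static String grouped(long value) {
        return String.format(Locale.US, "%,d", value);
    }

    public static void main(String[] args) {
        System.out.println(String.valueOf(42));       // prints 42
        System.out.println(fixed2(3.14159));          // prints 3.14
        System.out.println(grouped(1234567L));        // prints 1,234,567
        System.out.println(Integer.toString(255, 16)); // prints ff
    }
}
```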
Handling conversion errors
Always validate input before converting to avoid runtime errors. Use conditional logic to handle cases where conversion might fail:
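One way to structure that conditional logic is a parse helper that never throws. The sketch below is plain Java with hypothetical names (SafeParsing, tryParseInt, parseIntOrDefault); the fallback value is the caller's choice and could be a null sentinel.

```java
import java.util.OptionalInt;

final class SafeParsing {
    // Returns the parsed value, or empty when the text is null or not a valid int.
    static OptionalInt tryParseInt(String text) {
        if (text == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(text.trim()));
        } catch (NumberFormatException invalid) {
            return OptionalInt.empty();
        }
    }

    // Convenience wrapper that substitutes a fallback for unparseable input.
    static int parseIntOrDefault(String text, int fallback) {
        return tryParseInt(text).orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(parseIntOrDefault("42", -1));          // prints 42
        System.out.println(parseIntOrDefault(" 7 ", -1));         // prints 7
        System.out.println(parseIntOrDefault("12.5", -1));        // prints -1
        System.out.println(parseIntOrDefault("abc", -1));         // prints -1
        System.out.println(parseIntOrDefault(null, -1));          // prints -1
        System.out.println(parseIntOrDefault("99999999999", -1)); // prints -1: out of int range
    }
}
```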
Using types effectively
Column types affect how aggregations, joins, and sorts behave. The guidelines in this section help keep queries correct, fast, and memory-efficient.
Aggregations and types
Different aggregation operations have type-specific behavior:
Joins and type matching
Joins require matching types in key columns:
Type mismatches will cause errors:
Cast to matching types when necessary:
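As an analogy for why key types must match, consider a plain Java hash lookup: an int key and a long key with the same numeric value are different objects once boxed. This sketch is not Deephaven's join implementation, and KeyTypeMismatch and lookup are hypothetical names; in Java the mismatch shows up as a silent miss rather than an error.

```java
import java.util.HashMap;
import java.util.Map;

final class KeyTypeMismatch {
    // A lookup table keyed by long IDs (boxed as java.lang.Long).
    static final Map<Object, String> NAMES_BY_LONG_ID = new HashMap<>();

    static {
        NAMES_BY_LONG_ID.put(1L, "alice");
    }

    static String lookup(Object key) {
        return NAMES_BY_LONG_ID.get(key);
    }

    public static void main(String[] args) {
        int intKey = 1;

        System.out.println(lookup(intKey));        // prints null: Integer 1 is not Long 1
        System.out.println(lookup((long) intKey)); // prints alice: cast makes the types match
        System.out.println(Integer.valueOf(1).equals(Long.valueOf(1L))); // prints false
    }
}
```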
Sorting and type behavior
Sorting behavior varies by type:
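The most common surprise is numbers stored as strings. The plain-Java sketch below uses Java's natural ordering only; Deephaven's own sort, including where it places nulls, is not shown here and may differ. SortByType and its methods are hypothetical names.

```java
import java.util.Arrays;

final class SortByType {
    // Numeric sort: values are compared by magnitude.
    static int[] sortedInts(int... values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // String sort: values are compared character by character.
    static String[] sortedStrings(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedInts(10, 9, 2)));          // prints [2, 9, 10]
        System.out.println(Arrays.toString(sortedStrings("10", "9", "2"))); // prints [10, 2, 9]
        // Upper-case letters sort before lower-case ones in natural String order.
        System.out.println(Arrays.toString(sortedStrings("apple", "Banana"))); // prints [Banana, apple]
    }
}
```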
Choose the right type
- Integers: Use the smallest type that fits your range (byte < short < int < long).
- Decimals: Use double for most cases, BigDecimal only when exact precision is required.
- Timestamps: Use Instant for UTC timestamps, ZonedDateTime when time zones matter.
- Text: Use String for text data. Consider enum patterns for low-cardinality categories.
- Arrays: Use when each row needs multiple related values. Consider separate columns if querying individual elements frequently.
Optimize for memory
- Prefer primitive types over objects (e.g., int over Integer, double over BigDecimal).
- Use appropriate numeric precision (don't use long when int suffices).
- Be cautious with high-cardinality strings and object columns.
- Consider string interning benefits for categorical data.
Ensure type safety
- Always validate and handle nulls explicitly.
- Use explicit casts when converting between types to make intent clear.
- Match types in join keys and comparisons.
- Test edge cases (nulls, extreme values, type boundaries).
Performance considerations
| Operation | Fast | Slow |
|---|---|---|
| Primitive arithmetic | ✅ int, long, double | ❌ BigDecimal |
| String operations | ✅ Low-cardinality strings | ❌ High-cardinality strings |
| Null checks | ✅ Primitive nulls (NULL_INT) | ❌ Complex null checking logic |
| Aggregations | ✅ Numeric primitives | ❌ Complex objects |
| Memory usage | ✅ Primitives, interned strings | ❌ Objects, large arrays |