
snapshot_when

snapshot_when produces an in-memory copy of a source table and takes a new snapshot each time another table, the trigger table, changes.

note

The trigger table is often a time table, a special type of table that adds new rows at a regular, user-defined interval.

caution

When snapshot_when stores table history, it stores a copy of the source table for every trigger event. This means large source tables or rapidly changing trigger tables can result in large memory usage.
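
To get a feel for how quickly a history table can grow, the arithmetic can be sketched in plain Python. This is a back-of-the-envelope model rather than Deephaven code: history_row_count is a hypothetical helper, and it assumes each trigger event appends one full copy of the source table as it stands at that moment.

```python
def history_row_count(source_sizes_at_triggers):
    """Total rows in a history result, given the source row count at each trigger event.

    Assumption: every trigger event appends one full copy of the source table.
    """
    return sum(source_sizes_at_triggers)


# A static 1,000-row source, snapshotted by 60 trigger events.
static_total = history_row_count([1_000] * 60)

# A source that gains 5 rows between trigger events (5, 10, 15, ... rows), over 60 events.
growing_total = history_row_count([5 * k for k in range(1, 61)])

print(static_total, growing_total)
```

With a static source the total grows linearly with the number of trigger events; with a growing source it grows roughly quadratically, which is why the caution above matters most for ticking sources.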

Syntax

source.snapshot_when(
    trigger_table: Union[Table, PartitionedTableProxy],
    stamp_cols: Union[str, list[str]] = None,
    initial: bool = False,
    incremental: bool = False,
    history: bool = False,
) -> PartitionedTableProxy

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| trigger_table | Union[Table, PartitionedTableProxy] | The table that triggers the snapshot. This should be a ticking table, since changes in this table trigger each snapshot. |
| stamp_cols (optional) | Union[str, list[str]] | One or more column names to act as stamp columns. Each stamp column is included in the result and contains the value of that column from the trigger table at the time of the snapshot. A single column can be given as a string or a list; multiple columns must be given as a list. The default value is None, which means that all columns from the trigger table are appended to the source table. |
| initial (optional) | bool | Determines whether an initial snapshot is taken upon construction. The default value is False. |
| incremental (optional) | bool | Determines whether the resulting table should be incremental. The default value is False. When False, the stamp column in the resulting table always contains the latest value from the stamp column in the trigger table, so every row in the resulting table is updated each cycle. When True, only rows that have been added to or updated in the source table since the last snapshot receive the latest value from the stamp column. |
| history (optional) | bool | Determines whether the resulting table should keep history. The default value is False. When True, a full snapshot of the source table and the stamp column is appended to the resulting table every time the trigger table changes, so the resulting table can grow very quickly. When False, only rows from the source table that have changed since the last snapshot are appended to the resulting table. If history is True, incremental and initial must be False. |

caution

The stamp columns from the trigger table appear in the result table. If the source table has a column with the same name as a stamp column, an error is raised. To avoid this problem, rename the stamp column in the trigger table using rename_columns.
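
The conflict rule above amounts to an intersection of column names. A minimal sketch in plain Python, where stamp_conflicts is a hypothetical helper (not part of the Deephaven API) and the column lists are typed by hand rather than read from real tables:

```python
def stamp_conflicts(source_columns, stamp_columns):
    """Return the stamp column names that already exist in the source, in stamp order."""
    taken = set(source_columns)
    return [name for name in stamp_columns if name in taken]


# Hand-written column names: two time tables both start with a Timestamp column.
source_columns = ["Timestamp", "X"]
trigger_columns = ["Timestamp", "Y"]

# Using every trigger column as a stamp collides on Timestamp.
all_columns_conflicts = stamp_conflicts(source_columns, trigger_columns)

# Stamping only Y, or renaming Timestamp in the trigger, leaves no overlap.
y_only_conflicts = stamp_conflicts(source_columns, ["Y"])
renamed_conflicts = stamp_conflicts(source_columns, ["TriggerTimestamp", "Y"])

print(all_columns_conflicts, y_only_conflicts, renamed_conflicts)
```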

Returns

A new table that captures a snapshot of the source table whenever the trigger table updates.

Examples

In the following example, the source table updates once every second. The trigger table updates once every five seconds. Thus, the result table updates once every five seconds. The Timestamp column in the trigger is renamed to avoid a name conflict error.

from deephaven import time_table

source = time_table("PT1S").update_view(["X = i"])
trigger = (
    time_table("PT5S")
    .rename_columns(["TriggerTimestamp = Timestamp"])
    .update_view(["Y = Math.sin(0.1 * i)"])
)
result = source.snapshot_when(trigger_table=trigger)


Notice three things:

  1. stamp_cols is omitted, so every column from trigger is included in result.
  2. incremental is False, so the entire TriggerTimestamp column in result is updated every cycle and always contains the latest value from the TriggerTimestamp column in trigger.
  3. history is False, so only updated rows from source get appended to result on each snapshot.
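
For intuition, the default behavior (incremental and history both False) can be modeled in a few lines of plain Python. This is a toy model, not how Deephaven implements snapshots: the snapshot function and the row dictionaries are illustrative assumptions, and the stamp values are simple integers.

```python
def snapshot(source_rows, trigger_row):
    """Copy every current source row and attach the trigger row's columns as stamps."""
    return [{**row, **trigger_row} for row in source_rows]


source_rows = [{"X": 0}, {"X": 1}, {"X": 2}]

# First trigger event: the result is a stamped copy of the source.
result = snapshot(source_rows, {"Y": 0})

# The source grows, then the trigger fires again. The result now reflects the
# current source, and every row carries the newest stamp.
source_rows.append({"X": 3})
result = snapshot(source_rows, {"Y": 1})

print(result)
```

In this model no row keeps the stamp from the first trigger event, which mirrors the second point above.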

In the following example, the code is nearly identical to the previous one. However, in this case, the Y column is given as the only stamp column. Thus, the Timestamp column in the trigger table is omitted from the result table, which avoids a name conflict error. This is an alternative to renaming the column in the trigger table.

from deephaven import time_table

source = time_table("PT1S").update_view(["X = i"])
trigger = time_table("PT5S").update_view(["Y = i"])
result = source.snapshot_when(trigger_table=trigger, stamp_cols=["Y"])


In the following example, history is set to True. Therefore, every row in source gets snapshotted and appended to result when trigger changes, regardless of whether source has changed or not.

from deephaven import time_table

source = time_table("PT1S").update_view(["X = i"])
trigger = time_table("PT5S").update_view(["Y = i"])
result = source.snapshot_when(trigger_table=trigger, history=True)

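
The history behavior can be pictured with a small plain-Python model. It is a sketch for intuition only: snapshot_with_history is a hypothetical function, rows are plain dictionaries, and nothing here calls Deephaven.

```python
def snapshot_with_history(history, source_rows, trigger_row):
    """Append a stamped copy of every current source row to the running history."""
    history.extend({**row, **trigger_row} for row in source_rows)
    return history


history = []
source_rows = [{"X": 0}, {"X": 1}]

snapshot_with_history(history, source_rows, {"Y": 0})  # first trigger event
snapshot_with_history(history, source_rows, {"Y": 1})  # source unchanged, still copied

print(history)
```

Even though the source did not change between the two trigger events, the history holds both rows twice, once per stamp value.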

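In the following example, incremental is set to True. Thus, the Y column in result only updates for rows that have been added or changed in source since the last snapshot; rows that were already captured keep the stamp value they received earlier. Contrast this with the first and second examples given above.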

from deephaven import time_table

source = time_table("PT1S").update_view(["X = i"])
trigger = time_table("PT5S").update_view(["Y = i"])
result = source.snapshot_when(trigger_table=trigger, stamp_cols=["Y"], incremental=True)

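
The incremental behavior can also be modeled in plain Python. This is a minimal sketch under stated assumptions, not Deephaven's implementation: incremental_snapshot is a hypothetical function, rows are matched by position, and row removals are not modeled.

```python
def incremental_snapshot(result, source_rows, stamp):
    """Restamp only rows that are new or changed since the previous snapshot.

    Assumptions: rows are matched by position, and rows are never removed.
    """
    updated = []
    for i, row in enumerate(source_rows):
        old = result[i] if i < len(result) else None
        if old is not None and all(old[col] == val for col, val in row.items()):
            updated.append(old)               # unchanged: keep the earlier stamp
        else:
            updated.append({**row, **stamp})  # new or changed: take the latest stamp
    return updated


source_rows = [{"X": 0}, {"X": 1}]
result = incremental_snapshot([], source_rows, {"Y": 0})

source_rows.append({"X": 2})                  # only this row is new
result = incremental_snapshot(result, source_rows, {"Y": 1})

print(result)
```

In this model the first two rows keep Y = 0 and only the newly added row receives Y = 1, whereas the non-incremental default would restamp all three rows.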