EmStd
EmStd creates an EM Std (exponential moving standard deviation) for an updateBy table operation. The formula for an EM Std is:
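$a = e^{-1/\tau}$

$\bar{x}_n = a\,\bar{x}_{n-1} + (1 - a)\,x_n$

$\sigma_n^2 = a\left(\sigma_{n-1}^2 + (1 - a)(x_n - \bar{x}_{n-1})^2\right)$

$\sigma_n = \sqrt{\sigma_n^2}$

Where:
- $\tau$ is the window size, an input parameter to the method.
- $a$ is the decay factor derived from the window size.
- $\bar{x}$ is the exponential moving average of the input, which the variance update depends on.
- $\sigma$ is the EM Std.
- $x_n$ is the current value.
- $n$ denotes the step. The current step is $n$, and the previous step is $n-1$.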
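To make the row-by-row behavior concrete, here is a minimal sketch in plain Groovy that applies a tick-based exponential moving standard deviation to an ordinary list. It is not the Deephaven implementation: the function name `emStdTicks`, the recurrence written in the comments, and the choice to seed the average with the first value and the variance with zero are all assumptions made for illustration.

```groovy
// Sketch only: a tick-based EM Std over a plain list, independent of Deephaven.
// Assumed recurrence (see lead-in):
//   a        = exp(-1 / tickDecay)
//   variance = a * (prevVariance + (1 - a) * (x - prevEma)^2)
//   ema      = a * prevEma + (1 - a) * x
//   std      = sqrt(variance)
List<Double> emStdTicks(List<? extends Number> values, double tickDecay) {
    double a = Math.exp(-1.0d / tickDecay)
    double ema = 0.0d
    double variance = 0.0d
    boolean first = true
    List<Double> out = []
    for (Number v : values) {
        double x = v.doubleValue()
        if (first) {
            // Assumption: seed the average with the first value and the variance with 0.
            ema = x
            variance = 0.0d
            first = false
        } else {
            // The variance update uses the previous average, so compute it before updating ema.
            double diff = x - ema
            variance = a * (variance + (1.0d - a) * diff * diff)
            ema = a * ema + (1.0d - a) * x
        }
        out << Math.sqrt(variance)
    }
    return out
}

// A constant series never deviates from its average; a step change produces a nonzero value.
println emStdTicks([4, 4, 4, 4], 5.0d)
println emStdTicks([0, 0, 10, 10, 10], 5.0d)
```

A larger `tickDecay` pushes `a` closer to 1, so older rows keep more weight and the result reacts more slowly to new values.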
Syntax
EmStd(tickDecay, pairs...)
EmStd(control, tickDecay, pairs...)
EmStd(control, timestampColumn, timeDecay, pairs...)
EmStd(control, timestampColumn, durationDecay, pairs...)
EmStd(timestampColumn, timeDecay, pairs...)
EmStd(timestampColumn, durationDecay, pairs...)
Parameters
Parameter | Type | Description |
---|---|---|
tickDecay | double | The decay rate in ticks (rows). |
pairs | String... | The input/output column name pairs. |
control | OperationControl | Defines how special cases should behave. If not given, default OperationControl settings are used. |
timestampColumn | String | The column in the source table to use for timestamps. |
timeDecay | long | The decay rate in nanoseconds. |
durationDecay | Duration | The decay rate as a Duration. |
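The tick and time decay parameters differ in what drives the weighting: rows in one case, elapsed time in the other. The plain-Groovy sketch below compares the two decay factors. The helper names `tickFactor` and `timeFactor` are hypothetical, the time-based factor `exp(-dt / timeDecay)` is assumed by analogy with the tick-based one, and treating a `Duration` as its nanosecond count is likewise an assumption rather than something this page states.

```groovy
import java.time.Duration

// Sketch only: decay factor applied per row when decaying by ticks.
double tickFactor(double tickDecay) {
    return Math.exp(-1.0d / tickDecay)
}

// Sketch only: decay factor applied between two rows when decaying by time.
// dtNanos is the gap between consecutive timestamps; timeDecayNanos plays the role of timeDecay.
double timeFactor(long dtNanos, long timeDecayNanos) {
    return Math.exp(-((double) dtNanos) / timeDecayNanos)
}

// Assumption: a durationDecay of 5 seconds corresponds to a timeDecay of 5 * 10^9 nanoseconds.
long fiveSeconds = Duration.ofSeconds(5).toNanos()

println tickFactor(5.0d)                                           // one row, 5-row decay
println timeFactor(Duration.ofSeconds(1).toNanos(), fiveSeconds)   // 1 s gap, 5 s decay
println timeFactor(Duration.ofSeconds(10).toNanos(), fiveSeconds)  // 10 s gap, 5 s decay
```

Under these assumptions, evenly spaced one-second rows with a five-second time decay are weighted like a five-row tick decay, while irregular gaps change the weight from row to row.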
Returns
An UpdateByOperation to be used in an updateBy table operation.
Examples
One column, no groups
The following example calculates the tick-based and time-based EM Std of the X column, naming the resulting columns EmStd_Tick_X and EmStd_Time_X. The tick decay rate is set to 5 rows, and the time decay rate is set to 5 seconds. No grouping columns are specified, so the EM Std is calculated over all rows.
baseTime = parseInstant("2023-01-01T00:00:00 ET")
source = emptyTable(10).update("Timestamp = baseTime + i * SECOND", "Letter = (i % 2 == 0) ? `A` : `B`", "X = randomInt(0,25)")
result = source.updateBy([EmStd(5, "EmStd_Tick_X = X"), EmStd("Timestamp", 5 * SECOND, "EmStd_Time_X = X")])
One EM Std column, one grouping column
The following example builds on the previous one by specifying Letter as the key column. Thus, the EM Std is calculated on a per-letter basis.
baseTime = parseInstant("2023-01-01T00:00:00 ET")
source = emptyTable(10).update("Timestamp = baseTime + i * SECOND", "Letter = (i % 2 == 0) ? `A` : `B`", "X = randomInt(0,25)")
result = source.updateBy([EmStd(5, "EmStd_Tick_X = X"), EmStd("Timestamp", 5 * SECOND, "EmStd_Time_X = X")], "Letter")
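To illustrate what a per-key calculation means, the plain-Groovy sketch below keeps one independent running state per key, so values from one group never affect another. It does not use Deephaven: the function name `emStdByKey`, the `[key, value]` row layout, the recurrence, and the seeding rule are assumptions chosen for illustration.

```groovy
// Sketch only: one independent EM Std state per key, mirroring a single grouping column.
// Rows are [key, value] pairs. Assumed recurrence:
//   variance = a * (prevVariance + (1 - a) * (x - prevEma)^2), ema = a * prevEma + (1 - a) * x
Map<String, List<Double>> emStdByKey(List<List> rows, double tickDecay) {
    double a = Math.exp(-1.0d / tickDecay)
    Map<String, double[]> state = [:]        // key -> [ema, variance]
    Map<String, List<Double>> out = [:]      // key -> EM Std values in row order
    for (List row : rows) {
        String key = row[0] as String
        double x = (row[1] as Number).doubleValue()
        double[] s = state[key]
        if (s == null) {
            // First row for this key: seed the average with x and the variance with 0.
            s = [x, 0.0d] as double[]
            state[key] = s
            out[key] = []
        } else {
            double diff = x - s[0]
            s[1] = a * (s[1] + (1.0d - a) * diff * diff)
            s[0] = a * s[0] + (1.0d - a) * x
        }
        out[key] << Math.sqrt(s[1])
    }
    return out
}

// Interleaved keys: B's large values do not disturb A's running state.
def rows = [['A', 1], ['B', 100], ['A', 1], ['B', 100], ['A', 5]]
println emStdByKey(rows, 5.0d)
```

Without the per-key state, the alternating values 1 and 100 would be treated as one noisy series instead of two steady ones.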
Multiple EM Std columns, multiple grouping columns
The following example builds on the previous one by calculating the EM Std of multiple columns with each UpdateByOperation. Also, the groups are defined by unique combinations of letter and boolean in the Letter and Truth columns, respectively.
baseTime = parseInstant("2023-01-01T00:00:00 ET")
source = emptyTable(20).update("Timestamp = baseTime + i * SECOND", "Letter = (i % 2 == 0) ? `A` : `B`", "Truth = randomBool()", "X = randomInt(0, 25)", "Y = randomInt(0, 25)")
result = source.updateBy([EmStd(2, "EmStd_Tick_X = X", "EmStd_Tick_Y = Y"), EmStd("Timestamp", 3 * SECOND, "EmStd_Time_X = X", "EmStd_Time_Y = Y")], "Letter", "Truth")
Multiple UpdateByOperations, multiple grouping columns
The following example builds on the previous one by calculating the tick- and time-based EM Std of the X and Y columns using separate EM Std UpdateByOperations. This allows each EM Std to have its own decay rate. The decay rates are reflected in the names of the resulting columns.
baseTime = parseInstant("2023-01-01T00:00:00 ET")
source = emptyTable(20).update("Timestamp = baseTime + i * SECOND", "Letter = (i % 2 == 0) ? `A` : `B`", "Truth = randomBool()", "X = randomInt(0, 25)", "Y = randomInt(0, 25)")
emstdTickX = EmStd(1, "EmStd_Tick_X_1row = X")
emstdTickY = EmStd(5, "EmStd_Tick_Y_5rows = Y")
emstdTimeX = EmStd("Timestamp", 2 * SECOND, "EmStd_Time_X_2sec = X")
emstdTimeY = EmStd("Timestamp", 4 * SECOND, "EmStd_Time_Y_4sec = Y")
result = source.updateBy([emstdTickX, emstdTickY, emstdTimeX, emstdTimeY], "Letter", "Truth")