Exponential moving average by group with ticks as the decay unit
uby_ema_tick.Rd
Creates an exponential moving average (EMA) UpdateByOp for each column in cols, using ticks as the decay unit.
Arguments
- decay_ticks
Numeric scalar denoting the decay rate in ticks.
- cols
String or list of strings denoting the column(s) to operate on. Can be renaming expressions, e.g. “new_col = col”. Default is to compute the exponential moving average for all non-grouping columns.
- operation_control
OperationControl that defines how special cases will behave. See ?op_control for more information.
Details
The formula used is $$a = e^{\frac{-1}{\tau}}$$ $$\bar{x}_0 = x_0$$ $$\bar{x}_i = a*\bar{x}_{i-1} + (1-a)*x_i$$
Where:
\(\tau\) is decay_ticks, an input parameter to the method.
\(\bar{x}_i\) is the exponential moving average of column \(X\) at step \(i\).
\(x_i\) is the current value.
\(i\) denotes the time step. The first row is \(i = 0\), and the recurrence applies from \(i = 1\) to \(i = n-1\), where \(n\) is the number of elements in \(X\).
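As an illustration of the recurrence above, here is a minimal base R sketch that runs without a Deephaven server. It is not the Deephaven implementation: the helper name ema_tick_reference is hypothetical, and it assumes a numeric vector with no missing values (on the server, special cases are governed by operation_control).

```r
# Reference EMA with ticks as the decay unit (illustrative only).
# `ema_tick_reference` is a hypothetical helper, not part of rdeephaven.
ema_tick_reference <- function(x, decay_ticks) {
  if (length(x) == 0) {
    return(numeric(0))
  }
  a <- exp(-1 / decay_ticks) # decay factor a = e^(-1 / tau)
  out <- numeric(length(x))
  out[1] <- x[1] # the first value seeds the average
  for (i in seq_along(x)[-1]) {
    # each step blends the previous average with the current value
    out[i] <- a * out[i - 1] + (1 - a) * x[i]
  }
  out
}

ema_values <- ema_tick_reference(c(10, 20, 30), decay_ticks = 10)
print(ema_values)
```

A larger decay_ticks pushes \(a\) closer to 1, so each new value moves the average less.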
This function acts on aggregation groups specified with the by parameter of the update_by() caller function.
The aggregation groups are defined by the unique combinations of values in the by columns. For example, if by = c("A", "B"), then the aggregation groups are defined by the unique combinations of values in the A and B columns.
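To make the grouping behavior concrete, the following base R sketch computes an independent EMA within each unique combination of A and B on a local data frame. It only mimics what by = c("A", "B") means and does not call the server. The helper ema_tick_reference and the sample data are hypothetical, and missing values are assumed to be absent.

```r
# Hypothetical local helper implementing the tick-based EMA recurrence.
ema_tick_reference <- function(x, decay_ticks) {
  if (length(x) == 0) {
    return(numeric(0))
  }
  a <- exp(-1 / decay_ticks)
  out <- numeric(length(x))
  out[1] <- x[1]
  for (i in seq_along(x)[-1]) {
    out[i] <- a * out[i - 1] + (1 - a) * x[i]
  }
  out
}

# Made-up sample data: three groups, (a1, b1), (a2, b1), and (a1, b2).
df <- data.frame(
  A = c("a1", "a1", "a2", "a1", "a2"),
  B = c("b1", "b1", "b1", "b2", "b1"),
  value = c(1, 2, 10, 100, 20)
)

# ave() applies FUN separately within each combination of A and B
# and returns the results in the original row order.
df$valueEma <- ave(
  df$value, df$A, df$B,
  FUN = function(v) ema_tick_reference(v, decay_ticks = 5)
)
print(df)
```

Each group restarts from its own first value, so the row in group (a1, b2) is unaffected by the rows in the other groups.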
This function, like other Deephaven uby functions, is a generator function. That is, its output is another function called an UpdateByOp intended to be used in a call to update_by(). This detail is typically hidden from the user. However, it is important to understand this detail for debugging purposes, as the output of a uby function can otherwise seem unexpected.
For more information, see the vignette on uby functions by running vignette("update_by").
Examples
if (FALSE) { # \dontrun{
  library(rdeephaven)

  # connecting to Deephaven server
  client <- Client$new("localhost:10000", auth_type = "psk", auth_token = "my_secret_token")

  # create data frame, push to server, retrieve TableHandle
  df <- data.frame(
    timeCol = seq.POSIXt(as.POSIXct(Sys.Date()), as.POSIXct(Sys.Date() + 0.01), by = "1 sec")[1:500],
    boolCol = sample(c(TRUE, FALSE), 500, TRUE),
    col1 = sample(10000, size = 500, replace = TRUE),
    col2 = sample(10000, size = 500, replace = TRUE),
    col3 = 1:500
  )
  th <- client$import_table(df)

  # compute 10-row exponential moving average of col1 and col2
  th1 <- th$
    update_by(uby_ema_tick(decay_ticks = 10, cols = c("col1Ema = col1", "col2Ema = col2")))

  # compute 5-row exponential moving average of col1 and col2, grouped by boolCol
  th2 <- th$
    update_by(uby_ema_tick(decay_ticks = 5, cols = c("col1Ema = col1", "col2Ema = col2")), by = "boolCol")

  # compute 20-row exponential moving average of col1 and col2, grouped by boolCol and parity of col3
  th3 <- th$
    update("col3Parity = col3 % 2")$
    update_by(uby_ema_tick(decay_ticks = 20, cols = c("col1Ema = col1", "col2Ema = col2")), by = c("boolCol", "col3Parity"))

  client$close()
} # }