stdBy
stdBy returns the standard deviation for each group. Null values are ignored.
Applying this aggregation to a column where the standard deviation cannot be computed will result in an error. For example, the standard deviation is not defined for a column of string values.
Syntax
table.stdBy()
table.stdBy(groupByColumns...)
Parameters
Parameter | Type | Description |
---|---|---|
groupByColumns | String... | The column(s) by which to group data. |
groupByColumns | ColumnName... | The column(s) by which to group data. |
groupByColumns | Collection<String> | The column(s) by which to group data. |
Returns
A new table containing the standard deviation for each group.
How to calculate standard deviation
Standard deviation is a measure of the dispersion of data values from the mean. To calculate it, sum the squared differences from the mean, divide that sum by the size of the data set, and take the square root of the result.
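As a sketch of the description above, the calculation can be written out as follows. The symbols are illustrative choices, not taken from this page: N is the size of the data set, x_i is the i-th value, and μ is the mean. The eight sample values are hypothetical. This page does not state whether stdBy divides by N or by N - 1 (Bessel's correction), so treat the divisor as an assumption that follows the wording above.

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i,
\qquad
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}

% Worked example with the hypothetical values 2, 4, 4, 4, 5, 5, 7, 9 (N = 8):
\mu = \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = \frac{40}{8} = 5

\sum_{i=1}^{8} (x_i - \mu)^2 = 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32

\sigma = \sqrt{\frac{32}{8}} = \sqrt{4} = 2
```

If the divisor were N - 1 instead, the same values would give the square root of 32/7, which is approximately 2.14.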
Examples
In this example, stdBy returns the standard deviation of the whole table. Because the standard deviation cannot be computed for the string columns X and Y, these columns are dropped before applying stdBy.
source = newTable(
stringCol("X", "A", "B", "A", "C", "B", "A", "B", "B", "C"),
stringCol("Y", "M", "N", "O", "N", "P", "M", "O", "P", "M"),
intCol("Number", 55, 76, 20, 130, 230, 50, 73, 137, 214),
)
result = source.dropColumns("X", "Y").stdBy()
In this example, stdBy returns the standard deviation, as grouped by X. Because the standard deviation cannot be computed for the string column Y, this column is dropped before applying stdBy.
source = newTable(
stringCol("X", "A", "B", "A", "C", "B", "A", "B", "B", "C"),
stringCol("Y", "M", "N", "O", "N", "P", "M", "O", "P", "M"),
intCol("Number", 55, 76, 20, 130, 230, 50, 73, 137, 214),
)
result = source.dropColumns("Y").stdBy("X")
In this example, stdBy returns the standard deviation, as grouped by X and Y.
source = newTable(
stringCol("X", "A", "B", "A", "C", "B", "A", "B", "B", "C"),
stringCol("Y", "M", "N", "O", "N", "P", "M", "O", "P", "M"),
intCol("Number", 55, 76, 20, 130, 230, 50, 73, 137, 214),
)
result = source.stdBy("X", "Y")