varBy
varBy returns the variance for each group. Null values are ignored.
Caution
Applying this aggregation to a column where the variance cannot be computed will result in an error. For example, the variance is not defined for a column of string values.
Syntax
Parameters
| Parameter | Type | Description |
|---|---|---|
| groupByColumns | String... | The column(s) by which to group data. |
| groupByColumns | ColumnName... | The column(s) by which to group data. |
| groupByColumns | Collection<String> | The column(s) by which to group data. |
Returns
A new table containing the variance for each group.
How to calculate variance
- Find the mean of the data set. Add all data values and divide by the sample size.
- Find the squared difference from the mean for each data value. Subtract the mean from each data value and square the result.
- Find the sum of all the squared differences. The sum of squares is all the squared differences added together.
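- Calculate the variance. For a sample, the variance is the sum of squares divided by one less than the number of data points. The formula for the variance of a sample set of data is:

  $$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

  where $x_i$ is each data value, $\bar{x}$ is the mean, and $n$ is the sample size.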
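The steps above can be sketched in plain Java. This is a minimal illustration of the arithmetic only, not the library's implementation: the class name `SampleVarianceSketch`, the method `sampleVariance`, and the data are hypothetical. It uses the standard sample-variance divisor of n - 1, and returning `NaN` for fewer than two values is a choice made for this sketch.

```java
public class SampleVarianceSketch {

    // Hypothetical helper: sample variance of an array of values.
    public static double sampleVariance(double[] values) {
        int n = values.length;
        if (n < 2) {
            // Assumption for this sketch: treat the result as undefined.
            return Double.NaN;
        }

        // Step 1: find the mean.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;

        // Steps 2 and 3: square each difference from the mean and add them up.
        double sumOfSquares = 0.0;
        for (double v : values) {
            double diff = v - mean;
            sumOfSquares += diff * diff;
        }

        // Step 4: divide the sum of squares by n - 1 (sample variance).
        return sumOfSquares / (n - 1);
    }

    public static void main(String[] args) {
        // Mean is 5 and the sum of squares is 32, so the result is 32 / 7 (about 4.571).
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(sampleVariance(data));
    }
}
```

Computing the mean first and then the squared differences takes two passes over the data, which mirrors the steps as written. A single-pass method would also work but is harder to map back to the list.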
Examples
In this example, varBy returns the variance of the whole table. Because the variance cannot be computed for the string columns X and Y, these columns are dropped before applying varBy.
In this example, varBy returns the variance, as grouped by X. Because the variance cannot be computed for the string column Y, this column is dropped before applying varBy.
In this example, varBy returns the variance, as grouped by X and Y.
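To make the grouped calculation concrete, the following is a minimal plain-Java sketch of a per-group sample variance keyed by two columns, with null values ignored. It does not use the table API. The `Row` class, the method names, and the data are all hypothetical, and returning `NaN` for a group with fewer than two non-null values is an assumption of the sketch, not a statement about what varBy returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupedVarianceSketch {

    // Hypothetical row: two key columns (x, y) and one nullable numeric column.
    public static final class Row {
        final String x;
        final String y;
        final Integer number;

        public Row(String x, String y, Integer number) {
            this.x = x;
            this.y = y;
            this.number = number;
        }
    }

    // Groups rows by (x, y) and computes the sample variance of `number` per group.
    public static Map<List<String>, Double> varianceByXAndY(List<Row> rows) {
        Map<List<String>, List<Double>> groups = new LinkedHashMap<>();
        for (Row row : rows) {
            List<Double> values =
                    groups.computeIfAbsent(Arrays.asList(row.x, row.y), k -> new ArrayList<>());
            if (row.number != null) { // null values are ignored
                values.add(row.number.doubleValue());
            }
        }
        Map<List<String>, Double> result = new LinkedHashMap<>();
        for (Map.Entry<List<String>, List<Double>> entry : groups.entrySet()) {
            result.put(entry.getKey(), sampleVariance(entry.getValue()));
        }
        return result;
    }

    // Sample variance (divisor n - 1); NaN for fewer than two values is a sketch choice.
    static double sampleVariance(List<Double> values) {
        int n = values.size();
        if (n < 2) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        return sumOfSquares / (n - 1);
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("A", "M", 1), new Row("A", "M", null), new Row("A", "M", 3),
                new Row("B", "N", 10), new Row("B", "N", 20), new Row("B", "N", 30));
        // Prints {[A, M]=2.0, [B, N]=100.0}
        System.out.println(varianceByXAndY(rows));
    }
}
```

Grouping by a single column works the same way with a one-element key, and an empty key puts every row in one group, which corresponds to taking the variance of the whole table.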