varBy
varBy returns the variance for each group. Null values are ignored.
Applying this aggregation to a column where the variance cannot be computed will result in an error. For example, the variance is not defined for a column of string values.
Syntax
table.varBy()
table.varBy(groupByColumns...)
Parameters
Parameter  Type  Description
groupByColumns  String...  The column(s) by which to group data.
groupByColumns  ColumnName...  The column(s) by which to group data.
groupByColumns  Collection<String>  The column(s) by which to group data.

Returns
A new table containing the variance for each group.
How to calculate variance
1. Find the mean of the data set. Add all data values and divide by the sample size $n$.
2. Find the squared difference from the mean for each data value. Subtract the mean from each data value and square the result.
3. Find the sum of all the squared differences. The sum of squares is all the squared differences added together.
4. Calculate the variance. For a sample, variance is the sum of squares divided by one less than the number of data points, $n - 1$. The formula for the variance of a sample set of data is:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

where $x_i$ is each data value and $\bar{x}$ is the mean of the data set.
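As a worked illustration of these steps, take the four values 76, 230, 73, and 137 (the rows where X is "B" in the examples on this page). The calculation below assumes the sample form of the variance, which divides by $n - 1$; it is a hand calculation, not output copied from the engine.

```latex
\begin{aligned}
\bar{x} &= \frac{76 + 230 + 73 + 137}{4} = \frac{516}{4} = 129 \\
\sum_{i=1}^{4} (x_i - \bar{x})^2 &= (-53)^2 + 101^2 + (-56)^2 + 8^2 \\
  &= 2809 + 10201 + 3136 + 64 = 16210 \\
s^2 &= \frac{16210}{4 - 1} \approx 5403.33
\end{aligned}
```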
Examples
In this example, varBy returns the variance of the whole table. Because the variance cannot be computed for the string columns X and Y, these columns are dropped before applying varBy.
source = newTable(
stringCol("X", "A", "B", "A", "C", "B", "A", "B", "B", "C"),
stringCol("Y", "M", "N", "O", "N", "P", "M", "O", "P", "M"),
intCol("Number", 55, 76, 20, 130, 230, 50, 73, 137, 214),
)
result = source.dropColumns("X", "Y").varBy()
In this example, varBy returns the variance, as grouped by X. Because the variance cannot be computed for the string column Y, this column is dropped before applying varBy.
source = newTable(
stringCol("X", "A", "B", "A", "C", "B", "A", "B", "B", "C"),
stringCol("Y", "M", "N", "O", "N", "P", "M", "O", "P", "M"),
intCol("Number", 55, 76, 20, 130, 230, 50, 73, 137, 214),
)
result = source.dropColumns("Y").varBy("X")
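To cross-check a grouped result by hand, the following plain-Java sketch recomputes a per-group variance for the same X and Number data without using the table API. It is not the engine's implementation: the class name VarBySketch and its methods are hypothetical, it assumes the sample form of the variance (dividing by n - 1), and returning NaN for a group with fewer than two values is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VarBySketch {

    // Sample variance: sum of squared differences from the mean, divided by n - 1.
    // Returning NaN when fewer than two values are present is an assumption of this sketch.
    public static double sampleVariance(List<Integer> values) {
        int n = values.size();
        if (n < 2) {
            return Double.NaN;
        }
        double mean = 0.0;
        for (int v : values) {
            mean += v;
        }
        mean /= n;
        double sumOfSquares = 0.0;
        for (int v : values) {
            double diff = v - mean;
            sumOfSquares += diff * diff;
        }
        return sumOfSquares / (n - 1);
    }

    // Collects the values for each key in first-seen order, then computes
    // the sample variance of each group.
    public static Map<String, Double> varianceByKey(String[] keys, int[] values) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            groups.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : groups.entrySet()) {
            result.put(entry.getKey(), sampleVariance(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same X and Number data as the example table.
        String[] x = {"A", "B", "A", "C", "B", "A", "B", "B", "C"};
        int[] number = {55, 76, 20, 130, 230, 50, 73, 137, 214};
        varianceByKey(x, number).forEach((key, variance) ->
                System.out.println(key + " -> " + variance));
    }
}
```

Under the n - 1 assumption, the groups work out to roughly 358.33 for A, 5403.33 for B, and 3528 for C.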
In this example, varBy returns the variance, as grouped by X and Y.
source = newTable(
stringCol("X", "A", "B", "A", "C", "B", "A", "B", "B", "C"),
stringCol("Y", "M", "N", "O", "N", "P", "M", "O", "P", "M"),
intCol("Number", 55, 76, 20, 130, 230, 50, 73, 137, 214),
)
result = source.varBy("X", "Y")