---
title: Legacy data validation
---

> [!WARNING]
> Legacy data validation is deprecated and will be removed in the next release. Deephaven recommends using [Core+ data validation](../../data-guide/validation.md) instead.

Before performing research and analyses, it is imperative to ensure you have quality data. Deephaven has built-in methods to validate the data in your tables: validation on intraday data, and validation after intraday data is merged into the historical data. In either case, validation is optional.

The following validation modes are available:

- `SIMPLE_TABLE` validates a simple leaf-node table.
- `FULL_INTRADAY` validates all intraday data for a specific column partition value (usually a date).
- `FULL_DATABASE` validates all historical data for a specific column partition value.

## Data Validation Query

Data Validation queries are used to validate data that has been loaded into Deephaven, and to delete intraday data (usually after it has been merged into the historical database). When **Data Validation** is selected, the **Persistent Query Configuration Editor** window shows the following options:

![The **Persistent Query Configuration Editor** displaying the options available when **Data Validation** is selected](../../assets/importing-data/validate.png)

- To proceed with creating a Data Validation query, select a **DB Server** and enter the desired value for **Memory (Heap) Usage (GB)**.
- Options available in the **Show Advanced Options** section of the panel are typically not used when validating data. To learn more about this section, please refer to the [Persistent Query Configuration Viewer/Editor](../../query-management/ui-queries.md#edit-persistent-queries).
- Clicking the **Access Control** tab presents a panel with the same options as all other configuration types and allows the query owner to authorize Admin and Viewer Groups for this query. For more information, please refer to [Access Control](../../sys-admin/permissions/permissions-overview.md).
- Clicking the **Scheduling** tab presents a panel with the same scheduling options as all other configuration types. For more information, please refer to [Scheduling](../../query-management/ui-queries.md#scheduling).
- Clicking the **Validate Settings** tab presents a panel with the options for validating data already in Deephaven:

  ![The **Validate Settings** tab of the **Persistent Query Configuration Editor**](../../assets/importing-data/validate2.png)

### Validate Settings

- **Namespace**: The namespace for the data being validated.
- **Table**: The table for the data being validated.
- **Partition Value Formula**: The formula that determines the partition to validate. A specific partition value must be surrounded by quotes. In most cases, the previous day's data is validated. For example:
  - `com.illumon.util.calendar.Calendars.calendar("USNYSE").previousDay(1)` - Validates the previous day's data based on the USNYSE calendar.
  - `"2017-01-01"` - Validates the data for the date "2017-01-01" (the quotes are required).
- **Validator Classes**: The list of classes to be run to validate the data. If **Schema Validation** is selected, the default schema-based validation class will be selected, in which case the validation from the table's schema will be used. If no classes are chosen, then no validation will be performed (this may be useful if the query is only being used to delete intraday data).
- **Test Type**: Determines the type of test being run. Options include the following:
  - **Simple (Intraday)** - runs a simple test against intraday data.
  - **Full (Intraday)** - runs the full test suite against intraday data.
  - **Full (Historical)** - runs the full test suite against historical data.
  - **Both (Intraday and Historical)** - runs the full test suite against both intraday and historical data.
- **Delete Intraday Data?**: When selected, the corresponding intraday data will be deleted. If a validator is selected, the intraday data will be deleted only if all validation succeeds. If no validator is selected, the intraday data will be deleted when the query runs.

When a **Data Validation** query fails, the first validation failure exception is shown in the query panel's "ExceptionDetails" column, along with the total number of failures. Additional failures are not shown in the query panel and must be retrieved from the text log or from the [Process Event Log](../../sys-admin/internal-tables/process-event-log.md) for the worker that ran the query.

The following example (Core+) query retrieves the failed test case details from the Process Event Log for a specific worker. The worker name should be visible in the query panel:

```python
pelWorker = db.live_table("DbInternal", "ProcessEventLog").where(
    "Date=today()", "Process=`worker_63`", "LogEntry.contains(`FAIL`)"
)
```

```groovy
pelWorker = db.liveTable("DbInternal", "ProcessEventLog").where("Date=today()", "Process=`worker_63`", "LogEntry.contains(`FAIL`)")
```

> [!NOTE]
> A Data Validation query will delete intraday data by sending command messages to the Data Import Server's tailer service. If the import server is configured with a disabled tailer port, the alternate method described below is required.
>
> The [data deletion commands](../../data-guide/data-control-tool.md#intraday-data-deletion) will be sent to all import servers that handle the table specified in the query. This behavior can be changed by using the alternate deletion method described below, or by changing the data routing.
>
> - Create a Table Data Service in the data routing configuration that omits one or more import servers.
> - Specify that TDS with `-DDataRoutingService.tableDataService=tds_name` (where `tds_name` is the TDS you created) in the Extra JVM Arguments field on the **Settings** tab.
>
> If the Java property `-DvalidatePQ.delete.legacy=true` is specified in the Extra JVM Arguments field on the **Settings** tab of the validation query, then data is deleted directly from disk. This is only possible when the data on disk is accessible to the validation query.

## Validation Approaches

Deephaven supports two validation approaches: schema-based validation and custom-written validator classes. Schema-based validation reads a table's schema file and runs the validation options specified there. Custom-written validators use a user-supplied class with validation functions specific to a given table. Further details for each option are provided below.

## Schema-Based Validation

Validation can be placed directly into a table's schema file by adding a `<Validator>` element which contains the desired methods to be applied.

The following is the schema for the [`AuditEventLog`](../../sys-admin/internal-tables/audit-event-log.md) (with logging information removed):

```xml
<Table name="AuditEventLog" namespace="DbInternal" storageType="NestedPartitionedOnDisk" >
  <Partitions keyFormula="${autobalance_single}"/>

  <Column name="Date" dataType="String" columnType="Partitioning" />
  <Column name="Timestamp" dataType="DateTime"  />
  <Column name="ClientHost" dataType="String" />
  <Column name="ClientPort" dataType="int" />
  <Column name="ServerHost" dataType="String" columnType="Grouping" />
  <Column name="ServerPort" dataType="int" />
  <Column name="Process" dataType="String" />
  <Column name="AuthenticatedUser" dataType="String"  />
  <Column name="EffectiveUser" dataType="String" />
  <Column name="Namespace" dataType="String" />
  <Column name="Table" dataType="String" />
  <Column name="Id" dataType="int" />
  <Column name="Event" dataType="String" />
  <Column name="Details" dataType="String" symbolTable="None" encoding="UTF_8" />

...


</Table>
```

To add validation to a schema, add a `<Validator>` element as a child of the `<Table>` element in the .schema file.

Each element in the validator section represents one method from the `DynamicValidator` class to be run. The name of the element is the method name. Parameters to that method are passed in as attributes of the XML element. For example, to validate that the column `Timestamp` is of `DBDateTime` type, use:

`<assertColumnType column="Timestamp" type="com.illumon.iris.db.tables.utils.DBDateTime"/>`

It is necessary to include the fully qualified class names.

To specify which validation mode should be run, use the `TableValidationMode` attribute:

`<assertColumnGrouped TableValidationMode="FULL_DATABASE" column="SerialNumber"/>`

> [!TIP]
> Attribute values must be enclosed in quotes; e.g., `TableValidationMode="FULL_DATABASE"` rather than `TableValidationMode=FULL_DATABASE`, or `removeNull="false"` rather than `removeNull=false`.

The validator element looks like:

```xml
<Validator>
  <assertColumnTypes />

  <assertColumnGrouped column="ServerHost" />

  <assertNotNull columns="Timestamp,ServerHost,Process" />
  <assertFracNull column="Details" min="0" max="0.25" />

  <assertSize min="1" max="999999999" />
  <assertExpectedTableSize partitionsBefore="2" partitionsAfter="2" min="0.2" max="5.0" />
</Validator>
```

The validation schema for this table tests that:

- Column data types are as listed in the schema.
- `ServerHost` is a grouping column.
- The `Timestamp`, `ServerHost`, and `Process` columns have no null entries.
- `Details` has no more than 25% null values.
- The number of rows is between 1 and 999999999.
- The table is between 0.2 and 5 times the expected size; that is, it is not more than 5 times smaller or larger than expected.
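The mapping from XML to method calls (element name is the method, attributes are the parameters) can be made concrete with a short script. The following is a minimal sketch in plain Python, not Deephaven code: `list_validation_calls` is a hypothetical helper that only lists what a `<Validator>` element asks for, and does not run any validation.

```python
import xml.etree.ElementTree as ET

# The validator element from the AuditEventLog example above.
VALIDATOR_XML = """
<Validator>
  <assertColumnTypes />
  <assertColumnGrouped column="ServerHost" />
  <assertNotNull columns="Timestamp,ServerHost,Process" />
  <assertFracNull column="Details" min="0" max="0.25" />
  <assertSize min="1" max="999999999" />
  <assertExpectedTableSize partitionsBefore="2" partitionsAfter="2" min="0.2" max="5.0" />
</Validator>
"""


def list_validation_calls(xml_text):
    """Return (method_name, parameters) pairs, one per child of <Validator>."""
    root = ET.fromstring(xml_text)
    if root.tag != "Validator":
        raise ValueError(f"expected <Validator> root, found <{root.tag}>")
    # Element name -> method name; XML attributes -> parameter name/value strings.
    return [(child.tag, dict(child.attrib)) for child in root]


calls = list_validation_calls(VALIDATOR_XML)
for method, params in calls:
    print(method, params)
```

Listing the calls this way can help when reviewing a schema, since every parameter value is visible as the string that will be handed to the validator.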

### Base Validators

Base validators are useful when the same validation tests are to be run against multiple tables. To avoid unnecessary duplication, these tests should be placed into their own validation file and included in each table's schema validation section.

Let's say there are 10 time-series tables in the database. For all of these tables, we need to validate that each `Timestamp` column is ascending, and that each `Date` column has no more than 5% null values. To avoid duplicating these checks in all 10 schemas, create a base validator: an XML file called `timeseries.validator` (a regular XML file with a `.validator` extension).

As in a schema, base validators are a `<Validator></Validator>` element with methods as components, just as in the schema validation explained above. The `timeseries.validator` XML file would look like the following:

```xml
<Validator>
    <assertAscending TableValidationMode="FULL_DATABASE" column="Timestamp"/>
    <assertFracNull removeNull="false" removeNaN="false" removeInf="false" column="Date" min="0.0" max="0.05"/>
</Validator>
```

To use a base validator, set the `base` attribute of the `<Validator>` tag in the .schema file to the validator's name (the file name without the `.validator` extension). For example, the `timeseries.validator` above is used as follows:

```xml
<Validator base="timeseries">
        ....
</Validator>
```

Validators can specify more than one base validation file. These are delimited with commas:

```xml
<Validator base="timeseries,nullchecks,columntypes">
        ....
</Validator>
```

> [!NOTE]
> This means that base validators cannot use commas in their name.

Base validators can themselves specify one or more base validators. In the above example, suppose seven of the 10 tables have a grouping column named `Sym`. We want to use the same checks as in the `timeseries.validator`, and we want to add an additional check that `Sym` is a grouping column. We create a new base validator `symgrouping.validator`, which includes `timeseries.validator`:

```xml
<Validator base="timeseries">
     <assertColumnGrouped column="Sym"/>
</Validator>
```

We then use `symgrouping.validator` as the base for the seven tables with the Sym grouping column:

```xml
<Validator base="symgrouping">
        ....
</Validator>
```
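To illustrate how chained base validators combine, the sketch below resolves the full set of tests for a schema in plain Python. It is not Deephaven's implementation: `BASE_VALIDATORS` is a hypothetical in-memory stand-in for the `.validator` files, the `assertSize` test in the schema is made up for the example, and the ordering (base tests first, then the validator's own tests) is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for *.validator files, keyed by base name.
BASE_VALIDATORS = {
    "timeseries": """<Validator>
        <assertAscending TableValidationMode="FULL_DATABASE" column="Timestamp"/>
        <assertFracNull removeNull="false" removeNaN="false" removeInf="false"
                        column="Date" min="0.0" max="0.05"/>
    </Validator>""",
    "symgrouping": """<Validator base="timeseries">
        <assertColumnGrouped column="Sym"/>
    </Validator>""",
}


def resolve_tests(xml_text, bases=BASE_VALIDATORS, _seen=None):
    """Collect test method names from a validator and, recursively, its bases."""
    seen = set() if _seen is None else _seen
    root = ET.fromstring(xml_text)
    tests = []
    # The base attribute is a comma-delimited list of base validator names.
    for name in (n.strip() for n in root.get("base", "").split(",")):
        if not name or name in seen:
            continue  # skip empty entries and bases already included (guards cycles)
        seen.add(name)
        tests.extend(resolve_tests(bases[name], bases, seen))
    # Assumed ordering: the validator's own tests follow those of its bases.
    tests.extend(child.tag for child in root)
    return tests


schema_validator = '<Validator base="symgrouping"><assertSize min="1" max="100"/></Validator>'
resolved = resolve_tests(schema_validator)
print(resolved)
```

A table whose schema names `symgrouping` as its base therefore picks up the `timeseries` checks as well, without listing them.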

#### Base Validator File Locations

The `BaseValidators.resourcePath` configuration parameter defines the base validator file locations. This parameter typically specifies a semicolon-delimited list of directories, but the delimiter can be changed with the `SchemaConfig.resourceDelimiter` property.
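As a rough illustration of how a delimited resource path can be searched, the following plain-Python sketch looks for `<name>.validator` in each listed directory. The helper `find_base_validator` is hypothetical, and the first-match-wins behavior is an assumption rather than documented Deephaven behavior.

```python
import tempfile
from pathlib import Path


def find_base_validator(name, resource_path, delimiter=";"):
    """Return the first `<name>.validator` file found on the path, or None."""
    for directory in resource_path.split(delimiter):
        if not directory:
            continue  # ignore empty entries such as a trailing delimiter
        candidate = Path(directory) / f"{name}.validator"
        if candidate.is_file():
            return candidate
    return None


# Demonstrate with two temporary directories; only the second holds the file.
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    (Path(second) / "timeseries.validator").write_text("<Validator/>")
    resource_path = f"{first};{second}"
    found_name = find_base_validator("timeseries", resource_path).name
    missing = find_base_validator("nullchecks", resource_path)
    print(found_name, missing)
```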

When defining schema-based validation rules, it can be helpful to run them interactively in a Deephaven console script. Details on running them in production are in the next section, but running them in production requires updating and deploying schemata, while script-based development tests can be run ad hoc with no system configuration changes.

The following example shows how to run validation rules in a Deephaven console script.

```python
DynamicTest = jpy.get_type("com.illumon.iris.validation.dynamic.DynamicTest")
DynamicValidator = jpy.get_type("com.illumon.iris.validation.dynamic.DynamicValidator")
ValidationTableDescriptionFullDatabase = jpy.get_type(
    "com.illumon.iris.validation.ValidationTableDescriptionFullDatabase"
)
TableType = jpy.get_type("com.illumon.iris.db.v2.locations.TableType")
AggregateTableLocationKey = jpy.get_type(
    "com.illumon.iris.db.v2.locations.FullTableLocationKey$AggregateTableLocationKey"
)

location = AggregateTableLocationKey(
    "LearnDeephaven", "StockTrades", TableType.SYSTEM_PERMANENT, "2017-08-25"
)

vtd = ValidationTableDescriptionFullDatabase(db, location)

dt = DynamicTest("assertNotNull")
dt.parameter("columns", "Date,SecurityType,Exchange,USym,Sym,Last,Size,ExchangeId")
dt.invoke(DynamicValidator(vtd))

dt2 = DynamicTest("assertAllValuesInDistinctSet")

dt2.parameter("removeNull", "false")
dt2.parameter("removeNaN", "false")
dt2.parameter("removeInf", "false")
dt2.parameter("column", "Exchange")
dt2.parameter(
    "expectedValues",
    'new String[]{"Arca","Nasdaq","Internal","EdgX","Bats","BatsY","Nyse","EdgA","Chicago"}',
)
dt2.invoke(DynamicValidator(vtd))
```

```groovy
import com.illumon.iris.validation.dynamic.DynamicTest;
import com.illumon.iris.validation.dynamic.DynamicValidator;
import com.illumon.iris.validation.ValidationTableDescriptionFullDatabase;
import com.illumon.iris.db.v2.locations.FullTableLocationKey;
import com.illumon.iris.db.v2.locations.TableType;

FullTableLocationKey location = new FullTableLocationKey.AggregateTableLocationKey("LearnDeephaven","StockTrades",TableType.SYSTEM_PERMANENT,"2017-08-25");
ValidationTableDescriptionFullDatabase vtd = new ValidationTableDescriptionFullDatabase(db, location);

DynamicTest dt = new DynamicTest("assertNotNull");
dt.parameter("columns","Date,SecurityType,Exchange,USym,Sym,Last,Size,ExchangeId");
dt.invoke(new DynamicValidator(vtd));

DynamicTest dt2 = new DynamicTest("assertAllValuesInDistinctSet");
dt2.parameter("removeNull","false");
dt2.parameter("removeNaN","false");
dt2.parameter("removeInf","false");
dt2.parameter("column","Exchange");
dt2.parameter("expectedValues","new String[]{\"Arca\",\"Nasdaq\",\"Internal\",\"EdgX\",\"Bats\",\"BatsY\",\"Nyse\",\"EdgA\",\"Chicago\"}");
dt2.invoke(new DynamicValidator(vtd));
```

An explanation of key parts of the scripts follows:

- `AggregateTableLocationKey` is used to get the location against which the validation rule(s) will be run. It takes a namespace, a table name, and a partition value, like most Deephaven queries, but it also takes a `TableType`. The `TableType` is generally `SYSTEM_PERMANENT` or `SYSTEM_INTRADAY`, for historical and intraday tables, respectively.
- `ValidationTableDescriptionFullDatabase` takes a database object (`db`), and the location, and provides an object against which the `DynamicValidator` can be used to execute validation rules.
- `DynamicTest` is a class used to create a new validation rule object, add parameters, and execute (invoke) it.

Some parameters are optional, and some are required. Scripted use, as is explained here, provides an easy way to test whether parameters are needed to accomplish the desired validations.

Parameters are passed as two Strings: the name of the parameter to set, and the value expression to assign to it. The validation rule named when instantiating the `DynamicTest` object defines the data types of its arguments internally, and the passed value is compiled into a Java validation class for use in validating the table data. Most parameter values are simple single numeric or String values. For instance, the parameters `removeNull` and `column` in the examples above take single values. One takes a Boolean and the other takes a String, but there is no need to specially delimit the String because the `parameter` method recognizes the type of the parameter to be set and handles the String value accordingly.

Some parameters, however, take more complex values. For these values to compile correctly, a whole Java expression must be passed to the parameter. For instance, the `expectedValues` parameter in the examples takes an Object array. Declaring this array with literals requires the form `new Object[]{}`. (In this case, `new String[]{}` is also valid.) Because the whole parameter value is a String, double quotes inside it must be escaped with a leading backslash when the enclosing String literal is itself double-quoted, as in the Groovy example. Both the Object array declaration and the escaping of the double quotes are Java language syntax elements.
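Writing array expressions by hand is error-prone once the list grows, so it can help to generate them. The helper below is a hypothetical convenience, not part of the Deephaven API: it builds a `new String[]{...}` expression from a Python list and relies on `json.dumps` to quote each value and backslash-escape embedded double quotes and backslashes.

```python
import json


def java_string_array(values):
    """Build a Java `new String[]{...}` expression from Python strings."""
    # json.dumps wraps each value in double quotes and escapes embedded
    # double quotes and backslashes with a leading backslash.
    return "new String[]{" + ",".join(json.dumps(v) for v in values) + "}"


expr = java_string_array(["Arca", "Nasdaq", "Internal"])
print(expr)
```

In a Deephaven Python console, the resulting string could then be passed as the second argument of `dt2.parameter("expectedValues", ...)` from the example above.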

### Running Schema Validation

Schema-based validation can be run in one of two ways.

1. Creating a [Data Validation persistent query](#data-validation-query).
2. Running on the command line. This is best done through [XML scripts](../importing-data/batch/xml-legacy.md).

To run your validation tests directly on the command line, call `RunDataQualityTests` with the following parameters:

```
RunDataQualityTests [LOCAL_SIMPLE|REMOTE_SIMPLE|REMOTE_FULL|REMOTE_BOTH] <validator0,...> <namespace0,...> <columnPartition0,...> <tableName> [explicitTests=<test0,test1,...>] [setExitStatus=<true|false>] [testLoggerName=testLogger] [pushClasses=<true|false>]
```

| Argument (in order)    | Description                                                                          |
| ---------------------- | ------------------------------------------------------------------------------------ |
| Mode                   | Mode to use for executing the tests.                                                 |
| `validator0,...`       | A comma-separated list of Java paths to data validators.                             |
| `namespace0,...`       | A comma-separated list of namespaces to validate.                                    |
| `columnPartition0,...` | A comma-separated list of partitioning column names to validate.                     |
| `tableName`            | Table to validate.                                                                   |
| `explicitTests`        | A comma-separated list of tests to run. If not provided, all tests are run.          |
| `setExitStatus`        | A boolean indicating if the exit status should be set to the number of failed tests. |
| `testLoggerName`       | A fully qualified path to a `DataQualityTestLogger`.                                 |
| `pushClasses`          | A boolean indicating if the test classes should be pushed.                           |

For example, to run validation via command line on a standard Deephaven server installation:

```
cd /usr/illumon/latest

sudo java -cp "java_lib/*:lib64/libFishCommon.so" -Dworkspace=java_lib/ -Ddevroot=. -DConfiguration.rootFile=iris-defaults.prop com.illumon.iris.validation.RunDataQualityTests LOCAL_SIMPLE com.illumon.iris.validation.TableValidationSuite DbInternal Date PersistentQueryStateLog
```
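When validation is scripted, the argument list is easier to keep correct if it is assembled programmatically. The sketch below is a hypothetical helper in plain Python that builds only the class name and program arguments shown in the usage line above; the JVM options from the example command are omitted, and nothing is executed.

```python
VALID_MODES = {"LOCAL_SIMPLE", "REMOTE_SIMPLE", "REMOTE_FULL", "REMOTE_BOTH"}


def build_validation_args(mode, validators, namespaces, column_partitions,
                          table_name, **options):
    """Return the main class and program arguments as a list of strings."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    args = [
        "com.illumon.iris.validation.RunDataQualityTests",
        mode,
        ",".join(validators),
        ",".join(namespaces),
        ",".join(column_partitions),
        table_name,
    ]
    # Optional key=value arguments, e.g. setExitStatus=True or explicitTests=[...].
    for key, value in options.items():
        if isinstance(value, bool):
            value = str(value).lower()
        elif isinstance(value, (list, tuple)):
            value = ",".join(value)
        args.append(f"{key}={value}")
    return args


args = build_validation_args(
    "LOCAL_SIMPLE",
    ["com.illumon.iris.validation.TableValidationSuite"],
    ["DbInternal"],
    ["Date"],
    "PersistentQueryStateLog",
    setExitStatus=True,
)
print(" ".join(args))
```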

### Evaluating Results

Schema-based validation is designed to run quickly and return a one-size-fits-all result. For the examples above, results look like this:

`2020-04-30 16:55:50.723 WARN PASS: DataQualityTestCase: FracWhere(isNull(Date); 0.0 ;0.0)(table:StockTrades, column:isNull(Date), actual:0.0, min:0.0, max:0.0)`

`2020-04-30 16:55:54.042 WARN PASS: DataQualityTestCase: AllValuesInDistinctSet(table:StockTrades, column:Exchange, actual:0, min:0, max:0)`

A failing test result might look like this:

`2020-04-30 16:58:08.493 ERROR FAIL: DataQualityTestCase: AllValuesInDistinctSet(table:StockTrades, column:Exchange, actual:1, min:0, max:0)`

This indicates that there is a problem, but there is not much detail about what exactly is wrong. In fact, this result will be mixed in with a full stack trace and results from other validation rules, which may themselves have passed or also failed and generated stack traces.
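Because result lines are interleaved with other log output, a small parser can help pull out just the test outcomes. The following plain-Python sketch uses a regular expression inferred only from the sample lines above, so the pattern is an assumption and may not match every result format the framework emits.

```python
import re

# Pattern inferred from the sample PASS/FAIL lines; an assumption, not a spec.
RESULT_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>\w+) (?P<status>PASS|FAIL): DataQualityTestCase: "
    r"(?P<test>.+)\(table:(?P<table>[^,]+), column:(?P<column>.+), "
    r"actual:(?P<actual>[^,]+), min:(?P<min>[^,]+), max:(?P<max>[^)]+)\)$"
)


def parse_result(line):
    """Return the fields of a test result line as a dict, or None if no match."""
    match = RESULT_RE.match(line.strip())
    return match.groupdict() if match else None


fail_line = (
    "2020-04-30 16:58:08.493 ERROR FAIL: DataQualityTestCase: "
    "AllValuesInDistinctSet(table:StockTrades, column:Exchange, actual:1, min:0, max:0)"
)
pass_line = (
    "2020-04-30 16:55:50.723 WARN PASS: DataQualityTestCase: "
    "FracWhere(isNull(Date); 0.0 ;0.0)(table:StockTrades, column:isNull(Date), "
    "actual:0.0, min:0.0, max:0.0)"
)

failed = parse_result(fail_line)
passed = parse_result(pass_line)
print(failed["status"], failed["test"], failed["column"], failed["actual"])
```

Lines that do not match, such as stack trace frames, return `None` and can be filtered out.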

To get more details about the results, query the ProcessEventLog for the WARN and ERROR entries written by the worker that ran the validation.

In the **Query Config** panel or **Query Monitor**, a successful validation run will show as **Completed**, and one that failed to validate the data will show **Error** status. In either case, there will be a host on which the validation was run and a worker name that can be used to query additional data from the system logs.

![A table showing the results of data validation queries](../../assets/importing-data/validate3.png)

```python
pel = (
    db.live_table("DbInternal", "ProcessEventLog")
    .where("Date=`2020-04-30`")
    .where("Host=`dh-prod-demo-dis.c.illumon-eng-170715.internal`")
    .where("Process=`worker_49`")
    .where("Level in `WARN`,`ERROR`")
)
```

```groovy
pel = db.liveTable("DbInternal","ProcessEventLog")
    .where("Date=`2020-04-30`")
    .where("Host=`dh-prod-demo-dis.c.illumon-eng-170715.internal`")
    .where("Process=`worker_49`")
    .where("Level in `WARN`,`ERROR`")
```

The Core+ query above shows all of the validation run output in an easier-to-read format. Some values copied from the Query Status panel must be adapted before use in the query: the table column that corresponds to WorkerHost is called `Host`, and its contents may be a fully qualified domain name or IP address instead of the friendly name displayed in the UI.

![A selection of the table from the above query](../../assets/importing-data/validate4.png)

Finding out more about the data that caused individual tests to fail will generally require writing a specific query. For instance, if `assertNotNull` had failed for the `Exchange` column, it would simply report that there were one or more null values in this column. A query like this would be needed to show the actual rows that have unexpected null values for `Exchange`:

```python
nullValues = (
    db.historical_table("LearnDeephaven", "StockTrades")
    .where("Date=`2017-08-25`")
    .where("isNull(Exchange)")
)
```

```groovy
nullValues = db.historicalTable("LearnDeephaven","StockTrades")
    .where("Date=`2017-08-25`")
    .where("isNull(Exchange)")
```

Some of these investigative queries can be more complex. For example, this is the form to show violations of `assertStrictlyAscending`:

```python skip-test
temp1 = (
    db.historical_table("Sample", "Trades")
    .where("Date=`2020-04-30`")
    .update("RowIndex=ii")
)

temp2 = temp1.where("ii > 0", "TradeID <= TradeID_[ii-1]")

temp3 = temp1.where("ii < x.size()-1", "TradeID >= TradeID_[ii+1]")

outOfOrder = merge(temp2, temp3).sort("RowIndex")
```

```groovy skip-test
temp1 = db.historicalTable("Sample","Trades")
    .where("Date=`2020-04-30`")
    .update("RowIndex=ii")
temp2 = temp1
    .where("ii > 0","TradeID <= TradeID_[ii-1]")
temp3 = temp1
    .where("ii < x.size()-1","TradeID >= TradeID_[ii+1]")
outOfOrder = merge(temp2,temp3)
    .sort("RowIndex")
```

This query uses column array access to compare values to next and previous values, to retrieve rows that are out of order for the TradeID column. It does this both for rows where the TradeID is less than or equal to the one of the previous row and those where it is greater than or equal to the value in the next row. By merging these two sets of rows, it becomes easier to see which pairs of rows are causing the validation to fail. Note that unlike the other examples, this example is not based on demo data. The `LearnDeephaven` namespace does not contain any datasets appropriate for validation by `assertStrictlyAscending`.
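The pairing logic can be checked on a small in-memory list before running it against a table. This plain-Python sketch is only an illustration of the comparison, with made-up `trade_ids`; it returns the positions of every row that takes part in a violation, which corresponds to the merged and sorted result of the query above.

```python
def strictly_ascending_violations(values):
    """Return sorted positions of rows involved in a strictly-ascending violation."""
    bad = set()
    for i in range(1, len(values)):
        # A row violates the rule when it is <= the previous row; flag both
        # rows of the pair so the offending neighbors appear together.
        if values[i] <= values[i - 1]:
            bad.update((i - 1, i))
    return sorted(bad)


trade_ids = [1, 2, 5, 4, 6, 6, 7]  # made-up sample data
violations = strictly_ascending_violations(trade_ids)
print(violations)  # [2, 3, 4, 5]
```

Positions 2 and 3 form the pair where the value drops from 5 to 4, and positions 4 and 5 form the pair with the repeated value 6.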

## Writing Custom Validation Classes

Validation methods can also be run through Java by writing custom validator classes. A custom validation class is a Java class that extends `com.illumon.iris.validation.DataQualityTestCase` and implements test methods. These test methods should be public void methods, which will be discovered and run by the validation logic. The validator class has access to all the test methods described earlier.

If a test method should only be run for a particular mode, the following annotations can be used:

- `@TableValidationMode_FullDatabase`
- `@TableValidationMode_SimpleTable`

For example:

```java
@TableValidationMode_FullDatabase
public void testGrouping() {
    assertColumnGrouped(table, "ClientHost");
}
```

### Custom Validation Class Example

```java
public class MyTableValidator extends DataQualityTestCase {
    private final DBDateTime startTime;
    private final DBDateTime endTime;

    public MyTableValidator(final ValidationTableDescription validationTableDescription) {
        super(validationTableDescription);
        this.startTime = IngesterUtils.getDateTime(validationTableDescription.getLocation().getNamespace(), validationTableDescription.getLocation().getColumnPartition());
        this.endTime = DBTimeUtils.plus(startTime, DBTimeUtils.millisToNanos(IngesterUtils.DAY));
    }

    // Accept timestamps from one day before startTime through endTime.
    public void testTimesInSpecifiedRange() {
        DBDateTime oneDayBefore = DBTimeUtils.minus(startTime, DBTimeUtils.DAY);
        assertAllValuesBetween(table, "StartTime", oneDayBefore, endTime);
        assertAllValuesBetween(table, "EndTime", oneDayBefore, endTime);
    }

    // Restricted to the FULL_DATABASE validation mode by the annotation.
    @TableValidationMode_FullDatabase
    public void testGrouping() {
        assertColumnGrouped(table, "SerialNumber");
    }
}
```

A `DataQualityTestCase` takes a `ValidationTableDescription` in its constructor. A `ValidationTableDescription` gives all information needed for validation.

## Validation Methods

The following is a complete list of the available validation methods in the Deephaven validation framework.

| Method                                                                                                                                                                                                             | Description                                                                                                                                                                                                                                                                |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `assertSize(final long min, final long max)`                                                                                                                                                                       | Asserts the number of rows in the table is in the inclusive range `[min,max]`.                                                                                                                                                                                             |
| `assertColumnType(final String column, final Class type)`                                                                                                                                                          | Asserts that a column is of the expected type.                                                                                                                                                                                                                             |
| `assertColumnTypes()`                                                                                                                                                                                              | Asserts that all the column types in the data match the schema.                                                                                                                                                                                                            |
| `assertColumnGrouped(final String column)`                                                                                                                                                                         | Asserts that a column is grouped.                                                                                                                                                                                                                                          |
| `assertAllValuesEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column)`                                                                                            | Asserts that a column only contains a single value.                                                                                                                                                                                                                        |
| `assertAllValuesNotEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column)`                                                                                         | Asserts that a column does not contain repeated values.                                                                                                                                                                                                                    |
| `assertAllValuesEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object value)`                                                                        | Asserts that a column only contains a single value.                                                                                                                                                                                                                        |
| `assertAllValuesNotEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object value)`                                                                     | Asserts that a column does not contain the specified value.                                                                                                                                                                                                                |
| `assertEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column1, final String column2)`                                                                              | Asserts that all values in column1 are equal to all values in column2.                                                                                                                                                                                                     |
| `assertNotEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column1, final String column2)`                                                                           | Asserts that all values in column1 are not equal to all values in column2.                                                                                                                                                                                                 |
| `assertLess(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column1, final String column2)`                                                                               | Asserts that all values in column1 are less than all values in column2.                                                                                                                                                                                                    |
| `assertLessEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column1, final String column2)`                                                                          | Asserts that all values in column1 are less than or equal to all values in column2.                                                                                                                                                                                        |
| `assertGreater(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column1, final String column2)`                                                                            | Asserts that all values in column1 are greater than all values in column2.                                                                                                                                                                                                 |
| `assertGreaterEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column1, final String column2)`                                                                       | Asserts that all values in column1 are greater than or equal to all values in column2.                                                                                                                                                                                     |
| `assertNumberDistinctValues(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String columns, final long min, final long max)`                                                     | Asserts the number of distinct values is in the inclusive range `[min,max]`.                                                                                                                                                                                               |
| `assertAllValuesInDistinctSet(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object... expectedValues)`                                                    | Asserts that all values in a column are present in a set of expected values.                                                                                                                                                                                               |
| `assertAllValuesInArrayInDistinctSet(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object... expectedValues)`                                             | Asserts that all values contained in arrays in a column are present in a set of expected values.                                                                                                                                                                           |
| `assertAllValuesInStringSetInDistinctSet(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object... expectedValues)`                                         | Asserts that all values contained in string sets in a column are present in a set of expected values.                                                                                                                                                                      |
| `assertAllValuesNotInDistinctSet(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object... values)`                                                         | Asserts that all values in a column are not present in a set of values.                                                                                                                                                                                                    |
| `assertAllValuesInArrayNotInDistinctSet(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object... values)`                                                  | Asserts that all values contained in arrays in a column are not present in a set of values.                                                                                                                                                                                |
| `assertAllValuesInStringSetNotInDistinctSet(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object... values)`                                              | Asserts that all values contained in string sets in a column are not present in a set of values.                                                                                                                                                                           |
| `assertFracWhere(final String filter, final double min, final double max)`                                                                                                                                         | Asserts the fraction of a table's rows matching the provided filter falls within a defined range.                                                                                                                                                                          |
| `assertFracNull(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final double min, final double max)`                                                              | Asserts that the fraction of NULL values is in the inclusive range `[min,max]`.                                                                                                                                                                                            |
| `assertNotNull(final String... columns)`                                                                                                                                                                           | Asserts that the listed columns contain no null values. For schema-based validation, provide the columns as a single quoted, comma-delimited string (e.g., `<assertNotNull columns="Owner,Name,Timestamp" />`).                                                            |
| `assertFracNan(final boolean removeNull, final boolean removeInf, final String column, final double min, final double max)`                                                                                        | Asserts that the fraction of NaN values is in the inclusive range `[min,max]`.                                                                                                                                                                                             |
| `assertFracInf(final boolean removeNull, final boolean removeNaN, final String column, final double min, final double max)`                                                                                        | Asserts that the fraction of infinite values is in the inclusive range `[min,max]`.                                                                                                                                                                                        |
| `assertFracZero(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final double min, final double max)`                                                              | Asserts that the fraction of zero values is in the inclusive range `[min,max]`.                                                                                                                                                                                            |
| `assertFracValuesBetween(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Comparable minValue, final Comparable maxValue, final double min, final double max)` | Asserts that the fraction of values between `[minValue,maxValue]` is in the inclusive range `[min,max]`.                                                                                                                                                                   |
| `assertAllValuesBetween(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Comparable minValue, final Comparable maxValue)`                                    | Asserts that all values in the column are between `[minValue,maxValue]`.                                                                                                                                                                                                   |
| `assertMin(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Comparable min, final Comparable max, final String... groupByColumns)`                           | Asserts that the minimum value of the column is in the inclusive range `[min,max]`. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                                                                            |
| `assertMax(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Comparable min, final Comparable max, final String... groupByColumns)`                           | Asserts that the maximum value of the column is in the inclusive range `[min,max]`. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                                                                            |
| `assertAvg(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final double min, final double max, final String... groupByColumns)`                                   | Asserts that the average of the column is in the inclusive range `[min,max]`. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                                                                                  |
| `assertStd(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final double min, final double max, final String... groupByColumns)`                                   | Asserts that the standard deviation of the column is in the inclusive range `[min,max]`. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                                                                       |
| `assertPercentile(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final double percentile, final double min, final double max, final String... groupByColumns)`   | Asserts that the defined percentile of the column is in the inclusive range `[min,max]`. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                                                                       |
| `assertAscending(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final String... groupByColumns)`                                                                 | Asserts that sub-groups of a column have monotonically increasing values. Consecutive values within a group must be equal or increasing. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                       |
| `assertStrictlyAscending(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final String... groupByColumns)`                                                         | Asserts that sub-groups of a column have monotonically strictly increasing values. Consecutive values within a group must be increasing. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                       |
| `assertDescending(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final String... groupByColumns)`                                                                | Asserts that sub-groups of a column have monotonically decreasing values. Consecutive values within a group must be equal or decreasing. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                       |
| `assertStrictlyDescending(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final String... groupByColumns)`                                                        | Asserts that sub-groups of a column have monotonically strictly decreasing values. Consecutive values within a group must be decreasing. To use this rule on the whole data set, use the partitioning column as the sole value for `groupByColumns`.                       |
| `assertExpectedTableSize(final int partitionsBefore, final int partitionsAfter, final double min, final double max)`                                                                                               | Asserts the number of rows in the table is within a specified fraction of the expected size, determined by looking at the other tables in the database. For example, `min=0.8` and `max=1.2` would assert the table size is between 80% and 120% of the typical table size. |
| `assertCountEqual(final boolean removeNull, final boolean removeNaN, final boolean removeInf, final String column, final Object value1, final Object value2)`                                                      | Asserts that a column contains the same number of rows for two given values.                                                                                                                                                                                               |
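
Several of the rules above share the same semantics: an inclusive `[min,max]` range check on a fraction, or an ordering check on consecutive values. The following plain-Java sketch illustrates those semantics on in-memory arrays. It is not Deephaven's implementation and does not call any Deephaven API. The class name, method names, and the `typicalRows` input are hypothetical, chosen only for illustration.

```java
// Illustrative sketch only: hypothetical helpers that mimic the semantics described
// in the table above. None of these names are part of the Deephaven API.
class ValidationSemanticsSketch {

    // Mirrors the idea behind assertFracZero: the fraction of zero values
    // must fall in the inclusive range [min, max].
    // Assumption: an empty input is treated as a failure.
    public static boolean fracZeroInRange(double[] values, double min, double max) {
        if (values.length == 0) {
            return false;
        }
        long zeros = 0;
        for (double v : values) {
            if (v == 0.0) {
                zeros++;
            }
        }
        double frac = (double) zeros / values.length;
        return frac >= min && frac <= max;
    }

    // Mirrors the idea behind assertAscending: consecutive values must be
    // equal or increasing.
    public static boolean isAscending(long[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[i - 1]) {
                return false;
            }
        }
        return true;
    }

    // Mirrors the idea behind assertStrictlyAscending: consecutive values
    // must be increasing, so repeated values fail.
    public static boolean isStrictlyAscending(long[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i] <= values[i - 1]) {
                return false;
            }
        }
        return true;
    }

    // Mirrors the idea behind assertExpectedTableSize: the actual row count,
    // as a fraction of a typical row count, must fall in [min, max].
    // Assumption: typicalRows is supplied by the caller and is positive.
    public static boolean sizeWithinExpected(long actualRows, long typicalRows,
                                             double min, double max) {
        double ratio = (double) actualRows / typicalRows;
        return ratio >= min && ratio <= max;
    }

    public static void main(String[] args) {
        double[] prices = {0.0, 1.5, 0.0, 2.0};
        long[] sequence = {1, 2, 2, 3};

        System.out.println(fracZeroInRange(prices, 0.25, 0.5));
        System.out.println(isAscending(sequence));
        System.out.println(isStrictlyAscending(sequence));
        System.out.println(sizeWithinExpected(900, 1000, 0.8, 1.2));
    }
}
```

The difference between the ascending and strictly ascending variants is visible on the sample `sequence`: the repeated `2` is acceptable for the former but not for the latter.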

## Related documentation

- [PQ management](../../query-management/ui-queries.md)
- [Process Event Log](../../sys-admin/internal-tables/process-event-log.md)