Consistency Tests
Consistency tests verify that data values remain stable and aligned with established baselines or business rules. They detect drift in statistical measures, unexpected changes to constant or categorical values, and violations of user-defined conditions — catching problems that emerge when data stops agreeing with itself or with prior known-good states.
Average Shift
Tests for a statistically significant shift in the column mean compared to the average calculated at baseline.
Auto-generated for numeric measurement columns with more than 10 distinct values.
| Scope | Column |
| Measures | Cohen's D Difference (0.20 = small, 0.50 = moderate, 0.80 = large, 1.20 = very large) |
| Threshold | Maximum acceptable Cohen's D Difference (default: 0.5) |
| On failure | Standardized difference between averages is over the selected threshold level. |
| Default severity | Warning |
How it works: Computes Cohen's D — the standardized difference between the current column average and the baseline average, divided by the pooled standard deviation of the two samples. The test fails when this difference exceeds the threshold.
When to use: Average Shift is useful for any numeric measure where you expect the mean to remain stable over time. A significant shift may indicate errors in processing, differences in source data, or valid changes that could impact downstream assumptions. You may want to periodically adjust the baseline average if legitimate drift is expected.
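TestGen performs this calculation in SQL against stored baseline statistics; the following is a minimal Python sketch of the same Cohen's D arithmetic, using made-up sample data:

```python
import statistics

def cohens_d(current, baseline):
    """Standardized difference between two sample means (Cohen's D)."""
    mean_c, mean_b = statistics.mean(current), statistics.mean(baseline)
    sd_c, sd_b = statistics.stdev(current), statistics.stdev(baseline)
    pooled_sd = ((sd_c ** 2 + sd_b ** 2) / 2) ** 0.5  # pooled standard deviation
    return abs(mean_c - mean_b) / pooled_sd

baseline = [100, 102, 98, 101, 99, 100]   # illustrative baseline values
current = [104, 106, 103, 105, 104, 106]  # illustrative current values
d = cohens_d(current, baseline)
# The test fails when d exceeds the threshold (default 0.5)
```

With these illustrative samples the shift is large (d > 0.5), so the test would fail at the default threshold.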
Constant Match
Tests that all values in the column still match the single constant value identified during baseline profiling.
Auto-generated when profiling finds a column with only one distinct value.
| Scope | Column |
| Measures | Count of values not matching the constant |
| Threshold | Maximum acceptable count of mismatched values (default: 0) |
| On failure | Column values do not match expected constant value. |
| Default severity | Fail |
How it works: Counts every record where the column value differs from the expected constant. The test fails when this count exceeds the threshold.
When to use: This test is valuable for columns that should always contain a fixed value — such as a default status, a country code for a single-region dataset, or a version identifier. Sometimes new data reveals that a value is not truly constant, even though only one value was present at profiling time. In that case, disable this test and consider using Value Match instead.
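The check itself reduces to a single count of rows that differ from the profiled constant. A trivial Python sketch, with a hypothetical constant and sample column:

```python
expected_constant = "US"  # hypothetical constant found at baseline profiling
column_values = ["US", "US", "CA", "US", "US"]

# Count every record whose value differs from the expected constant
mismatch_count = sum(1 for v in column_values if v != expected_constant)
# mismatch_count == 1, which fails at the default threshold of 0
```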
Value Match
Tests that all column values appear in the expected list of values identified during baseline profiling.
Auto-generated when profiling finds a Boolean, Code, or Category column with 2 to 10 distinct values.
| Scope | Column |
| Measures | Count of records with values not in the expected list |
| Threshold | Maximum acceptable count of non-matching records (default: 0) |
| On failure | Values not matching expected list of values from baseline. |
| Default severity | Fail |
How it works: Counts every record where the column value (excluding NULLs and empty strings) does not appear in the expected value list. The test fails when this count exceeds the threshold.
When to use: This test is relevant when the list of allowable values is small and not expected to change often. Even if new values might occasionally be added, it provides an early warning that downstream assumptions and logic may need to change. The expected value list is populated automatically from profiling results but can be edited.
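A Python sketch of the membership check, including the exclusion of NULLs and empty strings described above (the value list and data are illustrative):

```python
expected_values = {"active", "inactive", "pending"}  # from baseline profiling
column_values = ["active", None, "pending", "", "archived", "active"]

# NULLs and empty strings are excluded before checking list membership
non_matching = sum(
    1 for v in column_values
    if v is not None and v != "" and v not in expected_values
)
# non_matching == 1 ("archived"), which fails at the default threshold of 0
```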
Value Match All
Tests that column values exactly match a predefined list — every value must be in the list, and every list value must appear at least once.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Column |
| Measures | Pipe-delimited string of all distinct values found |
| Threshold | Expected pipe-delimited string of all values (in sorted order) |
| On failure | Column values do not exactly match expected list of values. |
| Default severity | Fail |
How it works: Collects all distinct column values into a sorted, pipe-delimited string and compares it to the expected string. The test fails if the two strings differ — meaning either an unexpected value appeared or an expected value is missing.
When to use: This is a stricter form of Value Match. While Value Match only checks that no unexpected values exist, Value Match All also verifies that every expected value is represented at least once. This is appropriate for tables where all category values should appear in each data refresh.
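The exact-match comparison can be sketched as a string equality test on sorted, pipe-delimited value signatures (illustrative data):

```python
def value_signature(values):
    """Sorted, pipe-delimited string of distinct non-null values."""
    return "|".join(sorted({v for v in values if v is not None}))

expected = "high|low|medium"
current = value_signature(["low", "high", "low", "high"])  # "medium" never appears
passed = (current == expected)
# current == "high|low", so the test fails: an expected value is missing
```

Note that both an unexpected extra value and a missing expected value change the signature string, so either case fails the test.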
Decimal Truncation
Tests that the sum of fractional (decimal) values in a numeric column has not decreased compared to baseline, which could indicate truncation.
Auto-generated when profiling finds a numeric column with non-zero fractional values in a cumulative table.
| Scope | Column |
| Measures | Sum of all fractional parts of column values |
| Threshold | Minimum expected sum of fractional values (default: rounded baseline fractional sum) |
| On failure | Sum of fractional values is under baseline sum, which may indicate decimal truncation. |
| Default severity | Fail |
How it works: Extracts the fractional part of each value (the portion after the decimal point), sums them, and compares against the baseline threshold. The test fails when the current sum falls below the expected minimum.
When to use: This test catches cases where decimal precision is lost — for example, when a float column is inadvertently cast to integer, or when an ETL process truncates decimal places. It is most appropriate for cumulative datasets where prior values should remain unchanged.
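A small Python sketch of the fractional-sum comparison, with made-up values showing how a cast to integer drives the sum to zero:

```python
def fractional_sum(values):
    """Sum of the fractional part (portion after the decimal point) of each value."""
    return sum(abs(v) - int(abs(v)) for v in values)

baseline = [10.25, 3.5, 7.75]          # fractional sum at baseline = 1.5
current_truncated = [10.0, 3.0, 7.0]   # decimals lost, fractional sum = 0.0

# The test fails when the current sum falls below the baseline minimum
fails = fractional_sum(current_truncated) < fractional_sum(baseline)
```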
Timeframe No Drops
Tests that column values present in the most recent time window include at least all values from the prior time window of the same duration.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of missing value combinations |
| Threshold | Expected count of missing value combinations (default: 0) |
| On failure | Column values in most recent time window do not include all values in prior time window. |
| Default severity | Fail |
How it works: Divides a table into two consecutive time windows based on a date column and a window length in days. Collects the distinct values (or value combinations) from each window and identifies any that appear in the earlier window but not the later one. The test fails when the count of dropped values exceeds the threshold.
When to use: Use this test to confirm that codes or categories are not lost across successive time periods in a transactional table. Missing values in the latest window trigger a failure; new values are permitted. Specify one or more categorical columns to compare — avoid using continuous measurements.
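The windowing and comparison logic can be sketched in Python as a set difference between two consecutive date windows (dates, codes, and the 7-day window are illustrative):

```python
from datetime import date, timedelta

window_days = 7
window_end = date(2024, 1, 14)
split = window_end - timedelta(days=window_days)      # start of latest window
prior_start = split - timedelta(days=window_days)     # start of prior window

rows = [  # (transaction_date, status_code) -- illustrative data
    (date(2024, 1, 3), "A"), (date(2024, 1, 5), "B"),
    (date(2024, 1, 9), "A"), (date(2024, 1, 12), "C"),
]

prior = {code for d, code in rows if prior_start <= d < split}
latest = {code for d, code in rows if split <= d <= window_end}
dropped = prior - latest  # values present in the prior window but missing now
# dropped == {"B"}: one dropped value, which fails at the default threshold of 0
```

New values in the latest window (here, "C") do not count against the test.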
Timeframe Match
Tests that column values in the most recent time window exactly match those in the prior time window of the same duration.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of mismatched value combinations |
| Threshold | Expected count of non-matching value combinations (default: 0) |
| On failure | Column values do not match in most recent time windows. |
| Default severity | Fail |
How it works: Divides a table into two consecutive time windows based on a date column and a window length in days. Collects the distinct values (or value combinations) from each window and identifies any differences — values added or removed. The test fails when the count of mismatched values exceeds the threshold.
When to use: This is a stricter form of Timeframe No Drops. Both new and missing values in the latest window trigger a failure. Use this to confirm the consistency of codes or categories across successive time periods in a transactional table.
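Given the distinct value sets from the two windows, the comparison amounts to a symmetric difference; a sketch with illustrative code sets:

```python
prior = {"A", "B"}           # distinct codes in the earlier window
latest = {"A", "C"}          # distinct codes in the most recent window

# Symmetric difference: values dropped from the prior window OR newly added
mismatched = prior ^ latest
# mismatched == {"B", "C"}: 2 mismatches, fails at the default threshold of 0
```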
Custom Condition
A user-defined test that evaluates a SQL expression against each record in the table and counts how many records match the error condition.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Custom |
| Measures | Count of records matching the error condition |
| Threshold | Maximum acceptable error count (default: 0) |
| On failure | Values found not matching defined condition. |
| Default severity | Fail |
How it works: Evaluates a user-defined SQL expression against every record in the table. The expression should return TRUE when a record is in error. The test counts all matching records and fails when that count exceeds the threshold.
When to use: Custom Condition is ideal for business-rule checks that TestGen cannot automatically infer, especially conditions that involve multiple columns in the same record. For example, to enforce quantity_shipped <= quantity_ordered or end_date >= start_date, you would define the corresponding error conditions quantity_shipped > quantity_ordered and end_date < start_date. Because it runs as part of the aggregate test batch, performance remains fast.
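In TestGen the error condition is a SQL expression; the Python sketch below mimics the same counting behavior for a hypothetical over-shipment rule:

```python
orders = [  # illustrative records
    {"quantity_ordered": 10, "quantity_shipped": 8},
    {"quantity_ordered": 5,  "quantity_shipped": 7},  # over-shipped: an error
    {"quantity_ordered": 3,  "quantity_shipped": 3},
]

# Equivalent of the SQL error condition "quantity_shipped > quantity_ordered",
# which returns TRUE when a record is in error
error_count = sum(
    1 for r in orders if r["quantity_shipped"] > r["quantity_ordered"]
)
# error_count == 1, which fails at the default threshold of 0
```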
Aggregate Balance
Tests that aggregate values per grouping match exactly between the source table and a reference table.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of mismatched group totals |
| Threshold | Expected count of group totals not matching aggregate value (default: 0) |
| On failure | Aggregate measure per set of column values does not exactly match reference dataset. |
| Default severity | Fail |
How it works: Rolls up an aggregate expression (such as SUM or COUNT) by grouping columns in both the source and reference tables, then compares the totals for each group. Any group where the values differ — or where a group exists in one table but not the other — counts as a mismatch.
When to use: Use this test to verify that two tables produce identical aggregate results — for example, confirming that a summary table exactly matches its source detail table. Because the comparison is exact, even small rounding differences or new categories will trigger a failure.
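With the rolled-up totals in hand, the comparison is a per-group equality check over the union of groups; a sketch with hypothetical region totals:

```python
source = {"East": 1200, "West": 800}      # e.g. SUM(amount) by region, detail table
reference = {"East": 1200, "West": 750}   # same rollup from the summary table

# A group counts as a mismatch if its totals differ, or if it exists in
# only one of the two tables (dict.get returns None for a missing group)
groups = set(source) | set(reference)
mismatches = sum(1 for g in groups if source.get(g) != reference.get(g))
# mismatches == 1 ("West"), which fails at the default threshold of 0
```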
Aggregate Balance Percent
Tests that aggregate values per grouping fall within a percentage range of the reference table values.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of mismatched group totals |
| Threshold | Expected count of group totals outside percentage range (default: 0) |
| On failure | Aggregate measure per set of column values is outside percent range of reference dataset. |
| Default severity | Fail |
How it works: Rolls up an aggregate expression by grouping columns in both tables, then checks whether each group's value in the source table falls within a lower and upper percentage tolerance of the corresponding reference value. Groups outside this range count as mismatches.
When to use: Use this test when you expect two datasets to be similar but not identical — for example, comparing sales by product between two time periods. This is more forgiving than Aggregate Balance while still catching significant deviations.
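A sketch of the tolerance check, assuming a symmetric 5% band (the tolerances and totals are illustrative):

```python
lower_pct, upper_pct = 0.05, 0.05  # lower and upper tolerance as fractions

source = {"East": 1200, "West": 800}
reference = {"East": 1180, "West": 700}

def within_pct(src, ref):
    """True when src falls inside the percentage band around ref."""
    return ref * (1 - lower_pct) <= src <= ref * (1 + upper_pct)

mismatches = sum(
    1 for g in set(source) | set(reference)
    if g not in source or g not in reference
    or not within_pct(source[g], reference[g])
)
# "East" is within 5% of 1180; "West" (800 vs 700) is not -> mismatches == 1
```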
Aggregate Balance Range
Tests that aggregate values per grouping fall within a fixed numeric range of the reference table values.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of mismatched group totals |
| Threshold | Expected count of group totals outside numeric range (default: 0) |
| On failure | Aggregate measure per set of column values is outside expected range of reference dataset. |
| Default severity | Fail |
How it works: Rolls up an aggregate expression by grouping columns in both tables, then checks whether each group's value in the source table falls within a fixed lower and upper tolerance of the reference value. The tolerances are expressed as constant values, not percentages.
When to use: Use this test when acceptable variation is best expressed as a fixed amount rather than a percentage — for example, when comparing weekly totals where a difference of up to 10,000 units is acceptable regardless of the baseline size.
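The only difference from the percentage variant is that the band is a constant; a sketch with a hypothetical tolerance of 50 units:

```python
lower, upper = 50, 50  # fixed tolerances in the same units as the measure

source = {"week1": 10_020, "week2": 9_500}
reference = {"week1": 10_000, "week2": 9_400}

out_of_range = sum(
    1 for g in source
    if not (reference[g] - lower <= source[g] <= reference[g] + upper)
)
# week1 is within 10_000 +/- 50; week2 (9_500 vs 9_400) is not -> out_of_range == 1
```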
Aggregate Minimum
Tests that aggregate values per grouping in the source table are at least as large as those in the reference table.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of groups below reference value |
| Threshold | Expected count of group totals below reference aggregate (default: 0) |
| On failure | Aggregate measure per set of column values is not at least the same as reference dataset. |
| Default severity | Fail |
How it works: Rolls up an aggregate expression by grouping columns in both tables, then checks that each group's value in the source table meets or exceeds the reference value. Groups where the source value is lower count as failures.
When to use: Use this test to confirm that aggregate values have not dropped between an older and newer version of a cumulative dataset — values may rise but should never fall. Both tables must be accessible at the same time.
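A sketch of the floor check for a cumulative dataset, with made-up yearly totals:

```python
current = {"2023": 5000, "2024": 4800}   # totals in the newer snapshot
previous = {"2023": 5000, "2024": 4900}  # totals in the older (reference) snapshot

# Each group must meet or exceed its reference value; a missing group counts as 0
below = sum(1 for g in previous if current.get(g, 0) < previous[g])
# "2024" dropped from 4900 to 4800 -> below == 1, failing the default threshold of 0
```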
Reference Match
Tests that column values or combinations found in the source table also exist in a reference table.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Count of missing values |
| Threshold | Expected count of non-matching value combinations (default: 0) |
| On failure | Column value combinations are not found in reference table values. |
| Default severity | Fail |
How it works: Retrieves the distinct values (or value combinations) from the source table and removes any that also appear in the reference table. The remaining unmatched values are counted. The test fails when this count exceeds the threshold.
When to use: Use this test to verify referential integrity between related tables — for example, confirming that every product code in a fact table exists in the product dimension. Both tables must be accessible at the same time.
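The check reduces to a set difference between the source values and the reference values; a sketch using the fact/dimension example above with hypothetical product codes:

```python
fact_product_codes = {"P100", "P200", "P300", "P999"}  # distinct codes in fact table
dim_product_codes = {"P100", "P200", "P300"}           # codes in product dimension

# Source values with no match in the reference table
unmatched = fact_product_codes - dim_product_codes
# unmatched == {"P999"}: 1 missing value, fails at the default threshold of 0
```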
Distribution Shift
Tests that the probability distribution of values across categories is consistent with a reference dataset.
This test is not auto-generated. You can create it manually from the Test Definitions page.
| Scope | Referential |
| Measures | Divergence level (0 = identical, 1 = maximum divergence) |
| Threshold | Expected maximum divergence level between 0 and 1 |
| On failure | Divergence between two distributions exceeds specified threshold. |
| Default severity | Warning |
How it works: Counts records per category in both the source and reference tables, converts the counts to proportional distributions, and measures the difference using Jensen-Shannon divergence. A score of 0 means the distributions are identical; a score of 1 means they share no common structure. The test fails when the divergence exceeds the threshold.
When to use: Use this test when two datasets should have a similar spread across categories but are not expected to match exactly. The divergence score is also useful as a stability metric over time — even if two distributions are never identical, you would expect a consistent divergence score from period to period.
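TestGen computes the divergence in SQL; a compact Python sketch of the Jensen-Shannon divergence (base-2 logarithm, so the score stays in [0, 1]) over category counts:

```python
from math import log2

def js_divergence(counts_a, counts_b):
    """Jensen-Shannon divergence between two category-count dicts (base 2)."""
    cats = set(counts_a) | set(counts_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    # Convert raw counts to proportional distributions over all categories
    p = [counts_a.get(c, 0) / total_a for c in cats]
    q = [counts_b.get(c, 0) / total_b for c in cats]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # midpoint distribution

    def kl(x, y):  # Kullback-Leibler divergence, skipping zero-probability terms
        return sum(xi * log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Identical distributions score 0; fully disjoint distributions score 1
same = js_divergence({"a": 10, "b": 30}, {"a": 5, "b": 15})
disjoint = js_divergence({"a": 10}, {"b": 10})
```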