Data Hygiene Issues
This reference covers the 32 data hygiene issues that TestGen detects during a profiling run. Hygiene issues are structural inconsistencies and content anomalies — such as data type mismatches, non-standard blank values, and potential PII — that may indicate data quality problems.
To review and act on hygiene issues detected in your data, see Investigate Profiling Results.
Likelihood scale
Each hygiene issue is assigned a likelihood rating that indicates how strongly the detected pattern suggests an actual data problem:
- Definite — Indicates a data problem.
- Likely — Typically indicates a data problem.
- Possible — A speculative finding that often indicates problems but may be benign.
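When you triage findings outside the tool, it can help to treat the three ratings as an ordered scale. The following is a minimal sketch under that assumption. The `Likelihood` enum, the `triage` helper, and the issue names are hypothetical and are not part of TestGen.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Hypothetical ordering of the three ratings; higher means stronger evidence."""
    POSSIBLE = 1
    LIKELY = 2
    DEFINITE = 3


def triage(findings):
    """Sort (issue_name, likelihood) pairs so the strongest findings come first."""
    return sorted(findings, key=lambda f: f[1], reverse=True)


# Illustrative findings; the issue names are made up for this example.
findings = [
    ("example_issue_a", Likelihood.POSSIBLE),
    ("example_issue_b", Likelihood.DEFINITE),
    ("example_issue_c", Likelihood.LIKELY),
]

for name, likelihood in triage(findings):
    print(f"{likelihood.name.title():<9} {name}")
```

Using an integer-backed enum keeps the ratings comparable, so Definite findings sort ahead of Likely and Possible ones without a custom comparison function.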
Issue categories
| Category | Description |
|---|---|
| Data type and format | Column values don't match the expected type, format, or structural standard — including detection of sensitive data patterns. |
| Missing and incomplete data | Expected values are absent, blank, non-standard, or potentially duplicated. |
| Value and pattern consistency | Values are inconsistent within a column, across tables, or with expected value sets. |
| Unexpected content and dates | Column content doesn't match what the column name suggests, or date values fall outside expected ranges. |