Data Hygiene Issues

This reference covers the 32 data hygiene issues that TestGen detects during a profiling run. Hygiene issues are structural inconsistencies and content anomalies — such as data type mismatches, non-standard blank values, and potential PII — that may indicate data quality problems.

To review and act on hygiene issues detected in your data, see Investigate Profiling Results.

Likelihood scale

Each hygiene issue is assigned a likelihood rating that indicates how strongly the detected pattern suggests an actual data problem:

  • Definite — Indicates a data problem.
  • Likely — Typically indicates a data problem.
  • Possible — A speculative finding that often indicates problems but may be benign.
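
As a rough illustration of how the scale can drive triage, the sketch below orders findings from strongest to weakest rating. It is not TestGen code: the `Likelihood` enum, the `triage` helper, and the sample findings are all hypothetical, and only the three rating names come from the scale above.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Hypothetical ordering of the three ratings; a higher value means stronger evidence."""
    POSSIBLE = 1
    LIKELY = 2
    DEFINITE = 3


# Made-up findings for illustration only; not real TestGen output.
findings = [
    {"column": "notes", "likelihood": "Possible"},
    {"column": "order_date", "likelihood": "Definite"},
    {"column": "zip_code", "likelihood": "Likely"},
]


def triage(items):
    """Return findings sorted from strongest to weakest likelihood rating."""
    return sorted(
        items,
        key=lambda f: Likelihood[f["likelihood"].upper()],
        reverse=True,
    )


ordered = [f["column"] for f in triage(findings)]
print(ordered)  # ['order_date', 'zip_code', 'notes']
```

Reviewing in this order puts Definite findings first and leaves Possible findings, which may turn out to be benign, for last.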

Issue categories

| Category | Description |
|---|---|
| Data type and format | Column values don't match the expected type, format, or structural standard, including detection of sensitive data patterns. |
| Missing and incomplete data | Expected values are absent, blank, non-standard, or potentially duplicated. |
| Value and pattern consistency | Values are inconsistent within a column, across tables, or with expected value sets. |
| Unexpected content and dates | Column content doesn't match what the column name suggests, or date values fall outside expected ranges. |