Quality Scores¶

TestGen quality scores quantify the health of your data by measuring the ratio of clean data to total data. Scores are computed automatically from profiling results and test outcomes, giving you a single metric to track data quality over time. Use scorecards to define which slices of your data to score and monitor on the Quality Dashboard.

What quality scores measure¶

Quality scores combine results from two sources:

Profiling scores — Based on hygiene issues detected during profiling. Each anomaly has a prevalence (the actual or estimated fraction of records affected), and these are combined across columns to produce a score.
Testing scores — Based on failures detected during test execution. Each test failure's prevalence contributes to the score in the same way.

The overall quality score is the product of these two components: overall_score = profiling_score x testing_score. This means both profiling and testing must show strong results for the overall score to be high.

Key behaviors:

If you dismiss a hygiene issue or test failure, it no longer affects the score.
When a column has multiple issues, prevalences are combined probabilistically rather than summed, so overlapping issues do not inflate the score impact.
Monitor results are excluded from quality scores. Monitors track operational patterns separately — see Monitor Tables.

Scorecards¶

A scorecard is a saved definition that specifies which subset of your data to score and how to display the results. You can create multiple scorecards to track quality from different perspectives — for example, one scorecard per business domain, one for critical tables, or one for each data source.

Each scorecard can display:

Total Score — The overall quality score across all matching columns.
CDE Score — A separate score computed only from columns flagged as Critical Data Elements, highlighting the quality of your most important data.
Category breakdown — Score indicators for each value within a chosen category (such as Table Group, Business Domain, or Quality Dimension).

Scorecards appear as tiles on the Quality Dashboard, each showing live scores and a history sparkline. Scores refresh automatically after every profiling or test run.

(Quality Dashboard page showing scorecard tiles)

Getting started¶

To begin using quality scores, you need at least one completed profiling run and one completed test run for a table group. Once you have results, use the Score Explorer to explore score breakdowns and save a scorecard. See View Score Details for investigating trends and drilling into issues.

Tip

The Data Configuration Setup and Add Table Group wizards can create a scorecard for a table group as part of their setup flow.

Reference¶

How scores are calculated¶

At any level, the score represents the percentage of clean data points relative to the total number of data points at that level. After every profiling or test run, TestGen computes a data quality score starting at the column level and rolling up from there.

When a column has multiple issues, prevalences are combined probabilistically rather than summed, so overlapping issues do not inflate the score impact.
Each column's score is weighted by its record count when aggregated to higher levels.
A scorecard's overall score combines the profiling score and testing score.
The CDE score uses the same calculation but only includes columns flagged as Critical Data Element.

Score color thresholds¶

● Green (96%+) — Strong quality. Very few issues.
● Yellow (91-95%) — Warrants attention.
● Orange (86-90%) — Below expectations. Investigate.
● Red (below 86%) — Poor quality. Action needed.

Score breakdown categories¶

Scores can be broken down by 13 categories for analysis. Categories draw from metadata on your table groups, columns, and test types:

Column Name, Table Name, Table Group — Pinpoint which specific data assets have issues.
Quality Dimension — Break down by Accuracy, Completeness, Consistency, Timeliness, Uniqueness, or Validity.
Semantic Data Type — Group by profiling-inferred type (Email, Phone, Date, etc.).
Data Location, Data Source, Source System, Source Process — Group by data origin.
Business Domain, Stakeholder Group — Group by organizational ownership.
Transform Level, Data Product — Group by data pipeline stage or product.

Explore and Create Scorecards

View Score Details

Data Catalog