Data Catalog¶
Use the Data Catalog to investigate data quality issues, explore profiling results across tables and columns, and organize your data assets with metadata tags. The catalog brings together profiling statistics, hygiene issues, test results, PII detection, and quality scores for every table and column in a table group.

Prerequisites¶
- A table group configured for your database connection.
- At least one completed profiling run for that table group. Profiling discovers your tables and columns and populates the catalog with metadata and statistics.
The catalog automatically reflects the latest results as you run additional profiling and testing cycles. Run profiling regularly to keep your catalog current.
Explore your data¶
Select a Table Group from the dropdown at the top of the page. The left panel displays a hierarchical tree of tables and columns — click any item to view its details in the right panel.
Use Search to filter the tree by table or column name. For more targeted exploration, click the filter icon to narrow the tree to specific Critical Data Elements or metadata tags (such as Business Domain, Data Source, or Transform Level).
Investigate table and column quality¶
When you select a table or column, the detail panel shows its data quality scores (overall, profiling, and testing), along with any detected issues. Use this view to assess the health of specific data assets and decide where to focus remediation.
Spot issues¶
The detail panel surfaces three categories of issues:
- Potential PII — Columns where profiling detected patterns matching personally identifiable information, with PII type and risk level.
- Hygiene Issues — Data quality anomalies from profiling, categorized by likelihood (Definite, Likely, or Possible). See Data Hygiene Issues for details.
- Test Issues — Failures and warnings from the most recent test run for each associated test suite. Only confirmed results are shown; dismissed results are excluded.
Each issue category links to its full detail view for further investigation.
Review profiling results¶
For columns, the detail panel shows value distributions and profiling statistics tailored to the column's data type — including missing value breakdowns, frequency distributions, statistical summaries, and pattern analysis. For a full breakdown of what profiling collects for each column type, see Profiling Statistics.
You can also review a column's suggested data type when profiling detects that a more appropriate type exists (e.g., a varchar column that only contains integers).
Preview source data¶
Click Data Preview to retrieve up to 100 distinct sample rows directly from the source database without leaving the catalog. For tables, the preview shows all columns; for individual columns, it shows distinct values. This is useful for verifying what the actual data looks like when investigating an issue.
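To make the difference between the table preview and the column preview concrete, the sketch below runs both kinds of query against an in-memory SQLite database. It only illustrates "up to 100 distinct rows" versus "up to 100 distinct values". The `orders` table, its columns, and the queries are hypothetical, and this page does not document the queries the catalog actually issues.

```python
import sqlite3

# Hypothetical source table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "open"), (2, "open"), (3, "closed"), (3, "closed")],
)

# Table-level preview: distinct rows across all columns, capped at 100.
table_preview = conn.execute("SELECT DISTINCT * FROM orders LIMIT 100").fetchall()

# Column-level preview: distinct values of one column, capped at 100.
column_preview = conn.execute(
    "SELECT DISTINCT status FROM orders LIMIT 100"
).fetchall()

print(len(table_preview), "distinct rows;", len(column_preview), "distinct values")
```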
Track changes over time¶
Click History on a column's value distribution to compare profiling results across runs. Select any past profiling run to see the statistics for that point in time, making it easy to spot trends or regressions.
Generate a table CREATE script¶
For any table, you can generate a SQL CREATE TABLE statement that uses the suggested data types from profiling. Where the suggested type differs from the current database type, the script includes a comment noting the original type (e.g., -- WAS varchar(255)). This is useful for reviewing type optimization recommendations or generating migration scripts.
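As a rough illustration of the script shape described above, the following sketch renders a CREATE TABLE statement from a list of columns and adds a `-- WAS` comment wherever the suggested type differs from the current one. The `ColumnType` class, the `build_create_table` function, and the sample table are hypothetical; the exact layout of the script the catalog generates may differ.

```python
from dataclasses import dataclass


@dataclass
class ColumnType:
    name: str
    current_type: str    # type currently defined in the database
    suggested_type: str  # type suggested by profiling


def build_create_table(table: str, columns: list[ColumnType]) -> str:
    """Render a CREATE TABLE statement that uses the suggested types."""
    lines = []
    for i, col in enumerate(columns):
        comma = "," if i < len(columns) - 1 else ""
        line = f"    {col.name} {col.suggested_type}{comma}"
        # Note the original type only when the suggestion changes it.
        if col.suggested_type.lower() != col.current_type.lower():
            line += f"  -- WAS {col.current_type}"
        lines.append(line)
    return f"CREATE TABLE {table} (\n" + "\n".join(lines) + "\n);"


script = build_create_table(
    "orders",
    [
        ColumnType("order_id", "varchar(255)", "integer"),
        ColumnType("status", "varchar(20)", "varchar(20)"),
    ],
)
print(script)
```

In this example only `order_id` receives a `-- WAS varchar(255)` comment, because the type of `status` is unchanged.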
Organize and classify your data¶
The Data Catalog lets you annotate tables and columns with metadata tags, descriptions, and classification flags. These annotations help you classify your data assets, filter the catalog, and influence how quality scores are weighted.
Flag critical data elements¶
The Critical Data Element (CDE) flag marks tables and columns as high-priority data assets. When a scorecard is configured to track CDE scores, a separate quality score is computed using only CDE-flagged columns, giving you a focused view of data quality for your most important data. See Quality Scores for details.
CDE flags can be set in two ways:
- Automatic detection — When enabled on a table group, profiling identifies likely critical data elements based on column characteristics. Detected columns are flagged automatically.
- Manual flagging — Set the CDE flag directly on any table or column in the catalog, in bulk, or via CSV import.
The CDE flag supports three states: Yes, No, or Inherit (from the parent table).
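One way to think about the three states is as a small resolution rule: an explicit Yes or No on a column wins, and Inherit defers to the parent table. The sketch below models that reading. The `Flag` enum and `effective_cde` function are hypothetical names used for illustration, not part of the product.

```python
from enum import Enum


class Flag(Enum):
    YES = "Yes"
    NO = "No"
    INHERIT = "Inherit"


def effective_cde(column_flag: Flag, table_is_cde: bool) -> bool:
    """Resolve a column's CDE status from its own flag and its parent table."""
    if column_flag is Flag.INHERIT:
        return table_is_cde  # defer to the parent table
    return column_flag is Flag.YES  # an explicit Yes or No wins


print(effective_cde(Flag.INHERIT, table_is_cde=True))
print(effective_cde(Flag.NO, table_is_cde=True))
```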
Flag PII columns¶
The PII flag marks columns as containing personally identifiable information. PII flags can be set in two ways:
- Automatic detection — When enabled on a table group, profiling scans column names and data patterns for matches to known PII categories (such as SSNs, credit card numbers, email addresses, phone numbers, and national IDs). Detected columns are flagged automatically.
- Manual flagging — Set the PII flag directly on any column in the catalog, in bulk, or via CSV import.
PII-flagged column values are masked across the application for users without PII access, protecting sensitive data while still allowing data quality work to proceed.
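The automatic detection described above combines two signals: column names and data patterns. The sketch below shows the general idea with a handful of name hints and regular expressions. Everything in it (the hint lists, the patterns, the 80% match threshold, and the `guess_pii` function) is an assumption made for illustration and does not reproduce the product's actual detection rules.

```python
import re
from typing import Optional

# Illustrative hints and patterns only; real detection covers more categories.
NAME_HINTS = {
    "email": ("email", "e_mail"),
    "ssn": ("ssn", "social_security"),
    "phone": ("phone", "mobile"),
}
VALUE_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def guess_pii(column_name: str, sample_values: list[str],
              min_match: float = 0.8) -> Optional[str]:
    """Return a likely PII category, or None if nothing matches."""
    lowered = column_name.lower()
    # Signal 1: the column name contains a known hint.
    for pii_type, hints in NAME_HINTS.items():
        if any(hint in lowered for hint in hints):
            return pii_type
    # Signal 2: most non-empty sample values match a known pattern.
    values = [v for v in sample_values if v]
    if not values:
        return None
    for pii_type, pattern in VALUE_PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= min_match:
            return pii_type
    return None


print(guess_pii("customer_email", []))
print(guess_pii("col_a", ["123-45-6789", "987-65-4321"]))
```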
Flag excluded data elements¶
The Excluded Data Element (XDE) flag marks columns that should be excluded from profiling analysis and automated test generation. Use this for non-business columns (such as audit timestamps or internal IDs) that add noise to profiling results without providing data quality value.
Set the XDE flag directly on any column in the catalog, in bulk, or via CSV import.
Tag tables and columns¶
Select a table or column to assign metadata tags across eight categories (see Metadata tags below). Tag values support autocomplete based on values already used across your catalog to help maintain consistency.
You can also add a free-text Description to provide business context or usage notes.
Tip
Tags set on a table group are inherited by its tables, and tags set on a table are inherited by its columns. You can override any inherited value at a lower level, so you can tag once at the table group level and override only where needed.
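The inheritance rule in the tip can be sketched as a layered lookup in which the most specific level that sets a tag wins. The `resolve_tags` function below is hypothetical, and it assumes that an empty value means "not set at this level".

```python
def resolve_tags(group_tags: dict, table_tags: dict, column_tags: dict) -> dict:
    """Merge tags from table group to table to column; lower levels override."""
    resolved = {}
    for level in (group_tags, table_tags, column_tags):
        for tag, value in level.items():
            if value:  # assumption: empty or None means "inherit"
                resolved[tag] = value
    return resolved


tags = resolve_tags(
    group_tags={"business_domain": "Finance", "transform_level": "Raw"},
    table_tags={"transform_level": "Conformed"},
    column_tags={"business_domain": ""},
)
print(tags)
```

Here the column keeps the table group's `business_domain` and picks up the table's overriding `transform_level`.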
Edit metadata in bulk¶
To update metadata across multiple tables and columns at once:
- Toggle on Edit multiple in the tree toolbar.
- Select the tables and columns you want to update. Selecting a table automatically includes all its columns.
- In the batch edit form, check the fields you want to change and enter the new values. Unchecked fields keep their current values.
- Click Save.
Import metadata from CSV¶
To update metadata across many tables and columns at once, import a CSV file. The easiest way to build one is to export your existing metadata as a Metadata CSV and use it as a template. Then import the edited file:
- Click Import in the catalog toolbar.
- Select a CSV file. The file should include a `table` column, a `column` column, and any combination of metadata fields to update: `description`, `cde`, `pii`, `xde`, and the eight metadata tags (`data_source`, `source_system`, `source_process`, `business_domain`, `stakeholder_group`, `transform_level`, `aggregation_level`, `data_product`).
- Choose how to handle blank cells: Keep leaves existing values unchanged, while Clear sets them to empty.
- Review the preview. The import validates your data and flags any issues — duplicate rows, unrecognized values for boolean fields (CDE, PII, XDE), and values that exceed length limits.
- Click Import to apply the changes.
Only rows that match existing tables and columns in the selected table group are imported. Unmatched rows are skipped.
Note
The PII column in the CSV is ignored for users without PII access. All other metadata fields in the file are still imported normally.
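Before uploading a large file, it can help to run a local pre-check that mirrors some of the validations listed in the import steps: required headers, duplicate rows, and unrecognized boolean values. The sketch below is a minimal version of such a check. The set of accepted boolean spellings is an assumption (this page does not list them), and length limits are not checked because they are not documented here.

```python
import csv
import io

BOOLEAN_FIELDS = ("cde", "pii", "xde")
# Assumption: the accepted spellings are not documented on this page.
ASSUMED_BOOLEAN_VALUES = {"", "yes", "no", "true", "false"}


def precheck(csv_text: str) -> list[str]:
    """Return a list of problems found in a metadata CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    problems = [
        f"missing required header: {h}"
        for h in ("table", "column")
        if h not in headers
    ]
    if problems:
        return problems
    seen = set()
    # Line numbers assume one record per line (no multi-line fields).
    for line_no, row in enumerate(reader, start=2):
        key = (row["table"], row["column"])
        if key in seen:
            problems.append(f"line {line_no}: duplicate row for {key[0]}.{key[1]}")
        seen.add(key)
        for field in BOOLEAN_FIELDS:
            value = (row.get(field) or "").strip().lower()
            if value not in ASSUMED_BOOLEAN_VALUES:
                problems.append(
                    f"line {line_no}: unrecognized {field} value {row[field]!r}"
                )
    return problems


sample = (
    "table,column,cde,pii\n"
    "orders,order_id,Yes,No\n"
    "orders,order_id,No,No\n"
    "customers,email,maybe,Yes\n"
)
for problem in precheck(sample):
    print(problem)
```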
Export catalog data¶
Click Export to download catalog data. Two export formats are available:
- Excel — A full report with CDE/PII/XDE status, data types, profiling statistics, value distributions, descriptions, and tag values. You can scope the export to all columns, only the columns matching your current filters, or only selected columns (when multi-edit mode is active).
- Metadata CSV — A lightweight CSV containing only metadata: descriptions, CDE/PII/XDE flags, and tag values. This format is compatible with CSV import, so you can export, edit in a spreadsheet, and re-import.
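The export, edit, and re-import round trip can also be scripted instead of done in a spreadsheet. The sketch below sets one tag for every row of a given table in an exported Metadata CSV and returns the edited text. The `set_tag` function and the sample data are hypothetical; the header names (`table`, `column`, `description`, `business_domain`) follow the field list given in the import section.

```python
import csv
import io


def set_tag(csv_text: str, table: str, tag: str, value: str) -> str:
    """Set one tag for all rows of a table and return the edited CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if tag not in fieldnames:
        fieldnames.append(tag)  # add the tag column if the export lacks it
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["table"] == table:
            row[tag] = value
        writer.writerow(row)  # untouched rows pass through unchanged
    return out.getvalue()


exported = (
    "table,column,description,business_domain\n"
    "orders,order_id,Order key,\n"
    "customers,email,,Sales\n"
)
edited = set_tag(exported, "orders", "business_domain", "Finance")
print(edited)
```

When re-importing a file like this, the Keep option for blank cells leaves existing values in place for any cell left empty.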
Reference¶
Metadata tags¶
| Tag | Description |
|---|---|
| Data Source | Original source of the dataset |
| Source System | Enterprise system source for the dataset |
| Source Process | Process, program, or data flow that produced the dataset |
| Business Domain | Business division responsible for the dataset, e.g., Finance, Sales, Manufacturing |
| Stakeholder Group | Data owners or stakeholders responsible for the dataset |
| Transform Level | Data warehouse processing stage, e.g., Raw, Conformed, Processed, Reporting, or Medallion level (Bronze, Silver, Gold) |
| Aggregation Level | Data granularity of the dataset, e.g., Atomic, Historical, Snapshot, Aggregated, Time-Rollup, Rolling, Summary |
| Data Product | Data domain that comprises the dataset |