Test Patterns and Examples

Tests can be built into any workflow to ensure that both code and data are error-free.

Every processing or transformation step should include tests that check inputs, evaluate results against business logic, and measure outputs against expected results.

Tests to verify inputs

These tests confirm the accuracy, completeness, and consistency of the data fed into an analytics process, which helps keep bad source data out of the pipeline.

  • Count verification: Checks that row counts, record counts, or file sizes fall within an expected range. This is a simple count of incoming data, whereas location balance tests check that data sets remain intact through the entire pipeline.
  • Recency: Checks that data is up to date.
    • For example, the latest sales order date in a data file is no more than one week old.
    • See Recency Test Example.
  • Conformity: Checks that data is in an expected format.
    • For example, US ZIP codes are five digits.
  • Historical balance: Checks that data changes fall into an expected trend by comparing current data to previous or predicted values.
    • For example, the number of prospects always increases, or order volume does not rise or drop unexpectedly.
    • See the example in Historical Comparisons.
  • Location balance: Checks that data properties (counts, dimensions, facts) do not vary by more than an acceptable range at each stage of the pipeline and alerts the team of possible data loss.
    • For example, the customer count is always more than a base value, or the number of rows in a data set remains the same through multiple transformations.
    • See Location Balance Test Example.
  • Time balance: Measures multiple aspects of the data pipeline to identify negative trends. These tests are more commonly referred to as statistical process control (SPC).
  • Consistency: Checks that data follows specific norms.
    • For example, transaction dates are in the past, end dates are later than start dates, or body temperature is around 98.6°F/37°C.
  • Field validation: Checks that required fields are present and correctly entered.
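
As a rough illustration of several of the input tests listed above (count verification, recency, conformity, and consistency), here is a minimal Python sketch. It assumes the incoming data has already been loaded as a list of dictionaries. The function names, field names, thresholds, and sample rows are all hypothetical and are not part of any product API.

```python
import re
from datetime import date, timedelta

# Five digits only; ZIP+4 values are deliberately rejected in this sketch.
ZIP_PATTERN = re.compile(r"\d{5}")


def count_in_range(rows, low, high):
    """Count verification: the number of rows falls within [low, high]."""
    return low <= len(rows) <= high


def is_recent(rows, date_field, max_age, today):
    """Recency: the latest date in the data is no older than max_age.

    Assumes rows is non-empty; max() raises ValueError otherwise.
    """
    latest = max(row[date_field] for row in rows)
    return today - latest <= max_age


def zips_conform(rows, zip_field):
    """Conformity: every ZIP code is exactly five digits."""
    return all(ZIP_PATTERN.fullmatch(row[zip_field]) is not None for row in rows)


def dates_consistent(rows, start_field, end_field):
    """Consistency: every end date is later than its start date."""
    return all(row[end_field] > row[start_field] for row in rows)


# Made-up sample data, for illustration only.
orders = [
    {"order_date": date(2024, 3, 1), "ship_date": date(2024, 3, 3), "zip": "02139"},
    {"order_date": date(2024, 3, 4), "ship_date": date(2024, 3, 6), "zip": "94105"},
]
today = date(2024, 3, 8)  # fixed so the example is reproducible

results = {
    "count": count_in_range(orders, 1, 1000),
    "recency": is_recent(orders, "order_date", timedelta(days=7), today),
    "conformity": zips_conform(orders, "zip"),
    "consistency": dates_consistent(orders, "order_date", "ship_date"),
}
print(results)
```

Each check returns a plain boolean, so a pipeline step could fail, warn, or log depending on which entries in `results` are false.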

Tests against business logic

These tests confirm that the data matches business assumptions and analytic methods, which verifies analytic operations.

  • Data validation: Checks that data is distributed or transformed properly by analytic tools and processes.
    • For example, each customer exists in a dimension table, or at least 90% of the data matches entries in a dimension table.
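
The dimension-table example above can be sketched in Python as a match-rate check. This is only an illustration under stated assumptions: the fact rows are dictionaries, the dimension keys fit in an in-memory set, and the names `match_rate`, `passes_data_validation`, and `customer_id` are hypothetical.

```python
def match_rate(fact_rows, dimension_keys, key_field):
    """Return the fraction of fact rows whose key appears in the dimension."""
    if not fact_rows:
        # Assumption: an empty input counts as fully matched.
        return 1.0
    matched = sum(1 for row in fact_rows if row[key_field] in dimension_keys)
    return matched / len(fact_rows)


def passes_data_validation(fact_rows, dimension_keys, key_field, min_rate=0.9):
    """Data validation: at least min_rate of the rows match the dimension."""
    return match_rate(fact_rows, dimension_keys, key_field) >= min_rate


# Made-up sample data: ten orders, nine of which reference a known customer.
customer_dimension = {f"C{n}" for n in range(1, 10)}  # C1 .. C9
orders = [{"customer_id": f"C{n}"} for n in range(1, 11)]  # C1 .. C10

rate = match_rate(orders, customer_dimension, "customer_id")
print(f"match rate: {rate:.0%}")
print("90% threshold:", passes_data_validation(orders, customer_dimension, "customer_id"))
print("100% threshold:", passes_data_validation(orders, customer_dimension, "customer_id", min_rate=1.0))
```

Setting `min_rate=1.0` expresses the stricter form of the test, where every customer must exist in the dimension table.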

Tests to verify outputs

These tests confirm that the output data matches business assumptions, which verifies that a pipeline stage executed correctly.

  • Completeness: Checks that data trends as expected. Completeness tests are historical balance tests for analytic outputs.
    • For example, the number of customer prospects should increase over time.
  • Range verification: Checks that data falls within expected boundaries. Range tests are conformity tests for analytic outputs.
    • For example, the number of physicians in the US is less than 1.5 million.
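
The two output tests above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it treats "increases over time" as "never decreases", uses a half-open range so that the upper bound itself fails, and all names and sample values are hypothetical.

```python
def is_non_decreasing(values):
    """Completeness: each value is at least as large as the one before it."""
    return all(earlier <= later for earlier, later in zip(values, values[1:]))


def within_bounds(value, low, high):
    """Range verification: low <= value < high (the upper bound is exclusive)."""
    return low <= value < high


# Made-up outputs from a hypothetical pipeline run, for illustration only.
prospect_counts = [120, 135, 135, 160]  # one value per reporting period
physician_count = 1_000_000

print("completeness:", is_non_decreasing(prospect_counts))
print("range:", within_bounds(physician_count, 0, 1_500_000))
```

If prospects must strictly grow from period to period, the comparison in `is_non_decreasing` would change from `<=` to `<`.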

Complex testing

See Custom Logic Expressions to get started with complex testing.