Skip to content

Location Balance Test Example

  • Use case: Verify that sales order data is valid.
  • Example recipe: the recipe uses an action node to load a global superstore data set to an Amazon RedShift database. A location balance test has been written to ensure that the percentage of invalid US ZIP codes is less than 1% of all order records, a maximum threshold deemed acceptable by the company.
  • Node type: action node.

Step 1: Define runtime variable

When configuring steps in the action node, write a SQL script to query a preconfigured table with invalid ZIP codes and compute the percentage.

Visual example in Automation

A scalar output from a SQL script records the percentage of invalid ZIP codes.

The action node step calculates the percentage of orders with invalid ZIP codes and stores it in the declared runtime variable.

(Screenshot of defining the variable)

Summary of how-to

  1. Open the node editor.
  2. From the Steps tab, use the radio button to select SQL Inline Query, then enter the SQL script.
  3. Define the Scalar Result.
  4. Select Add notes for changelog , describe the changes, then click Update.

Step 2: Configure test

Write a location balance test for the node, confirming the ZIP codes.

Visual example in Automation

The test verifies that the invalid ZIP codes do no represent more than 1% of the data set, that is, the system measures value of the defined variable to ensure it's not greater than 1.

(Screenshot of the Tests tab)

Summary of how-to

  1. In the same node, select the Tests tab.
  2. Create a location balance test.

    1. Click Add Test.
    2. Fill out the fields Test Name, Failure Action, Test Logic, and Control Value.

      Where Test Logic is Compare variable against metric and the Test Variable is the runtime variable scalar result from Step 1.

  3. Select Add notes for changelog , describe the changes, then click Update.

File contents

The system records the test configurations shown above in the actions/load-and-cleanse-raw.json file (named after the action node name).

{
    "name": "load-and-cleanse-raw",
    "type": "DKDataSource_PostgreSQL",
    "config-ref": "redshiftConfig",
    "keys": {
        . . .

        },
        "Compute_Percent_Invalid_US_Zip_Codes": {
            "description": "Compute and store the percentage of new orders with invalid zip codes for use in testing.",
            "sql": "SET search_path TO {{schema_name}};\n\nSELECT (100.0 * (SELECT COUNT(*) FROM demo_superstore_orders_bad_us_zip) / (SELECT COUNT(*) FROM new_raw_demo_superstore_orders))::DECIMAL(10,2);",
            "query-type": "execute_scalar",
            "set-runtime-vars": {
                "result": "percent_invalid_us_zip_codes"
            }
        },
        . . .
    },
    },
    "tests": {
        . . .

        },
        "QA_Percent_Invalid_US_Zip_Codes": {
            "action": "warning",
            "test-variable": "percent_invalid_us_zip_codes",
            "type": "test-contents-as-float",
            "test-logic": {
                "test-compare": "less-than",
                "test-metric": 1
            }
        },
        . . .
        }
    }
}