Skip to content

Mapping File to Database Example

  • Use Case: Pipelines commonly require that input from file-based sources be loaded to a database in order to support analytics queries.
  • Example Recipe: In this example, the recipe gets a file of sales data from an SFTP server and loads it to Azure Data Lake Storage. Then, it creates a schema and table in an MS SQL Server database and loads the file to that table.
  • DataMapper Role: A DataMapper node gets the data from ADLS and loads it to the MS SQL instance.

()

A DataMapper node moves data from the data lake to a database table.

Web App Configuration

The following images show how the node is configured using the web app forms.

  • Connections to the data source and data sink
  • Mapping to source file and the runtime variable to record
  • Mapping to sink table and the data format settings for bulk copy
  • Tests configurations using the runtime variables captured

()

The data source and data sink connections configured in the web app.

()

The data source mapping configured in the web app for the DataMapper node.

()

The data sink mapping configured in the web app for the DataMapper node.

()

Test configured to run against the sink table to verify that the expected data is present.

File Contents

The platform records the configurations shown above from web app pages in the appropriate node files: data_sources/source.json and data_sinks/sink.json. The mapping keys appear in the node's notebook.json file, and recipe variables can be found in the variables.json recipe file.

data_sources/source.json

{
    "name": "source",
    "type": "DKDataSource_ADLS2",
    "config-ref": "adslConfig",
    "keys": {
        "mapping1_source": {
            "file-key": "{{global_superstore_s3_path}}",
            "use-only-file-key": true,
            "set-runtime-vars": {
                "row_count": "num_new_orders"
            }
        }
    }
}
 

data_sinks/sink.json

{
    "name": "sink",
    "type": "DKDataSink_MSSQL",
    "config": {
        "username": "{{mssqlConfig.username}}",
        "password": "{{mssqlConfig.password}}",
        "hostname": "{{mssqlConfig.hostname}}",
        "port": "",
        "database": "{{mssqlConfig.database}}"
    },
    "keys": {
        "mapping1_sink": {
            "query-type": "freebcp",
            "table-name": "{{schema_name}}.new_raw_demo_superstore_orders",
            "col-delimiter": "|"
        }
    },
    "tests": {
        "QA_Num_New_Orders_Min": {
            "action": "stop-on-error",
            "test-variable": "num_new_orders",
            "type": "test-contents-as-integer",
            "test-logic": {
                "test-compare": "greater-than",
                "test-metric": 1000
            }
        }
    }
}