Dictionary

Dictionary data sources and sinks enable recipes to use in-memory data as runtime variables or employ Jinja functions to manage or transform files for downstream inputs and testing. It is essentially a JSON dictionary, which can be shared across any sources and sinks with the same data store (for example, an Amazon S3 bucket).

This connector provides flexibility in recipe building beyond the standard runtime variables available for sources and sinks in node configurations.

  • Use a Dictionary data source to send data via a runtime variable to a data sink with a DataMapper node.
  • Use a Dictionary to move data from a source to a runtime variable for use in tests or as ingredient inputs.
  • Use a Dictionary data source in a container node to copy a shared recipe file into the container.
  • Use a Dictionary data sink to put data from a different source into a runtime variable for use in downstream nodes that process files.

Warning

Dictionaries support small datasets. Dictionary data sources and sinks have a data storage limit of 1 MB and are not intended for use with large datasets.

Dictionary data sources and sinks are in the system category of I/O connectors.

See Dictionary Setup Examples for a guided tour of a dictionary data sink configuration.

Connector type values

The value to use for the "type" field in source or sink JSON files:

  • Data source: DKDataSource_Dictionary
  • Data sink: DKDataSink_Dictionary
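
For reference, a minimal data source configuration using the source type value might look like this sketch (the name and bucket-name values are placeholders, mirroring the examples later in this page):

```json
{
    "name": "my_dict_datasource",
    "type": "DKDataSource_Dictionary",
    "config": {
        "bucket-name": "source"
    }
}
```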

Connection properties

Dictionary data sources and sinks do not require connection configuration; they use your Automation user account credentials from the current session.

System source and sink properties

  • set-runtime-vars (dictionary; optional): Declares runtime variables set equal to built-in variables. See File-Based Source and Sink Variables for more information.

Source and sink properties

  • bucket-name (alphanumeric and underscore (_); optional): Specifies the in-memory location where the dictionary is stored.

    Data sources and sinks that share the same bucket-name also share access to the dictionary and its data. When bucket-name is not defined, the dictionary is only accessible to data sources and sinks that share the same name.
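
As a sketch of the sharing behavior, a sink and a source that declare the same hypothetical bucket-name read and write the same dictionary (all names here are placeholders):

```json
{
    "name": "upstream_dict_sink",
    "type": "DKDataSink_Dictionary",
    "config": { "bucket-name": "shared_bucket" }
}
```

```json
{
    "name": "downstream_dict_source",
    "type": "DKDataSource_Dictionary",
    "config": { "bucket-name": "shared_bucket" }
}
```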

Step (key) properties

  • variable (alphanumeric; see runtime variable in Naming Conventions; optional; supported in data sinks only): Specifies the name of the runtime variable used to store the data sent to this key.

Data source logic

The system follows this process for each dictionary source input:

  • First, check for a value defined as an input mapping. If a value exists, map it into the specified container file.

    An input mapping is stored in the node's data_sources/*.json file, where the key name is the Mapping Name entry and the value is the source's JSON Value entry.

  • If there is no value defined as an input mapping, check the bucket defined in Source Connections for a value. If a value exists, map it into the specified container file.

    • The default bucket is an Automation internal resource and shares the name of the source, but users may designate another bucket name in the Connections tab or in data_sources/*.json. Any other sources and sinks with the same bucket name will share the data stored in them.
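
The resolution order above can be sketched in Python (a hypothetical illustration of the described behavior, not product code); note that empty values defer to the bucket, as described in the note on source priority:

```python
def resolve_source_value(key, input_mappings, bucket):
    """Illustrative sketch of dictionary source resolution.

    An input-mapping value takes precedence; empty values ("", None,
    [], 0, {}) defer to whatever a prior sink stored in the shared
    bucket. Hypothetical helper, not part of the product.
    """
    value = input_mappings.get(key)
    if value not in ("", None, [], 0, {}):
        return value              # input mapping wins
    return bucket.get(key)        # fall back to the shared bucket

# A sink in an earlier node stored a value in the shared bucket:
shared_bucket = {"manage_ssas": "script body"}
# This node's own mapping is empty, so the bucket value is used:
value = resolve_source_value("manage_ssas", {"manage_ssas": ""}, shared_bucket)
```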

Note

Dictionary source priority and syntax: Input mapping key values take precedence over data already stored in a shared bucket by a dictionary sink from a previous node. If the dictionary is intended to retrieve data from a previous sink, the Target Variable field in the output mapping must have an empty value, such as "", null, [], 0, or {}.
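
Concretely, one way to let a source fall back to data a previous sink stored is to leave the key's mapped value empty in data_sources/*.json, as in this sketch (the names and bucket are placeholders):

```json
{
    "name": "reader_dict_source",
    "type": "DKDataSource_Dictionary",
    "config": { "bucket-name": "shared_bucket" },
    "keys": {
        "mapping_from_prior_sink": ""
    }
}
```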

Data source examples

Example 1

This example shows three potential key values for moving data into a container.

{
    "name" : "my_dict_datasource",
    "type" : "DKDataSource_Dictionary",
    "config": {
        "bucket-name" : "source"
    },
    "keys" : {
        "mapping1" : { "some_key" : "... some arbitrary JSON data ..." },
        "mapping2" : "{{the_value_of_a_variable_you_want_to_send_to_a_sink}}",
        "mapping3" : "{{load_text('some_file_you_put_in_resources_to_be_sent.csv')}}"
    }
}

Example 2

This example shows how a dictionary data source is used to copy a Python script into the container. It uses the load_text Jinja function to get the file from the recipe's resources directory, where it can be shared among multiple container nodes. This particular script is used to either start or stop an instance of a SQL Server Analysis Service.

Composite mappings in the Inputs tab of the Node Editor

  • Mapping Name: manage_ssas
  • Source JSON Value: "{{load_text('manage_ssas_instance.py')}}"
  • Container Target File Path: manage_ssas_instance.py

data_sources/dict_datasource.json

{
    "name": "dict_datasource",
    "type": "DKDataSource_Dictionary",
    "config": {
        "bucket-name": "source"
    },
    "keys": {
        "manage_ssas": "{{load_text('manage_ssas_instance.py')}}"
    }
}

notebook.json

{
    "image-repo": "{{dockerhubConfig.image_repo.general_purpose}}",
    "image-tag": "{{dockerhubConfig.image_tag.general_purpose}}",
    "dockerhub-namespace": "{{dockerhubConfig.namespace.general_purpose}}",
    "container-input-file-keys": [
        {
            "filename": "manage_ssas_instance.py",
            "key": "dict_datasource.manage_ssas"
        }
    ],
    "tests": {
        "test_success": {
            "action": "stop-on-error",
            "test-variable": "success",
            "type": "test-contents-as-boolean",
            "test-logic": {
                "test-compare": "equal-to",
                "test-metric": "True"
            }
        }
    }
}

Data sink logic

The system follows this process for each dictionary sink output:

  • First, check for a value defined as an output mapping Target Variable.
    • If a value exists, set it on the target runtime variable.
  • If there is no target variable defined, check the bucket defined in the sink's Connections for a value. If a value exists, set it on the key in the bucket.
    • The default bucket is an Automation internal resource and shares the name of the sink, but users may designate another shared bucket name in the Connections tab or in data_sinks/*.json. Any other sources and sinks with the same bucket name will share the data stored in them.
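
The two paths above can be sketched in Python (a hypothetical illustration of the described behavior, not product code):

```python
def write_sink_value(key, value, key_config, runtime_vars, bucket):
    """Illustrative sketch of dictionary sink routing: a configured
    target variable wins; otherwise the value is stored on the key in
    the shared bucket. Hypothetical helper, not part of the product."""
    target = (key_config or {}).get("variable")
    if target:
        runtime_vars[target] = value   # set the target runtime variable
    else:
        bucket[key] = value            # store on the key in the bucket

runtime_vars, bucket = {}, {}
# mapping1 has a target variable; mapping2 is null, so it goes to the bucket:
write_sink_value("mapping1", "some data", {"variable": "output_target_var"}, runtime_vars, bucket)
write_sink_value("mapping2", "other data", None, runtime_vars, bucket)
```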

Data sink example

In this example, there are two keys showing the two possible paths for storing sink values.

  • mapping1 sends the data to a runtime variable called output_target_var.
  • mapping2 is null, a placeholder that triggers the system to send the data to the default bucket.

{
    "name" : "my_dict_datasink",
    "type" : "DKDataSink_Dictionary",
    "config": {
        "bucket-name" : "sink"
    },
    "keys" : {
        "mapping1" : {
            "variable" : "output_target_var"
        },
        "mapping2" : null
    }
}