Shared Resources Example
This code-based example demonstrates how to use a shared resource from a recipe's resources directory. The files in that directory are available to all nodes within a recipe.
The only files that are automatically injected into a container node are those that exist inside the node file structure, such as scripts in the docker-share directory. Files intended to be shared across nodes must be placed in the resources directory and, therefore, must be explicitly loaded into a node. By adding a dictionary data source to the container node, you can access a shared resource at runtime.
Note
Container images are built with the DataKitchen Interface Layer and require specific files and file structures to be present in a container node. For more information, see GPC File Structure and Configuration.
File structure
/recipe/
├── description.json
├── resources/
│   └── python_scripts/
│       └── resources_basic.py
├── shared_resources_example/
│   ├── data_sources/
│   │   └── dict_datasource.json
│   ├── description.json
│   ├── docker-share/
│   │   └── config.json
│   ├── notebook.json
│   └── variables.json
└── variations.json
File contents
config.json
The config.json file names the file to run in its script field; the dict_datasource.json file determines where that file is placed inside the container.
{
    "apt-dependencies": [ ],
    "dependencies": [ ],
    "keys": {
        "python_script": {
            "script": "resources_basic.py",
            "environment": {
                "SIMPLE_ENV_VAR": "simple_env_var",
                "JINJA_ENV_VAR": "{{basic_example_node.RECIPE_VAR}}",
                "VAULT_ENV_VAR": "#{vault://vault/url}"
            },
            "parameters": {
                "SIMPLE_PARAM": "simple_param",
                "JINJA_PARAM": "{{basic_example_node.RECIPE_VAR}}",
                "VAULT_PARAM": "#{vault://vault/url}"
            },
            "export": [
                "success"
            ]
        }
    }
}
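Before committing a config.json, you can sanity-check its structure locally. The sketch below is illustrative only (the `validate_config` helper is not part of the DataKitchen tooling); it checks that every entry under keys names a script and an export list, using the key names from the example above. Jinja and Vault placeholders are resolved by the platform at runtime, not here.

```python
import json

# A trimmed copy of the config.json above, as it would sit on disk.
config_text = """
{
    "apt-dependencies": [],
    "dependencies": [],
    "keys": {
        "python_script": {
            "script": "resources_basic.py",
            "environment": {"SIMPLE_ENV_VAR": "simple_env_var"},
            "parameters": {"SIMPLE_PARAM": "simple_param"},
            "export": ["success"]
        }
    }
}
"""

def validate_config(text):
    """Report config keys that lack a 'script' or 'export' entry."""
    config = json.loads(text)
    problems = []
    for name, entry in config.get("keys", {}).items():
        if not entry.get("script"):
            problems.append(f"{name}: missing 'script'")
        if not entry.get("export"):
            problems.append(f"{name}: missing 'export'")
    return problems

print(validate_config(config_text))  # → []
```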
resources_basic.py
The resources/python_scripts/resources_basic.py file, stored in the python_scripts subdirectory of the recipe's resources directory,
is the shared file that this example calls.
import os
import sys
import traceback

# LOGGER and the parameter globals (SIMPLE_PARAM, JINJA_PARAM, VAULT_PARAM)
# are injected by the container runtime before this script executes.
global success

# Validate environment variables
if 'SIMPLE_ENV_VAR' not in os.environ or not os.environ['SIMPLE_ENV_VAR']:
    LOGGER.error("Undefined SIMPLE_ENV_VAR")
    sys.exit(1)
if 'JINJA_ENV_VAR' not in os.environ or not os.environ['JINJA_ENV_VAR']:
    LOGGER.error("Undefined JINJA_ENV_VAR")
    sys.exit(1)
if 'VAULT_ENV_VAR' not in os.environ or not os.environ['VAULT_ENV_VAR']:
    LOGGER.error("Undefined VAULT_ENV_VAR")
    sys.exit(1)

# Validate parameters
if 'SIMPLE_PARAM' not in globals() or not SIMPLE_PARAM:
    LOGGER.error("Undefined SIMPLE_PARAM")
    sys.exit(1)
if 'JINJA_PARAM' not in globals() or not JINJA_PARAM:
    LOGGER.error("Undefined JINJA_PARAM")
    sys.exit(1)
if 'VAULT_PARAM' not in globals() or not VAULT_PARAM:
    LOGGER.error("Undefined VAULT_PARAM")
    sys.exit(1)

try:
    LOGGER.info(f'SIMPLE_ENV_VAR: {os.environ["SIMPLE_ENV_VAR"]}')
    LOGGER.info(f'JINJA_ENV_VAR: {os.environ["JINJA_ENV_VAR"]}')
    LOGGER.info(f'VAULT_ENV_VAR: {os.environ["VAULT_ENV_VAR"]}')
    LOGGER.info(f'SIMPLE_PARAM: {SIMPLE_PARAM}')
    LOGGER.info(f'JINJA_PARAM: {JINJA_PARAM}')
    LOGGER.info(f'VAULT_PARAM: {VAULT_PARAM}')
    LOGGER.info('EMBEDDED JINJA: {{basic_example_node.RECIPE_VAR}}')
    success = True
except Exception:
    LOGGER.error(f'Failed to read and log variables:\n{traceback.format_exc()}')
    success = False
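To see how the script's validation checks behave outside a container, you can emulate the injection the platform performs: environment entries become process environment variables, and parameter entries become globals in the script's namespace. The harness below is purely illustrative; the real injection and the LOGGER object come from the DataKitchen Interface Layer, and the stand-in logger here is an assumption.

```python
import logging
import os

# Stand-in for the LOGGER the Interface Layer normally injects.
logging.basicConfig(level=logging.INFO)
LOGGER = logging.getLogger("harness")

# Emulate the 'environment' section of config.json.
os.environ["SIMPLE_ENV_VAR"] = "simple_env_var"

# Emulate the 'parameters' section, which the platform exposes as globals.
script_globals = {"LOGGER": LOGGER, "SIMPLE_PARAM": "simple_param"}

# A fragment of the validation logic from resources_basic.py.
snippet = """
if 'SIMPLE_PARAM' not in globals() or not SIMPLE_PARAM:
    LOGGER.error("Undefined SIMPLE_PARAM")
    success = False
else:
    LOGGER.info(f'SIMPLE_PARAM: {SIMPLE_PARAM}')
    success = True
"""
exec(snippet, script_globals)
print(script_globals["success"])  # → True
```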
dict_datasource.json
The dict_datasource.json file defines your source connection as a dictionary data source and identifies a bucket-name, which is helpful when sharing a resource across nodes. The file to load is specified with a Jinja load_text expression.
Using the Node Editor in the UI, you would define the same values in the Source Connections section of the Connections tab and in the Source > JSON Value and Container > Target File Path fields of the Inputs tab.
{
    "type": "DKDataSource_Dictionary",
    "name": "dict_datasource",
    "bucket-name": "shared-data",
    "keys": {
        "resources_basic": "{{load_text('python_scripts/resources_basic.py')}}"
    }
}
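The load_text expression reads a file under the recipe's resources directory into the data source as text. A rough Python equivalent of that behavior, under the assumption that load_text resolves paths relative to /recipe/resources/ (the local function and the temporary directory below are for illustration only):

```python
import tempfile
from pathlib import Path

def load_text(resources_dir, relative_path):
    """Return the text of a file under the resources directory,
    mimicking the load_text Jinja helper."""
    return (Path(resources_dir) / relative_path).read_text()

# Demo with a throwaway directory mirroring the example layout.
with tempfile.TemporaryDirectory() as resources:
    script_dir = Path(resources) / "python_scripts"
    script_dir.mkdir()
    (script_dir / "resources_basic.py").write_text("global success\n")

    # The dictionary data source ends up holding the file's contents.
    payload = {
        "resources_basic": load_text(resources, "python_scripts/resources_basic.py")
    }
    print(payload["resources_basic"])  # → global success
```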
notebook.json
The notebook.json file holds the container configuration and test settings. In this example, the container-input-file-keys values define the target path for the shared file; if they are omitted, the file is written to the container's docker-share directory by default.
{
    "image-repo": "{{dockerhubConfig.image_repo.general_purpose}}",
    "image-tag": "{{dockerhubConfig.image_tag.general_purpose}}",
    "dockerhub-namespace": "{{dockerhubConfig.namespace.general_purpose}}",
    "dockerhub-username": "{{dockerhubConfig.username}}",
    "dockerhub-password": "{{dockerhubConfig.password}}",
    "container-input-file-keys": [
        {
            "key": "dict_datasource.resources_basic",
            "filename": "resources_basic.py"
        }
    ],
    "tests": {
        "log_dockerhub_tool_instance": {
            "description": "Logs the DockerHub tool instance.",
            "action": "log",
            "test-variable": "dockerhubConfig",
            "type": "test-contents-as-string",
            "test-logic": "dockerhubConfig",
            "keep-history": true
        },
        "test-success": {
            "description": "Stops the OrderRun if success is False.",
            "action": "stop-on-error",
            "test-variable": "success",
            "type": "test-contents-as-boolean",
            "test-logic": "success",
            "keep-history": true
        }
    }
}
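The test-success entry stops the OrderRun when the exported success variable is falsy. The decision logic can be sketched as follows; the `should_stop` helper is an illustration of the stop-on-error behavior described above, not the platform's implementation:

```python
def should_stop(tests, exported):
    """Return the names of stop-on-error tests whose boolean
    test variable evaluates to False."""
    failed = []
    for name, test in tests.items():
        if test.get("action") != "stop-on-error":
            continue
        if not bool(exported.get(test["test-variable"])):
            failed.append(name)
    return failed

tests = {
    "test-success": {
        "action": "stop-on-error",
        "test-variable": "success",
        "type": "test-contents-as-boolean",
    }
}
print(should_stop(tests, {"success": True}))   # → []
print(should_stop(tests, {"success": False}))  # → ['test-success']
```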