Recipe Nodes

Recipe nodes are isolated steps within a variation graph. Each node is a modular mini-process that performs work during an order run. To view all of the nodes in a recipe, select the Nodes tab. To see only the nodes in a particular variation, open that variation and view its graph.

Recipe nodes and their containers are defined and separated by the specific functions they perform. A single recipe node container can use multiple tools. For example, one recipe node can include all of the tools, settings, data, code, and variables needed to collect a dataset, create a database schema, create a database table, and populate the new table with the collected data for downstream querying.

Tip

DataOps best practice: Reuse and containerize. Containerizing recipe graph nodes lets you use your favorite tools while working with Automation. Because a node's processing occurs within a container, the node runs its tools in isolation from other nodes, and it is guaranteed to run exactly the same way anywhere it is deployed.

Node Editor

Set up and configure nodes from the Node Editor. To open the Node Editor for any node, select a specific kitchen > recipe > variation, then do one of the following:

  • Select a node in the graph, then click Edit Node in the Node Details.
  • Double-click a node in the graph.

In the Node Editor, the configuration and settings are organized into several tabs.

Note

The Node Editor is not available for nodes that use advanced Jinja. It cannot parse a node whose JSON files contain advanced Jinja that breaks standard JSON format. For such nodes, the File Editor opens instead.
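
As a rough illustration of this limitation, the sketch below checks whether a node file is still standard JSON before any Jinja is rendered. It is not how the Node Editor is implemented. The function name parses_as_standard_json and both sample strings are hypothetical, and only Python's standard json module is used.

```python
import json


def parses_as_standard_json(text: str) -> bool:
    """Return True if the raw, unrendered text is valid standard JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical file content: the Jinja expression sits inside a JSON string,
# so the raw text still parses as standard JSON.
simple_jinja = '{"table": "{{ table_name }}"}'

# Hypothetical file content: a Jinja block statement wraps a key/value pair,
# so the raw text is no longer standard JSON.
advanced_jinja = '{ {% if load_all %} "limit": 0 {% endif %} }'

print(parses_as_standard_json(simple_jinja))    # True
print(parses_as_standard_json(advanced_jinja))  # False
```

A file like the second sample is the kind that would be expected to open in the File Editor rather than the Node Editor.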

Node types

Automation provides six types of nodes, each designed for a distinct kind of work. Each node type is identified by the type property in its description.json file.

Synchronize

  • Type property value: DKNode_NoOp
  • Notebook: file required; content optional
  • Data sources: none
  • Data sinks: none
  • Description: A node without any objects (key/value pairs). These nodes run without configured data sources or sinks. They are often used as a convergence node for graphs with parallel nodes upstream.
  • See Synchronize Nodes for configuration information.

DataMapper

  • Type property value: DKNode_DataMapper
  • Notebook: optional
  • Data sources: required
  • Data sinks: required
  • Description: Maps data between data sources and data sinks. Source and target files can be mapped by explicitly configuring filenames, and sets of files can be mapped using wildcards.
  • See Configure DataMapper Nodes for more information.

Action

  • Type property value: DKNode_Action
  • Notebook: required
  • Data sources: required
  • Data sinks: none
  • Description: Runs data sources to connect to infrastructure for use cases where data sinks are not needed. For example, connecting to a database to perform administrative operations. The /data-sources directory is named /actions for action nodes.
  • See Configure Action Nodes for more information.

Container

  • Type property value: DKNode_Container
  • Notebook: required
  • Data sources: optional
  • Data sinks: optional
  • Description: Runs a Docker container based on parameterizable image names and tags. Code (using scripts and calling various tools) can be embedded and run within these nodes. DataKitchen provides a number of container images with useful features that may be leveraged directly or customized. The standard image is the General Purpose Container.
  • See Create a Container Node and related topics for more information.

Ingredient

  • Type property value: DKNode_Ingredient
  • Notebook: required
  • Data sources: none
  • Data sinks: none
  • Description: Runs a recipe variation that has been declared as an ingredient. A distinct order run is created for the ingredient node, which is run in an auto-generated child kitchen.
  • See Configure Ingredient Nodes for more information.

Conditional

  • Type property value: DKNode_NoOp
  • Notebook: file required; content optional
  • Data sources: none
  • Data sinks: none
  • Description: Serves as an operational decision point that determines the flow of graph processing.
  • See Create a Conditional Node and Conditional Node Processing for more information.
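
The node types listed above can be read as a lookup from the type property to the node types that use it. The sketch below is an illustration under stated assumptions, not DataKitchen tooling: it assumes description.json is a JSON object with a top-level "type" key (the documentation names the property but not the file's full layout), and the function and variable names are hypothetical. Because Synchronize and Conditional nodes share DKNode_NoOp, the lookup returns a list rather than a single name.

```python
import json

# Type property values and node type names as listed in this section.
NODE_TYPES_BY_PROPERTY = {
    "DKNode_NoOp": ["Synchronize", "Conditional"],
    "DKNode_DataMapper": ["DataMapper"],
    "DKNode_Action": ["Action"],
    "DKNode_Container": ["Container"],
    "DKNode_Ingredient": ["Ingredient"],
}


def node_types_for(description_text: str) -> list[str]:
    """Return the node types matching the "type" value in a description.json body.

    Assumes the body is a JSON object with a top-level "type" key.
    Returns an empty list for a missing or unrecognized value.
    """
    description = json.loads(description_text)
    return NODE_TYPES_BY_PROPERTY.get(description.get("type"), [])


# Hypothetical description.json contents.
print(node_types_for('{"type": "DKNode_Container"}'))  # ['Container']
print(node_types_for('{"type": "DKNode_NoOp"}'))       # ['Synchronize', 'Conditional']
```

The type property alone does not distinguish a Synchronize node from a Conditional node, so a lookup like this can only narrow the choice to those two.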