
General Purpose Container

Automation container nodes work by referencing container images that become Docker containers at runtime, allowing you to build recipes with processes that are containerized and portable.

DataKitchen provides a pre-configured container node called the General Purpose Container (GPC). The GPC is available to use as a base container or to reference when building your own custom containers.

The GPC:

  • Sources existing assets, configurations, and settings.
  • Uses Python 3 exclusively.
    • Versions of the GPC specific to Python 3.9 and Python 3.10 are available, tagged as v0.###_python3.9 and v0.###_python3.10.
  • Is set up to handle many basic use cases, such as running Python or shell scripts, provisioning with Ansible, and generating data visualizations in Tableau.
  • Supports running .sh, .py, and .ipynb scripts located in the node's docker-share directory.
  • Has pre-installed tools, such as pandas and NumPy, to perform common data analysis actions or work with cloud computing services.
  • Supports lists of parameters passed into the container through the config.json file.
  • Defines a standard structure for installing dependencies, passing parameters and variables, retrieving files, and logging.
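
As a rough illustration of the script types listed above, the following sketch checks a local copy of a node's docker-share directory for files with one of the supported extensions (.sh, .py, .ipynb). The helper name find_runnable_scripts is hypothetical and is not part of the GPC; it is only a local pre-flight check you might run before committing a recipe.

```python
from pathlib import Path

# Script extensions the GPC is documented to support.
SUPPORTED_EXTENSIONS = {".sh", ".py", ".ipynb"}


def find_runnable_scripts(docker_share: Path) -> list[str]:
    """Return the sorted names of files in docker_share with a supported extension.

    Hypothetical helper for local checks only; the GPC itself does not
    provide this function.
    """
    return sorted(
        path.name
        for path in docker_share.iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

For example, calling find_runnable_scripts(Path("docker-share")) from inside a node's directory would list the scripts in that folder that the GPC could run, assuming the folder exists locally.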

Tip

Given its configuration, the GPC lets users get started quickly in DataOps and use Docker containers without prior expertise. Users don't have to write custom code or make complex setup decisions.

Get the GPC

The GPC is publicly available as a container image on Docker Hub.

Pre-installed packages

The GPC comes with several pre-installed packages. For the full list, see GPC Pre-Installed Packages.

You can reference any of the pre-installed packages in your scripts. Additionally, the GPC configuration lets you specify other apt-get and Python packages to be installed at order runtime, so you can iterate quickly without building new containers.
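
To make the idea of runtime dependencies and parameters more concrete, here is a minimal sketch that assembles a config.json-style document in Python. The key names ("apt-dependencies", "dependencies", "keys", "run_script", "script", "parameters"), the script name process_data.py, and the package names are assumptions used for illustration; check GPC File Structure and Configuration for the actual schema before relying on any of them.

```python
import json


def build_gpc_config(script, parameters=None, apt_packages=None, python_packages=None):
    """Assemble a config.json-style dictionary for a container node.

    All key names below are assumptions for illustration, not a confirmed schema.
    """
    return {
        # Extra apt-get packages to install at order runtime (assumed key name).
        "apt-dependencies": list(apt_packages or []),
        # Extra Python packages to install at order runtime (assumed key name).
        "dependencies": list(python_packages or []),
        "keys": {
            "run_script": {
                # Script expected to live in the node's docker-share directory.
                "script": script,
                # Parameters passed into the container for the script to use.
                "parameters": dict(parameters or {}),
            }
        },
    }


config = build_gpc_config(
    script="process_data.py",  # hypothetical script name
    parameters={"input_table": "sales", "row_limit": 1000},
    apt_packages=["jq"],
    python_packages=["requests"],
)

# Serialize the structure as it might appear in a config.json file.
print(json.dumps(config, indent=4))
```

Keeping dependencies in the configuration rather than in the image is what allows a change to take effect on the next order run without rebuilding the container.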

Container configuration

When you add any container node to a recipe graph—either in the user interface or the command line interface—you must configure the connection to use the correct GPC parameters.

For information on how to do this, see GPC File Structure and Configuration.

Create a Container Node

Vault Secrets in the GPC

GPC Resource Allocation

GPC Image Changelog