Publish Events Directly
The following walks you through how to use the Event Ingestion API endpoints directly.
How to publish events
Prerequisites
To use the Event Ingestion API, an API key is required.
Send events
- Navigate to the API Swagger documentation: https://api.docs.datakitchen.io/production/events.html
- Expand Events > in the menu.
- Select an event endpoint.
- Expand the AUTHORIZATIONS section for header information on where the API key should be applied.
- Enter the required and optional key-value pairs into your script. Descriptions, parameters, and request and response formats are provided for each endpoint.
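As a sketch of the general pattern, the snippet below POSTs a JSON event payload with the API key in a request header. The base URL, endpoint path, and the `ServiceAccountAuthenticationKey` header name are assumptions here; confirm all of them against the AUTHORIZATIONS section and the endpoint's Swagger entry.

```python
import json
import urllib.request

API_BASE = "https://api.datakitchen.io"  # assumed base URL; confirm in the Swagger docs
API_KEY = "your-api-key"                 # the key from Prerequisites above


def send_event(endpoint: str, payload: dict) -> bytes:
    """POST an event payload to the Event Ingestion API.

    The auth header name below is an assumption; the AUTHORIZATIONS
    section of the Swagger docs gives the exact header to use.
    """
    request = urllib.request.Request(
        f"{API_BASE}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "ServiceAccountAuthenticationKey": API_KEY,  # assumed header name
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The payload examples in the walkthrough below can each be passed to a helper like this one, with the endpoint path taken from the Swagger documentation.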
Walkthrough
Use case
This batch pipeline migrates data from an SFTP server to an S3 bucket. It creates a table for storing the data in Redshift, then uploads the data into Redshift in an ETL step. Finally, it publishes a Tableau workbook for visualizing the data.

When this operation runs, a user may be interested in the following information:
- When did the batch pipeline start running?
- What tasks are actively running? What tasks finished running and did any fail?
- Are there any error or warning log messages?
- What tests have been executed and what are the results?
Once the appropriate events are published to the Event Ingestion API, Observability can provide insight and answers to these questions.
Publish a Run Status event
A Run Status event is the first event type published when the batch pipeline begins. This indicates that a run has started.
See Event Ingestion API: RunStatus for complete technical details for this event type.
1. For the first POST to the run status endpoint:
   - Include the `pipeline_key` and `run_key`.
   - In this example, an appropriate `pipeline_key` would be SFTP_server_to_Tableau_workbook, and you can use the Airflow Run ID as the `run_key` value.
   - Include a `status` in the request body set to RUNNING.
2. For the second POST to the run status endpoint:
   - Add an `event_timestamp`, `task_key`, and `status` in the request body.
   - In this example, each step of the DAG represents a task. An appropriate `task_key` would be SFTP_to_S3, `status` is RUNNING, and `event_timestamp` should reflect when the task started running.
   - If no timestamp is set, the Event Ingestion API applies its current time to the field.
3. When the task is finished running, publish another Run Status event similar to that from step 2, but with a `status` of COMPLETED, COMPLETED_WITH_WARNINGS, or FAILED.
4. As the subsequent tasks in the pipeline execute, they should publish their own Run Status events, similar to that from step 2.
5. When the run is finished, publish another Run Status event similar to that from step 1, but with a `status` of COMPLETED, COMPLETED_WITH_WARNINGS, or FAILED to close the run.
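Putting the steps above together, the sequence of run-status payloads might look like the following sketch. The `pipeline_key`, `run_key`, and `task_key` values come from the example above; the ISO 8601 timestamp is an illustrative value (you can omit `event_timestamp` to let the API apply its current time).

```python
# Fields shared by every event in this run.
shared = {
    "pipeline_key": "SFTP_server_to_Tableau_workbook",
    "run_key": "manual__2023-05-01T09:00:00+00:00",  # e.g. the Airflow Run ID
}

# Step 1: open the run -- no task_key, status RUNNING.
run_start = {**shared, "status": "RUNNING"}

# Step 2: mark the SFTP_to_S3 task as started.
task_start = {
    **shared,
    "task_key": "SFTP_to_S3",
    "status": "RUNNING",
    "event_timestamp": "2023-05-01T09:00:05+00:00",  # illustrative; assumed ISO 8601
}

# Step 3: close the task with a terminal status.
task_end = {**shared, "task_key": "SFTP_to_S3", "status": "COMPLETED"}

# Step 4: repeat the task start/end events for each subsequent task in the DAG.

# Step 5: close the run -- no task_key, terminal status.
run_end = {**shared, "status": "COMPLETED"}
```

Note how the presence or absence of `task_key` is what distinguishes task-level events from the run-level events that open and close the run.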
Note
Run start and end: the `status` key sets the status for both runs and tasks. A run starts when an event has no `task_key` and its `status` is RUNNING. A run finishes when an event has no `task_key` and its `status` is not RUNNING. The system automatically creates a new run for any event sent with a unique `pipeline_key` and `run_key` combination.
Publish a Message Log event
When a task in the batch pipeline run fails, logs are typically the starting point for fixing any failures. Therefore, it's helpful to publish relevant log messages as events to give context in the UI.
See Event Ingestion API: MessageLog for complete technical details for this event type.
- A POST to the message log endpoint requires `log_level`, `message`, `run_key`, and `pipeline_key` values.
- Optionally, include a `task_key` to associate the message with a specific task, or node.
- Each node should publish its respective warning and error logs.
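Using the fields listed above, a message-log payload might look like this sketch; the `log_level` value and message text are illustrative.

```python
# Field names come from the list above; the log_level and message values are examples.
message_log_event = {
    "pipeline_key": "SFTP_server_to_Tableau_workbook",
    "run_key": "manual__2023-05-01T09:00:00+00:00",
    "task_key": "SFTP_to_S3",  # optional: associates the message with a node
    "log_level": "WARNING",
    "message": "3 files skipped because their names did not match the expected pattern",
}
```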
Tip
Although you can publish every log message for a pipeline run, DataKitchen strongly discourages this practice. Best practice is to publish only warning and error log messages that provide context for fixing failures and a limited number of useful info logs.
Publish a Test Outcomes event
An important part of DataOps methodology is running tests. Ideally, each node in the pipeline should execute tests to validate that it has run successfully before the batch pipeline run progresses to subsequent nodes. Therefore, each task should publish a TestOutcomes event with corresponding test results.
See Event Ingestion API: TestOutcomes for complete technical details for this event type.
- A POST to the test outcomes endpoint requires a `run_key`, `pipeline_key`, and `test_outcomes` array of results.
- Optionally, include a `task_key` to associate the results with a specific task (i.e., node).
- Each node should publish its respective test results.
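A test-outcomes payload might look like the sketch below. The top-level fields come from the list above, but the shape of each entry in the `test_outcomes` array (the `name` and `status` fields here) is an assumption; check the TestOutcomes schema in the API documentation for the exact fields.

```python
# Top-level field names from the list above; the inner test_outcomes entry
# fields ("name", "status") are assumed -- confirm against the API schema.
test_outcomes_event = {
    "pipeline_key": "SFTP_server_to_Tableau_workbook",
    "run_key": "manual__2023-05-01T09:00:00+00:00",
    "task_key": "Redshift_ETL",  # optional: associates the results with a node
    "test_outcomes": [
        {"name": "row_count_matches_source", "status": "PASSED"},
        {"name": "no_null_primary_keys", "status": "PASSED"},
    ],
}
```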
Publish a Metric Log event
Staying up-to-date with data values as they move through a pipeline is a good way to ensure data remains within expectations. Each node that transforms, aggregates, or manipulates data should post a log with relevant information about the value. Publish a MetricLog event to monitor data as it moves through the batch pipeline.
See Event Ingestion API: MetricLog for complete technical details for this event type.
- A POST to the metric log endpoint requires a `metric_value`, `pipeline_key`, `metric_key`, and `run_key`.
- Optionally, include a `task_key` to associate the results with a specific task (i.e., node).
- Each node should publish its respective metric log.
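A metric-log payload might look like the following sketch; the metric name and value are illustrative, while the field names come from the list above.

```python
# Field names from the list above; the metric_key and metric_value are example values.
metric_log_event = {
    "pipeline_key": "SFTP_server_to_Tableau_workbook",
    "run_key": "manual__2023-05-01T09:00:00+00:00",
    "task_key": "Redshift_ETL",   # optional: associates the metric with a node
    "metric_key": "rows_loaded",  # illustrative metric name
    "metric_value": 125000,
}
```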