Streaming Pipeline¶
A streaming pipeline is a type of Observability component that represents a continuously running process in a data estate.
Observability pipelines help to group analytic processes, streamline monitoring and evaluation, and improve the quality of data asset delivery.
Pipeline types¶
In Observability, there are two types of pipeline components:
- Streaming pipeline: this component represents a continuously running, event-based workflow. For example, a real-time Apache Kafka process.
- Batch pipeline: this component represents a batch process with recurring runs. For example, an Airflow DAG or a DataKitchen DataOps Automation recipe variation.
Streaming pipeline configuration¶
Define each component's activity to align with the streaming pipelines in your data estate.
Streaming pipeline configurations always include events, which are moments of interest sent by your data estate and captured by Observability.
Create a new streaming pipeline¶
Streaming pipelines are automatically generated based on event information received by the Event Ingestion API.
Observability creates a new streaming pipeline whenever it receives an event whose stream_key does not match the key of any existing streaming pipeline in the project.
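The auto-creation rule above can be sketched as follows. This is a minimal illustration of the routing logic only, not the actual Observability implementation: the `PipelineRegistry` class and the event dictionary shape are assumptions made for the example.

```python
class PipelineRegistry:
    """Illustrative sketch: routes events by stream_key within one project,
    auto-creating a streaming pipeline for any key not seen before.
    Not the real Observability implementation."""

    def __init__(self):
        # Maps stream_key -> pipeline record.
        self.pipelines = {}

    def ingest(self, event: dict) -> dict:
        """Route an event, creating a streaming pipeline for an unseen key."""
        key = event["stream_key"]
        if key not in self.pipelines:
            # No existing pipeline matches this key: create one automatically.
            self.pipelines[key] = {"stream_key": key, "events": []}
        pipeline = self.pipelines[key]
        pipeline["events"].append(event)
        return pipeline


registry = PipelineRegistry()
registry.ingest({"stream_key": "orders-topic", "event_type": "MESSAGE"})
registry.ingest({"stream_key": "orders-topic", "event_type": "MESSAGE"})
registry.ingest({"stream_key": "payments-topic", "event_type": "MESSAGE"})
print(len(registry.pipelines))  # two distinct stream keys, so two pipelines
```

Both "orders-topic" events land in the same pipeline, while the "payments-topic" event triggers creation of a second one, mirroring how an unmatched stream_key yields a new streaming pipeline.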
You can also manually create streaming pipelines in the UI. Follow the instructions outlined in Create Components.