What is Observability?

DataKitchen's DataOps Observability monitors every data journey from data source to customer value and from any team's development environment into production. It works across every tool, team, environment, and customer, so that problems are detected, localized, and understood immediately.

Delivering rapid, trusted customer insight starts with finding errors and bottlenecks before they cause hassle and embarrassment for your team. Observability acts as a ‘Mission Control Center’ for all of your organization’s data and analytic production problems.


What Is It? Data Observability Open Source Software

  • Visibility Into Every Tool On Your Data's Journey - See every step in the data journey, from data source to value delivery. You can see in-depth into your tools, data, infrastructure, and organizational boundaries.
  • Production Status and Alerts - Quickly monitor the status of every tool, pipeline, software, and data set used during data and analytic production.
  • See All Events At A Glance - Gather logs, metrics, schedules, statuses, data test results, and all other events in a single place.
  • Create Rule-Based Alerts - Deliver specific, timely, personalized exception alerts to email, Slack, Teams, Jira, and other tools.
  • A User Interface For Everyone - The user interface is easy to understand, allowing everyone on the team—IT, managers, data engineers, scientists, analysts, and your business customers—to be on the same page.
  • Simple Integrations and an Open API - Pre-built, fast, easy integrations and an open API drive quick implementations without replacing existing tools.
  • Get Robust AI-driven Data Quality Checks - Simple integration with Open Source DataOps Data Quality TestGen.

Install Observability

Install on Mac / Linux

Recommended Docker install

Install on Windows

Recommended Docker install


Six Observability Touchpoints

Observability lets you see every tool, technology, and data set that is part of your data analytic team's production process. You can watch a short demonstration from this link.

An Observability step-through follows these six touchpoints: Dashboard, Data Journey, Components, Events, Rules, and Integration Agents and API.

Step 1: Data Journey Dashboard

Observability provides complete visibility in a simple dashboard to see all the production jobs, pipelines, and status across your complete organization—dozens of tools, thousands of jobs, and tons of data all in one place.

Step 2: The Data Journey

A journey represents a collection of components in your data estate that are responsible for creating a data analytic deliverable.

The components that make up a data journey are often loosely related, spread across different tools, and can be managed by different teams. Observability lets you create a holistic, end-to-end view of the components and relationships that make up a journey in your data estate.
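One way to picture a journey is as a small directed graph: components are nodes, and relationships point from an upstream component to the one that depends on it. The sketch below is only an illustration of that idea, not how Observability stores journeys; the `Journey` class, its methods, and the component names are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class Journey:
    """Hypothetical model: a named set of components plus directed relationships."""
    name: str
    components: set = field(default_factory=set)
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_relationship(self, upstream: str, downstream: str) -> None:
        # Register both components and record that `downstream` depends on `upstream`.
        self.components.update((upstream, downstream))
        self.edges[upstream].add(downstream)

    def downstream_of(self, component: str) -> list:
        # Breadth-first walk over everything reachable from `component`.
        seen = set()
        queue = deque([component])
        while queue:
            for nxt in self.edges.get(queue.popleft(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return sorted(seen)


# Illustrative component names only.
journey = Journey("daily_sales_report")
journey.add_relationship("s3_landing_files", "ingest_pipeline")
journey.add_relationship("ingest_pipeline", "warehouse_table")
journey.add_relationship("warehouse_table", "sales_dashboard")

affected = journey.downstream_of("ingest_pipeline")
print(affected)
```

An end-to-end view like this makes it possible to ask which deliverables are affected when one component, possibly owned by another team, has a problem.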

Step 3: Components of a Data Journey

Components represent the resources, engines, and tools you use daily to deliver data analytic assets. Components can include batch pipeline runs executed by orchestrators, streaming pipelines in event-driven systems, datasets like database tables and files, and storage or computing infrastructure.

Every event received—or created—by the system is associated with a specific component.

Step 4: Events

An event is any moment of interest that occurs in your data estate. For example, an event could be a batch pipeline run starting, a streaming pipeline receiving events with a certain status, a test failing, or a file arriving.

Observability ingests these events, associates them with the correct project and component, uses them to track progress, compares them against expectations, displays their results, and generates indicators and notifications as needed.
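The flow described above (associate each event with a project and component, then compare it against an expectation) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Event` fields, the event kinds, and the `evaluate_run` helper are hypothetical and are not taken from the Observability data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are assumptions."""
    project: str
    component: str
    kind: str            # e.g. "run_started" or "run_completed"
    timestamp: datetime


def evaluate_run(events, project, component, max_runtime):
    """Compare one component's run duration with an expected maximum."""
    times = {
        e.kind: e.timestamp
        for e in events
        if e.project == project and e.component == component
    }
    if "run_started" not in times or "run_completed" not in times:
        return "incomplete"
    duration = times["run_completed"] - times["run_started"]
    return "on_time" if duration <= max_runtime else "late"


t0 = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
events = [
    Event("sales", "nightly_load", "run_started", t0),
    Event("sales", "nightly_load", "run_completed", t0 + timedelta(minutes=50)),
    Event("sales", "report_refresh", "run_started", t0 + timedelta(minutes=55)),
]

late_status = evaluate_run(events, "sales", "nightly_load", timedelta(minutes=30))
pending_status = evaluate_run(events, "sales", "report_refresh", timedelta(minutes=30))
print(late_status, pending_status)
```

Here the first run exceeds its expected 30-minute window, and the second has started but not yet reported completion.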

Step 5: Alert Rules

Use rules to define and capture meaningful incidents and results so you can stay up-to-date and react to situations as they occur.

Observability provides patterns to help construct meaningful rules based on a standard trigger-condition-action format.
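To make the trigger-condition-action format concrete, here is a minimal sketch of how such a rule could be evaluated. In Observability itself, rules are created in the UI; the `Rule` class, the `apply_rules` function, and the event keys below are hypothetical and only illustrate the pattern.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """Hypothetical rule: all three parts are illustrative assumptions."""
    trigger: str                          # event kind that wakes the rule
    condition: Callable[[dict], bool]     # extra check on the event's contents
    action: Callable[[dict], None]        # what to do when the rule fires


def apply_rules(rules, event):
    """Run every rule whose trigger and condition both match; return the count fired."""
    fired = 0
    for rule in rules:
        if event.get("kind") == rule.trigger and rule.condition(event):
            rule.action(event)
            fired += 1
    return fired


notifications = []  # stands in for email, Slack, or another delivery channel

failed_test_rule = Rule(
    trigger="test_result",
    condition=lambda e: e.get("status") == "FAILED",
    action=lambda e: notifications.append(f"Test failed on {e['component']}"),
)

fired_on_failure = apply_rules(
    [failed_test_rule],
    {"kind": "test_result", "status": "FAILED", "component": "warehouse_table"},
)
fired_on_pass = apply_rules(
    [failed_test_rule],
    {"kind": "test_result", "status": "PASSED", "component": "warehouse_table"},
)
print(notifications)
```

Separating the trigger from the condition keeps each rule cheap to evaluate: most events are skipped on the trigger check alone.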

Step 6: Integration Agents and API

Observability receives, aggregates, and displays event information sent by your tools and data assets (that is, your components). It does so by exposing a REST API, the Event Ingestion API, where events are received and ingested in real time.

For complex data estates with multiple components and assets, agents are an efficient way to integrate with Observability. Agents can scrape events from event streams, log files, or tool APIs and send the events to Observability for you.
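The log-scraping style of agent can be sketched as a loop that parses lines and hands each recognized event to a sender. Everything specific here is an assumption made for illustration: the log line format, the payload keys, and the `scrape` function are hypothetical, and the `send` callable stands in for a real call to the Event Ingestion API, whose endpoint and payload schema are not described in this section.

```python
import re

# Hypothetical log format: "<timestamp> pipeline=<name> status=<STATUS>"
LINE = re.compile(r"^(?P<ts>\S+) pipeline=(?P<pipeline>\S+) status=(?P<status>\S+)$")


def scrape(lines, send):
    """Parse log lines and pass each recognized event to `send`; return the count sent."""
    sent = 0
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue  # ignore lines that are not pipeline status records
        # Payload keys are illustrative, not the real ingestion schema.
        send({
            "timestamp": match["ts"],
            "component": match["pipeline"],
            "status": match["status"],
        })
        sent += 1
    return sent


log_lines = [
    "2024-01-01T02:00:00Z pipeline=nightly_load status=COMPLETED",
    "unrelated debug output",
    "2024-01-01T02:05:00Z pipeline=report_refresh status=FAILED",
]

outbox = []  # a real agent would send each event over HTTP instead
sent_count = scrape(log_lines, outbox.append)
print(sent_count, outbox)
```

Passing the sender in as a parameter keeps the parsing logic testable without a network connection and lets the same loop serve different delivery mechanisms.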


System details

Observability is available as open-source (Apache 2.0 license) and enterprise software. The open-source version is fully functional for a single data engineer, while the enterprise version has additional features for teams and enterprises.

The software has a browser-based user interface (UI), REST APIs, and a logic and rules engine. All user tasks, including creating rules, setting schedules, and defining conditions, can be performed in the UI and are covered in the Observability Help documentation.

(Diagram of Observability relationship to data estate)

Get started with Observability

See Get Started for information on how to install and set up Observability.

Additional resources

  • Core Concepts – Understand data estates, journeys, and integrations
  • Events – Learn about the events Observability ingests and tracks
  • Components – Manage pipelines, datasets, and servers
  • Rules – Define rule-based alerts and notifications
  • Instances – Track and investigate journey instances
  • Integration Agents – Connect your tools to Observability