Fivetran Log Connector Community¶
DataKitchen provides an agent that lets you monitor the Fivetran Log Connector with a Databricks destination.
Fivetran overview¶
Fivetran allows you to export your connector logs to one of your Fivetran destinations through the Fivetran Log Connector. How often Fivetran syncs the logs to your destination depends on the Fivetran Log Connector's frequency setting and the Fivetran plan that you've purchased. The Standard plan has a minimum of 1 hour between syncs, while the Enterprise plan can be set as low as 5 minutes.
Deployment details¶
When the Observability Fivetran Agent starts for the first time, it will look for events as far back as 10 minutes (by default). After this, the agent will look back as far as the last Fivetran sync.
The Observability Fivetran agent creates an Observability component with the prefix FiveTranLogsFetcher. This component can be ignored.
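Conceptually, the lookback behaves like a time-window query against the Fivetran log table. The sketch below is illustrative only: the agent's actual query is internal, and the `log` table and `time_stamp` column names are assumptions based on the schema the Fivetran Log Connector typically creates.

```python
from datetime import datetime, timedelta, timezone

def build_lookback_query(schema: str, lookback_minutes: int, now=None) -> str:
    """Build an illustrative query for log events newer than the lookback cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=lookback_minutes)
    # Table/column names are assumed from the typical Fivetran Log Connector schema.
    return (
        f"SELECT * FROM {schema}.log "
        f"WHERE time_stamp > '{cutoff.isoformat()}' "
        f"ORDER BY time_stamp ASC"
    )

# On a fresh install the agent defaults to a 10-minute window:
print(build_lookback_query("fivetran_log", 10))
```

After the first poll, the agent narrows the window to events since the last Fivetran sync rather than the full lookback period.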
Deploy¶
Docker¶
Prerequisites
- Docker with Docker Compose. If not installed, see Get Docker .
- A personal access token for the Databricks instance is required. If you do not have one, refer to Databricks personal access token authentication .
- An API key for this agent — see step 5 below to create one.
Steps
- Log into Observability and select a project.
- Select Integrations from the menu, then click View Available Agents.
- Select the tool from the list.
- Under Step 1: Prerequisites, verify any requirements for the tool have been completed.
- In the New API Key section, create an API key for this agent.
- Enter a name, expiration, and description.
- Configure the key to send events, manage entities, and transmit heartbeat.
- Click Create Key and Continue.
- Under Step 2: Configuration, fill in any remaining variables as needed.
- Required values are noted by an asterisk.
- Some values are pre-populated with project-specific configuration details.
- Click Continue.
- Under Step 3: Deploy, enter the agent's Image Tag (format: vx.x.x).
- Select Docker as the Deployment Location.
- Click Download Script.
- Save the file anywhere it can be accessed by Docker.
- Open a terminal and run the deployment script.
Kubernetes¶
Prerequisites
- A Kubernetes cluster with kubectl access.
- A personal access token for the Databricks instance is required. If you do not have one, refer to Databricks personal access token authentication .
- An API key for this agent — see step 5 below to create one.
Steps
- Log into Observability and select a project.
- Select Integrations from the menu, then click View Available Agents.
- Select the tool from the list.
- Under Step 1: Prerequisites, verify any requirements for the tool have been completed.
- In the New API Key section, create an API key for this agent.
- Enter a name, expiration, and description.
- Configure the key to send events, manage entities, and transmit heartbeat.
- Click Create Key and Continue.
- Under Step 2: Configuration, fill in any remaining variables as needed.
- Required values are noted by an asterisk.
- Some values are pre-populated with project-specific configuration details.
- Click Continue.
- Under Step 3: Deploy, enter the agent's Image Tag (format: vx.x.x).
- Select Kubernetes as the Deployment Location.
- Click Download Script.
- Copy the script to a machine with access to the Kubernetes cluster.
- Open a terminal and run the deployment script.
Configuration variables¶
| Variable | Required? | Description | Value |
|---|---|---|---|
| EVENTS_API_HOST | Yes | The base API URL for the Observability instance. | Enterprise: https://api.datakitchen.io; Docker/minikube: http://<minikube_ip>:8082/api; Docker/localhost: http://host.docker.internal:8082/api; K8s/minikube: http://observability-ui.datakitchen.svc.cluster.local:8082/api |
| EVENTS_API_KEY | Yes | An API key for the Observability project. A key unique to this agent is recommended. Provides authorization for events and agent heartbeat. | |
| EVENTS_PROJECT_ID | Yes | The current project ID. Found in your Observability URL. | https://example-account.datakitchen.io/projects/{the-project-id}/ |
| EXTERNAL_PLUGINS_PATH | Yes | The location of the plugin. Retain the default value. | /plugins |
| PUBLISH_EVENTS | Yes | Indicates to the system events are being published by the external tool. Retain the default value. | True |
| POLLING_INTERVAL_SECS | Yes | The frequency at which the agent captures events. Retain the default value. | 10 |
| LOGGING_MODE | Optional | Use to set the logging level. Accepts values DEBUG or INFO. | INFO |
| MAX_WORKERS | Optional | Sets the number of workers available. Accepts integer values. Default value is 10. Set to 1 for Databricks agents to avoid rate limit errors. | 10 |
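Put together, the core configuration might look like the following environment fragment. The API key and project ID are placeholders, and the host shown is the Enterprise endpoint; substitute the values matching your deployment.

```shell
# Core agent configuration (placeholder values).
EVENTS_API_HOST=https://api.datakitchen.io
EVENTS_API_KEY=your-agent-api-key   # placeholder; from the New API Key step
EVENTS_PROJECT_ID=the-project-id    # placeholder; from your Observability URL
EXTERNAL_PLUGINS_PATH=/plugins
PUBLISH_EVENTS=True
POLLING_INTERVAL_SECS=10
LOGGING_MODE=INFO
MAX_WORKERS=1   # set to 1 for Databricks agents to avoid rate limit errors
```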
Agent-specific variables¶
| Variable | Required? | Description | Value |
|---|---|---|---|
| ENABLED_PLUGINS | Yes | The plugin that represents and specifies the agent. Retain the default value. Accepts a comma-separated list of plugins if you have more than one. | fivetran_log_to_databricks |
| FIVETRAN_DB_SERVER_HOSTNAME | Yes | The hostname of the Databricks SQL interface. | dbc-XXXXXX-d6e7.cloud.databricks.com |
| FIVETRAN_DB_HTTP_PATH | Yes | The HTTP path to either a Databricks SQL endpoint or a Databricks Runtime interactive cluster. | SQL endpoint: /sql/1.0/endpoints/1234567890abcdef; Runtime cluster: /sql/protocolv1/o/1234567890123456/1234-123456-slid123 |
| FIVETRAN_DB_LOG_SCHEMA | Yes | The schema in Databricks where the Fivetran Log Connector sends logs. This was set when the Fivetran Log Connector was set up in Fivetran. Retain the default value. | fivetran_log |
| FIVETRAN_DB_PERSONAL_ACCESS_TOKEN | Yes | The Databricks personal access token that has access to the Fivetran log schema. | |
| FIVETRAN_DB_LOOKBACK | Yes | The maximum number of minutes the agent looks back for new events at startup on a new install. Also enforces a maximum lookback period for agents already initialized. Default is 10 minutes. | 10 |
Tip
A short lookback period is best when Fivetran has a large number of high-frequency jobs.
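The agent-specific variables follow the same pattern. The hostname and HTTP path below reuse the example formats from the table and are placeholders; the token must be a real Databricks personal access token with access to the log schema.

```shell
# Agent-specific configuration (placeholder values).
ENABLED_PLUGINS=fivetran_log_to_databricks
FIVETRAN_DB_SERVER_HOSTNAME=dbc-XXXXXX-d6e7.cloud.databricks.com   # placeholder
FIVETRAN_DB_HTTP_PATH=/sql/1.0/endpoints/1234567890abcdef          # placeholder
FIVETRAN_DB_LOG_SCHEMA=fivetran_log
FIVETRAN_DB_PERSONAL_ACCESS_TOKEN=dapiXXXXXXXX                     # placeholder
FIVETRAN_DB_LOOKBACK=10
```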