Databricks (Jobs) block

The databricks_job_trigger_and_status_check block triggers a pre-existing Databricks notebook job and monitors its completion status. It acts as a transformation block in a Connect+ dataflow, used to validate and process incoming data before passing it to downstream blocks. The block authenticates with the Databricks API using a personal access token, stores processed files at a specified path, and retries status checks until the job completes or reaches the maximum retry limit.
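The trigger-and-poll behavior described above can be sketched as follows. This is an illustration, not the block's actual implementation: the endpoint paths (/run-now and the run life-cycle states) follow the Databricks Jobs API 2.1, but the function names, polling interval, and error handling are assumptions.

```python
import json
import time
import urllib.request

def trigger_job(base_url, token, job_id):
    """POST {base_url}/run-now to start the pre-existing job; returns the run_id."""
    req = urllib.request.Request(
        f"{base_url}/run-now",
        data=json.dumps({"job_id": job_id}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["run_id"]

def wait_for_completion(get_state, max_retries, interval_seconds=30):
    """Poll until the run reaches TERMINATED or retries are exhausted.

    get_state is a callable returning the run's current life-cycle state
    ("PENDING", "RUNNING", "TERMINATED", ...). Returns True if the run
    finishes within max_retries checks, False otherwise -- at which point
    the block marks the file as failed.
    """
    for _ in range(max_retries):
        if get_state() == "TERMINATED":
            return True
        time.sleep(interval_seconds)
    return False
```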

When to use this block

Use this block when your dataflow requires a Databricks job to complete fully, and its output to be available, before downstream processing continues. If you only need to trigger a Databricks job without waiting for it to finish, use the Databricks Jobs block instead.

Prerequisites

Before configuring this block, make sure you have:

  • The Databricks job ID
  • A Databricks personal access token registered in Connect+
  • The Databricks API URL
  • A file path where Databricks stores its output files
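You can sanity-check the job ID, token, and API URL before configuring the block by fetching the job's metadata. The sketch below assumes the /get endpoint of the Databricks Jobs API 2.1; the helper name and placeholder values are ours.

```python
import json
import urllib.request

def job_metadata_request(base_url, token, job_id):
    """Build an authenticated GET request for {base_url}/get?job_id=<id>."""
    return urllib.request.Request(
        f"{base_url}/get?job_id={job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )

# Usage (hits the live workspace, so the token and URL must be valid):
# req = job_metadata_request(
#     "https://<workspace>.cloud.databricks.com/api/2.1/jobs",
#     "<personal-access-token>",
#     12345,
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["settings"]["name"])
```

A successful response confirms all three prerequisites at once; a 403 points at the token, a 404 at the job ID.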

Configuration fields

| Field name | Required | Description |
| --- | --- | --- |
| Block name | No | A name for the block instance. The name must be alphanumeric, with no character limit. |
| Job ID | Yes | The ID of the Databricks job to trigger. |
| Max Retries | Yes | The maximum number of times the system checks the job status before marking the file as failed. |
| dataBricks Processed Files Path | Yes | The destination file path where Databricks stores the output files after executing the job, regardless of whether the job succeeds or fails. For example, /processed/output/. |
| dataBricksToken | Yes | The personal access token used to authenticate with the Databricks API. Select the token from the dropdown. |
| dataBricksUrl | Yes | The Databricks API URL used to trigger and monitor the job. For example, https://capillary-scrum.cloud.databricks.com/api/2.1/jobs. |