Databricks is the lakehouse company, helping data teams solve the world’s toughest problems.
The Databricks API allows you to interact programmatically with Databricks services, enabling you to manage clusters, jobs, notebooks, and other resources within Databricks environments. Through Pipedream, you can leverage these APIs to create powerful automations and integrate with other apps for enhanced data processing, transformation, and analytics workflows. This unlocks possibilities like automating cluster management, dynamically running jobs based on external triggers, and orchestrating complex data pipelines with ease.
import { axios } from "@pipedream/platform"
export default defineComponent({
  props: {
    databricks: {
      type: "app",
      app: "databricks",
    }
  },
  async run({steps, $}) {
    return await axios($, {
      url: `https://${this.databricks.$auth.domain}.cloud.databricks.com/api/2.0/clusters/list`,
      headers: {
        Authorization: `Bearer ${this.databricks.$auth.access_token}`,
      },
    })
  },
})
Automated Cluster Management: Set up workflows on Pipedream to monitor cluster performance metrics and automatically scale clusters up or down based on predefined rules. This can help optimize costs and ensure performance without manual intervention.
Dynamic Job Triggering with GitHub: Create a workflow that triggers a Databricks job whenever a new commit is pushed to a specific GitHub repository. This can be used for continuous integration and deployment (CI/CD) of data processing tasks, such as ETL jobs or machine learning model training.
Event-Driven Data Pipelines with Amazon S3: Construct a serverless data pipeline on Pipedream that kicks off a Databricks job when a new file is uploaded to an Amazon S3 bucket. Use this workflow to process and analyze data in near-real-time, enabling quicker insights and decision-making.
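As a rough illustration of the GitHub-triggered use case, the sketch below maps a push event to the body of a job run request. The function name `pushToRunRequest`, the `jobId` and `branch` options, and the choice to pass the commit SHA as a notebook parameter are assumptions made for this example. `ref`, `after`, and `repository.full_name` are fields of GitHub's push webhook payload.

```javascript
// Hypothetical helper: decide whether a GitHub push event should start a job.
// Returns a request body for a job run, or null when the push is ignored.
function pushToRunRequest(event, options) {
  const branch = options.branch || "main";
  // GitHub push payloads carry the full ref, e.g. "refs/heads/main".
  if (!event || event.ref !== `refs/heads/${branch}`) {
    return null;
  }
  return {
    job_id: options.jobId,
    // Assumption: the job's notebook reads these parameters.
    notebook_params: {
      commit: event.after,
      repository: event.repository ? event.repository.full_name : "",
    },
  };
}

const exampleBody = pushToRunRequest(
  { ref: "refs/heads/main", after: "abc123", repository: { full_name: "acme/etl" } },
  { jobId: 42 }
);
console.log(exampleBody);
```

Returning `null` for pushes to other branches lets the workflow end early without calling Databricks at all.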
Retrieve the output and metadata of a single task run. See the documentation
Run a job now and return the id of the triggered run. See the documentation
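The two actions above roughly correspond to requests like the following. This is a minimal sketch, not Pipedream's implementation: the helper names are hypothetical, and the `/api/2.1/jobs/run-now` and `/api/2.1/jobs/runs/get-output` paths assume version 2.1 of the Databricks Jobs API. Each helper returns a config object in the same shape as the one passed to `axios($, {...})` in the component example.

```javascript
// Hypothetical helpers that build request configs for the two actions.
// `auth` mirrors this.databricks.$auth: { domain, access_token }.
function authHeaders(auth) {
  return { Authorization: `Bearer ${auth.access_token}` };
}

// Config for triggering a job run; the response is expected to include a run_id.
function runNowConfig(auth, jobId) {
  return {
    method: "POST",
    url: `https://${auth.domain}.cloud.databricks.com/api/2.1/jobs/run-now`,
    headers: authHeaders(auth),
    data: { job_id: jobId },
  };
}

// Config for fetching the output and metadata of a single task run.
function getRunOutputConfig(auth, runId) {
  return {
    method: "GET",
    url: `https://${auth.domain}.cloud.databricks.com/api/2.1/jobs/runs/get-output`,
    headers: authHeaders(auth),
    params: { run_id: runId },
  };
}

// Placeholder credentials, for illustration only.
const demoAuth = { domain: "1234", access_token: "example-token" };
console.log(runNowConfig(demoAuth, 42).url);
console.log(getRunOutputConfig(demoAuth, 1001).params);
```

Inside a Pipedream code step, these configs could be passed along the lines of `await axios($, runNowConfig(this.databricks.$auth, jobId))`, with the returned run id then fed to `getRunOutputConfig`.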
Databricks uses API keys for authentication. When you connect your Databricks account, Pipedream securely stores the keys so you can easily authenticate to Databricks APIs in both code and no-code steps.
To retrieve your Personal Access Token:
Your domain is 1234 if your Databricks instance URL is https://1234.cloud.databricks.com/, except for the Accounts API, whose domain is account regardless of instance.
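To make the domain rule concrete, here is a small sketch that derives the domain value from an instance URL. The helper name `domainFromInstanceUrl` is hypothetical, and the code assumes the instance is hosted under `cloud.databricks.com` as in the example URL above; other host patterns are rejected rather than guessed at.

```javascript
// Hypothetical helper: extract the `domain` value from a Databricks instance URL.
function domainFromInstanceUrl(instanceUrl) {
  const { hostname } = new URL(instanceUrl);
  const suffix = ".cloud.databricks.com";
  // Assumption: only hosts of the form <domain>.cloud.databricks.com are handled.
  if (!hostname.endsWith(suffix)) {
    throw new Error(`Unexpected Databricks host: ${hostname}`);
  }
  return hostname.slice(0, -suffix.length);
}

console.log(domainFromInstanceUrl("https://1234.cloud.databricks.com/")); // prints "1234"
```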