
BigQuery - New Row from Google Cloud API

Pipedream makes it easy to connect APIs for Google Cloud and other apps remarkably fast.

Trusted by 200,000+ developers from startups to Fortune 500 companies:


Getting Started

Trigger a workflow on BigQuery - New Row with the Google Cloud API. When you configure and deploy the workflow, it will run on Pipedream's servers 24x7 for free.

  1. Configure the BigQuery - New Row trigger
    1. Connect your Google Cloud account
    2. Configure timer
    3. Configure Event Size
    4. Select a Dataset
    5. Select a Table Name
    6. Select a Unique Key
  2. Add steps to connect to 500+ APIs using code and no-code building blocks
  3. Deploy the workflow
  4. Send a test event to validate your setup
  5. Turn on the trigger
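
Once the trigger is live, each event it emits is a row from the watched table (or an array of rows when Event Size is greater than 1). As a rough sketch, a downstream Node.js code step could read that event via the steps object; the field names below are hypothetical and depend on your table's schema:

// Hypothetical downstream code step. With Event Size = 1 the event is a
// single row object; the "id" and "created_at" fields are examples only.
const row = steps.trigger.event;
console.log(`New row: id=${row.id}, created_at=${row.created_at}`);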

Details

This is a pre-built, open source component from Pipedream's GitHub repo. The component is developed by Pipedream and the community, and verified and maintained by Pipedream.

To contribute an update to an existing component or create a new component, create a PR on GitHub. If you're new to Pipedream component development, you can start with quickstarts for trigger and action development, and then review the component API reference.

BigQuery - New Row on Google Cloud
Description: Emit an event when a new row is added to a table
Version: 0.0.1
Key: google_cloud-bigquery-new-row

Code

const crypto = require("crypto");
const isString = require("lodash/isString");
const common = require("../common/bigquery");

module.exports = {
  ...common,
  key: "google_cloud-bigquery-new-row",
  name: "BigQuery - New Row",
  description: "Emit an event when a new row is added to a table",
  version: "0.0.1",
  dedupe: "unique",
  props: {
    ...common.props,
    tableId: {
      type: "string",
      label: "Table Name",
      description: "The name of the table to watch for new rows",
      async options(context) {
        const { page } = context;
        if (page !== 0) {
          return [];
        }

        const client = this
          .getBigQueryClient()
          .dataset(this.datasetId);
        const [
          tables,
        ] = await client.getTables();
        return tables.map(({ id }) => id);
      },
    },
    uniqueKey: {
      type: "string",
      label: "Unique Key",
      description: `
        The name of a column in the table to use for deduplication. See [the
        docs](https://github.com/PipedreamHQ/pipedream/tree/master/components/google_cloud/sources/bigquery-new-row#technical-details)
        for more info.
      `,
      async options(context) {
        const { page } = context;
        if (page !== 0) {
          return [];
        }

        const columnNames = await this._getColumnNames();
        return columnNames.sort();
      },
    },
  },
  hooks: {
    ...common.hooks,
    async deploy() {
      await this._validateColumn(this.uniqueKey);
      const lastResultId = await this._getIdOfLastRow(this.getInitialEventCount());
      this._setLastResultId(lastResultId);
    },
    async activate() {
      if (this._getLastResultId()) {
        // ID of the last result has already been initialised during deploy(),
        // so we skip the rest of the activation.
        return;
      }

      await this._validateColumn(this.uniqueKey);
      const lastResultId = await this._getIdOfLastRow();
      this._setLastResultId(lastResultId);
    },
    deactivate() {
      this._setLastResultId(null);
    },
  },
  methods: {
    ...common.methods,
    _getLastResultId() {
      return this.db.get("lastResultId");
    },
    _setLastResultId(lastResultId) {
      this.db.set("lastResultId", lastResultId);
      console.log(`
        Next scan of table '${this.tableId}' will start at ${this.uniqueKey}=${lastResultId}
      `);
    },
    /**
     * Utility method to make sure that a certain column exists in the target
     * table. Useful for SQL query sanitizing.
     *
     * @param {string} columnNameToValidate The name of the column to validate
     * for existence
     */
    async _validateColumn(columnNameToValidate) {
      if (!isString(columnNameToValidate)) {
        throw new Error("columnNameToValidate must be a string");
      }

      const columnNames = await this._getColumnNames();
      if (!columnNames.includes(columnNameToValidate)) {
        throw new Error(`Nonexistent column: ${columnNameToValidate}`);
      }
    },
    async _getColumnNames() {
      const table = this
        .getBigQueryClient()
        .dataset(this.datasetId)
        .table(this.tableId);
      const [
        metadata,
      ] = await table.getMetadata();
      const { fields } = metadata.schema;
      return fields.map(({ name }) => name);
    },
    /**
     * Returns the unique-key value of the row at which the next scan should
     * start. With an offset of N, the N most recent rows will be re-emitted
     * by the first scan (deploy() uses this to generate sample events).
     *
     * @param {number} [offset=0] How many of the most recent rows to replay
     */
    async _getIdOfLastRow(offset = 0) {
      const limit = offset + 1;
      const query = `
        SELECT *
        FROM \`${this.tableId}\`
        ORDER BY \`${this.uniqueKey}\` DESC
        LIMIT @limit
      `;
      const queryOpts = {
        query,
        params: {
          limit,
        },
      };
      const rows = await this.getRowsForQuery(queryOpts, this.datasetId);
      if (rows.length === 0) {
        console.log(`
          No records found in the target table, will start scanning from the beginning
        `);
        return;
      }

      const startingRow = rows.pop();
      return startingRow[this.uniqueKey];
    },
    /**
     * Builds the incremental scan query: select every row whose unique key
     * is greater than the last value seen, in ascending order.
     */
    getQueryOpts() {
      const lastResultId = this._getLastResultId();
      const query = `
        SELECT *
        FROM \`${this.tableId}\`
        WHERE \`${this.uniqueKey}\` > @lastResultId
        ORDER BY \`${this.uniqueKey}\` ASC
      `;
      const params = {
        lastResultId,
      };
      return {
        query,
        params,
      };
    },
    generateMeta(row, ts) {
      const id = row[this.uniqueKey];
      const summary = `New row: ${id}`;
      return {
        id,
        summary,
        ts,
      };
    },
    generateMetaForCollection(rows, ts) {
      const hash = crypto.createHash("sha1");
      rows
        .map(i => i[this.uniqueKey])
        .map(i => i.toString())
        .forEach(i => hash.update(i));
      const id = hash.digest("base64");

      const rowCount = rows.length;
      const entity = rowCount === 1
        ? "row"
        : "rows";
      const summary = `${rowCount} new ${entity}`;

      return {
        id,
        summary,
        ts,
      };
    },
  },
};

Configuration

This component may be configured based on the props defined in the component code. Pipedream automatically prompts for input values in the UI and CLI.
Label        | Prop         | Type              | Description
Google Cloud | google_cloud | app               | This component uses the Google Cloud app.
N/A          | db           | $.service.db      | This component uses $.service.db to maintain state between component invocations.
timer        | timer        | $.interface.timer |
Event Size   | eventSize    | integer           | The number of rows to include in a single event (by default, emits 1 event per row)
Dataset      | datasetId    | string            | Select a value from the drop down menu.
Table Name   | tableId      | string            | Select a value from the drop down menu.
Unique Key   | uniqueKey    | string            | Select a value from the drop down menu.

Authentication

Google Cloud uses service account keys for authentication. When you connect your Google Cloud account, Pipedream securely stores the key so you can easily authenticate to Google Cloud APIs in both code and no-code steps.

When you create a service account in GCP, you'll be asked to generate a service account key. Create that key and download the key details in JSON format.

Open the key JSON in a text editor, then copy and paste its contents here.
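
As a minimal sketch, here is how that key can authenticate the BigQuery client in a Node.js code step. It assumes the connected account exposes the key JSON as auths.google_cloud.key_json; the exact variable name in your workflow may differ:

// Minimal sketch; auths.google_cloud.key_json is an assumption about how the
// connected account exposes the service account key inside a code step.
const { BigQuery } = require("@google-cloud/bigquery");
const key = JSON.parse(auths.google_cloud.key_json);
const client = new BigQuery({
  projectId: key.project_id,
  credentials: {
    client_email: key.client_email,
    private_key: key.private_key,
  },
});
// Quick smoke test: list the datasets the service account can see.
const [datasets] = await client.getDatasets();
console.log(datasets.map(({ id }) => id));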

About Google Cloud

The Google Cloud Platform

About Pipedream

Stop writing boilerplate code, struggling with authentication and managing infrastructure. Start connecting APIs with code-level control when you need it — and no code when you don't.

Intro to Pipedream
Watch us build a workflow (4 min)
"The past few weeks, I truly feel like the clichéd 10x engineer."
@heyellieday
Powerful features that scale
Manage concurrency and execution rate

Queue up to 10,000 events per workflow and manage the concurrency and rate at which workflows are triggered.

Process large payloads up to 5 terabytes

Large file support enables you to trigger workflows with any data (e.g., large JSON files, images and videos) up to 5 terabytes.

Return custom responses to HTTP requests

Return any JSON-serializable response from an HTTP triggered workflow using $respond().
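
As a rough sketch of what that looks like in a code step (assuming the workflow's HTTP trigger is configured to return a custom response; the body shown is arbitrary):

// Sketch of returning a custom HTTP response from an HTTP-triggered workflow.
// The status, headers, and body fields are all JSON-serializable values.
$respond({
  status: 200,
  headers: { "content-type": "application/json" },
  body: { ok: true, received_at: new Date().toISOString() },
});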

Use most npm packages

To use any npm package, just require() it -- there's no npm install or package.json required.
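
For instance (axios here is just an arbitrary package choice):

// Any npm package can be pulled in with a bare require(); Pipedream installs
// it automatically the first time the workflow runs. Code steps are async,
// so top-level await works.
const axios = require("axios");
const { data } = await axios.get("https://api.github.com");
console.log(data);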

Maintain state between executions

Use $checkpoint to save state in one workflow invocation and read it the next time your workflow runs.
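
A minimal sketch of that pattern in a code step:

// $checkpoint persists JSON-serializable state across workflow invocations.
// Initialize it on the first run, then read and update it on later runs.
if ($checkpoint == null) {
  $checkpoint = { runs: 0 };
}
$checkpoint.runs += 1;
console.log(`This workflow has run ${$checkpoint.runs} times`);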

Pass data between steps

Return data from any step to inspect it in a human-friendly way and reference the data in future steps via the steps object.
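
For example, in a code step placed after a hypothetical earlier step named get_user:

// steps.trigger.event is the triggering event; earlier code steps expose
// what they return on steps.<step_name>.$return_value. "get_user" and its
// "email" field are hypothetical names for illustration.
const user = steps.get_user.$return_value;
console.log("Trigger event:", steps.trigger.event);
console.log(`User email: ${user.email}`);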