import fs from "fs";
import { ConfigurationError } from "@pipedream/platform";
import common from "../common/generate-content.mjs";
import utils from "../../common/utils.mjs";
export default {
...common,
key: "google_gemini-generate-content-from-text-and-image",
name: "Generate Content from Text and Image",
description: "Generates content from both text and image input using the Gemini API. [See the documentation](https://ai.google.dev/tutorials/rest_quickstart#text-and-image_input)",
version: "0.1.1",
type: "action",
props: {
...common.props,
text: {
propDefinition: [
common.props.app,
"text",
],
},
mimeType: {
propDefinition: [
common.props.app,
"mimeType",
],
},
imagePaths: {
propDefinition: [
common.props.app,
"imagePaths",
],
},
responseFormat: {
propDefinition: [
common.props.app,
"responseFormat",
],
},
},
methods: {
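/**
* Reads a local file and converts it into a Gemini inline_data part,
* base64-encoding the file contents and attaching the supplied MIME type.
* Returns undefined when no path is given.
*/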
fileToGenerativePart(path, mimeType) {
if (!path) {
return;
}
return {
inline_data: {
mime_type: mimeType,
data: fs.readFileSync(path).toString("base64"),
},
};
},
},
async run({ $ }) {
const {
app,
model,
text,
imagePaths,
mimeType,
responseFormat,
responseSchema,
maxOutputTokens,
temperature,
topP,
topK,
stopSequences,
} = this;
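// Validate the image input before building the request.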
if (!Array.isArray(imagePaths)) {
throw new ConfigurationError("Image paths must be an array.");
}
if (!imagePaths.length) {
throw new ConfigurationError("At least one image path must be provided.");
}
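// Build a single user turn: the text prompt followed by one inline image part per file path.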
const response = await app.generateContent({
$,
model,
data: {
contents: [
{
parts: [
{
text,
},
// Drop undefined entries returned by fileToGenerativePart for empty paths.
...imagePaths.map((path) => this.fileToGenerativePart(path, mimeType)).filter(Boolean),
],
},
],
...(
responseFormat || maxOutputTokens || temperature || topP || topK || stopSequences?.length
? {
generationConfig: {
// Only request structured JSON output when a response format is selected
// (assumes responseFormat indicates JSON); the tuning options below are sent either way.
...(responseFormat
? {
responseMimeType: "application/json",
responseSchema: utils.parse(responseSchema),
}
: {}),
maxOutputTokens,
temperature,
topP,
topK,
stopSequences,
},
}
: {}
),
},
});
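// Surface a summary in the Pipedream step UI and return the raw API response.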
$.export("$summary", "Successfully generated content from text and image.");
return response;
},
};