Puppeteer is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol.
Puppeteer runs in headless mode on Chromium on Pipedream.
Using Puppeteer you can perform tasks including:
import { puppeteer } from '@pipedream/browsers';

export default defineComponent({
  async run({ steps, $ }) {
    const browser = await puppeteer.browser();

    // Interact with the web page programmatically
    // See Puppeteer's Page documentation for available methods:
    // https://pptr.dev/api/puppeteer.page
    const page = await browser.newPage();
    await page.goto('https://pipedream.com/');
    const title = await page.title();
    const content = await page.content();

    // The browser needs to be closed, otherwise the step will hang
    await browser.close();

    return { title, content };
  },
});
No authentication is required to use Puppeteer in your Pipedream workflows. Pipedream publishes a specific NPM package that is compatible with the Pipedream Execution Environment. This package includes the headless Chromium binary needed to run a browser headlessly within your Pipedream workflows.
Simply import this package, launch a browser and navigate using a Puppeteer Page instance.
To get started, import the @pipedream/browsers package into your Node.js code step. Pipedream will automatically install this specialized package, which bundles the dependencies needed to run puppeteer in your code step.
This package exports a puppeteer module that exposes these methods:
- browser(opts?): instantiates a new browser (returns a Puppeteer Browser instance)
- launch(opts?): an alias to browser()
- newPage(): creates a new Puppeteer Page instance and returns both the page and the browser

Note: After awaiting the browser instance, make sure to close the browser at the end of your Node.js code step.
import { puppeteer } from '@pipedream/browsers';

export default defineComponent({
  async run({ steps, $ }) {
    const browser = await puppeteer.browser();
    console.log(browser);

    // get page, perform actions, etc.

    await browser.close();
  },
});
The same @pipedream/browsers package can be used in actions as well as sources.
The steps are the same as usage in Node.js code. Open a browser, create a page, and close the browser at the end of the code step.
Please note: At this time, the memory available to sources is not configurable and is fixed at 256 MB. This is below the 2 GB recommended for workflows.
Get the HTML of a webpage using Puppeteer. See the documentation for details.
Get the title of a webpage using Puppeteer. See the documentation for details.
Capture a screenshot of a page using Puppeteer. See the documentation for details.
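For illustration, a screenshot could be taken from a Node.js code step roughly as follows. This is a sketch under stated assumptions, not the source of the action itself: captureScreenshot is a hypothetical helper name, the puppeteer module is passed in as an argument rather than imported, and the image is returned as a base64 string through the encoding option of Puppeteer's page.screenshot().

```javascript
// Hypothetical helper (not part of @pipedream/browsers): opens `url`,
// captures a full-page screenshot, and returns it as a base64 string.
async function captureScreenshot(puppeteer, url) {
  const browser = await puppeteer.browser();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // With `encoding: 'base64'`, page.screenshot() resolves to a base64 string
    // instead of binary image data.
    const screenshot = await page.screenshot({ fullPage: true, encoding: 'base64' });
    return { url, screenshot };
  } finally {
    // Close the browser whether or not the capture succeeded.
    await browser.close();
  }
}

// Usage inside a code step, assuming
// `import { puppeteer } from '@pipedream/browsers';` at the top of the step:
//
//   return await captureScreenshot(puppeteer, 'https://pipedream.com/');
```

Returning a base64 string keeps the step's return value serializable. Whether that suits later steps better than binary data depends on the workflow.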
Remember to close the browser instance before the step finishes. Otherwise, the browser will keep the step "open" and not transfer control to the next step.
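One way to make sure the browser is closed even when navigation or page logic throws is to wrap the work in try/finally. The sketch below is a minimal illustration. withBrowser is a hypothetical helper, not something exported by @pipedream/browsers, and it takes the puppeteer module as an argument so the pattern can be exercised outside Pipedream.

```javascript
// Hypothetical helper: runs `task` with a fresh browser and always closes
// the browser afterwards, even if `task` throws.
async function withBrowser(puppeteer, task) {
  const browser = await puppeteer.browser();
  try {
    return await task(browser);
  } finally {
    // Closing in `finally` keeps an open browser from holding up the step.
    await browser.close();
  }
}

// Usage inside a code step, assuming
// `import { puppeteer } from '@pipedream/browsers';` at the top of the step:
//
//   const title = await withBrowser(puppeteer, async (browser) => {
//     const page = await browser.newPage();
//     await page.goto('https://pipedream.com/');
//     return page.title();
//   });
```

Any error thrown by the task still propagates to the step, so failures remain visible in the workflow.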
For best results, we recommend increasing the amount of memory available to your workflow to 2 gigabytes. You can adjust the available memory in the workflow settings.