This topic was automatically generated from Slack. You can find the original thread here.
Guys, I’m trying to use Puppeteer in Python (pyppeteer), but I’m getting this error: [Errno 30] Read-only file system: '/home/sbx_user1051'
What might be the problem?
Sure, here’s an example of how you can use Puppeteer in Node.js code steps using the @pipedream/browsers package:

    import { puppeteer } from "@pipedream/browsers";

    export default defineComponent({
      async run({ steps, $ }) {
        // Launch a browser instance
        const browser = await puppeteer.browser();

        // Open a new page and navigate to the target URL
        const page = await browser.newPage();
        await page.goto("https://example.com");

        // Read the page title and log it
        const title = await page.title();
        console.log(title);

        // Close the browser before the step finishes
        await browser.close();
        return title;
      },
    });

In this example, we’re launching a new browser instance, opening a new page, navigating to “https://example.com”, getting the title of the page, logging it to the console, and then closing the browser. The title of the page is also returned as the result of the step.

Please note that the @pipedream/browsers package is a wrapper around Puppeteer that’s optimized for use in Pipedream’s serverless environment. The browser object it returns exposes the same API as Puppeteer, so you can refer to the Puppeteer documentation for more information on how to use it.
Given an array of URLs, can you show me how to map over each one and use the puppeteer export from @pipedream/browsers to retrieve the HTML content of each URL? The resulting array of Promises should be passed to Promise.all, so that the step execution waits until all the promises are finished before control moves to the next step in the workflow.
Sure, here’s an example of how you can use the puppeteer export from @pipedream/browsers to retrieve the HTML content of each URL, with the resulting array of Promises passed to Promise.all. This ensures that the step execution waits until all the promises are finished before moving to the next step in the workflow:

    import { puppeteer } from "@pipedream/browsers";

    export default defineComponent({
      async run({ steps, $ }) {
        const websites = ["https://www.google.com", "https://www.facebook.com"];
        const browser = await puppeteer.browser();

        try {
          // map with an async callback returns one Promise per URL
          const promises = websites.map(async (url) => {
            const page = await browser.newPage();
            await page.goto(url);
            return await page.content();
          });

          // Wait for every page to finish loading before returning
          const htmlContents = await Promise.all(promises);
          return htmlContents;
        } finally {
          // Close the browser even if one of the pages failed
          await browser.close();
        }
      },
    });

In this example, we’re launching a new browser instance and mapping over each URL in the websites array. For each URL we open a new page, navigate to the URL, and get the HTML content of the page. Because the callback is async, map returns an array of Promises, which we store in promises. We then use Promise.all to wait for all of those Promises to resolve, storing the resulting array of HTML contents in htmlContents. Finally, we close the browser and return htmlContents as the result of the step. Note that Promise.all rejects as soon as any one page fails, so a single bad URL makes the whole step fail.
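If the array of URLs is long, opening every page at once may use more memory than the step has available, and one failing URL will reject the whole Promise.all. A minimal sketch of one way around both issues is below. It assumes a hypothetical helper named mapInBatches (it is not part of @pipedream/browsers or Puppeteer) that processes the URLs a few at a time and uses Promise.allSettled so that each URL reports its own success or failure.

```javascript
// Hypothetical helper, not part of @pipedream/browsers or Puppeteer.
// Runs `worker` over `items` in batches of `batchSize` and waits for each
// batch to settle before starting the next one.
async function mapInBatches(items, batchSize, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // allSettled never rejects; each entry records fulfilled or rejected
    const settled = await Promise.allSettled(batch.map((item) => worker(item)));
    for (const outcome of settled) {
      results.push(
        outcome.status === "fulfilled"
          ? { ok: true, value: outcome.value }
          : { ok: false, error: String(outcome.reason) }
      );
    }
  }
  return results;
}

// Possible use inside the step, with `browser` created as in the example above:
//
//   const results = await mapInBatches(websites, 2, async (url) => {
//     const page = await browser.newPage();
//     try {
//       await page.goto(url);
//       return await page.content();
//     } finally {
//       await page.close();
//     }
//   });
```

The results come back in the same order as the input URLs, so results[i] corresponds to websites[i]. The batch size of 2 is only an illustration; a suitable value depends on the memory available to the workflow.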