with WebScraper.IO and FireCrawl?
Emit new event when a page scraping job has completed.
Creates a scraping job (scrapes a sitemap).
Crawls a given URL and returns the contents of sub-pages.
Creates a sitemap for the selected website.
Extract structured data from one or multiple URLs.
Retrieves a list of scraping jobs for a sitemap.
The WebScraper.IO API allows you to programmatically perform web scraping tasks, extracting structured data from websites. With the API, you can automate the gathering of web content for analysis, monitoring, and integration with other data sources. In Pipedream, you can leverage this API to build workflows that process, analyze, and act on the data you scrape without writing code for backend infrastructure.
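// WebScraper.IO: list the sitemaps in the connected account
import { axios } from "@pipedream/platform"
export default defineComponent({
  props: {
    // Connected WebScraper.IO account; its API key is exposed via $auth
    webscraper_io: {
      type: "app",
      app: "webscraper_io",
    }
  },
  async run({steps, $}) {
    // The API token is sent as a query parameter, not as a header
    return await axios($, {
      url: `https://api.webscraper.io/api/v1/sitemaps`,
      params: {
        api_token: `${this.webscraper_io.$auth.api_key}`,
      },
    })
  },
})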
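The example above only reads the list of sitemaps. As a rough sketch of how the "creates a scraping job" action could be expressed, the helper below builds a request config that could be handed to `axios($, config)` in a Pipedream step. The name `buildScrapingJobRequest` is hypothetical, and the `/scraping-job` path, the body fields (`sitemap_id`, `driver`, `page_load_delay`, `request_interval`), and the default values are assumptions that should be checked against the WebScraper.IO API documentation.

```javascript
// Hypothetical helper: builds the request config for starting a scraping job.
// The endpoint path, body field names, and defaults are assumptions, not confirmed by this page.
function buildScrapingJobRequest(apiToken, sitemapId, options = {}) {
  if (!apiToken) throw new Error("apiToken is required");
  if (!Number.isInteger(sitemapId)) throw new Error("sitemapId must be an integer");
  return {
    method: "post",
    url: "https://api.webscraper.io/api/v1/scraping-job",
    // Same query-parameter authentication as the sitemap listing example
    params: { api_token: apiToken },
    data: {
      sitemap_id: sitemapId,
      driver: options.driver ?? "fast",
      page_load_delay: options.pageLoadDelay ?? 2000,
      request_interval: options.requestInterval ?? 2000,
    },
  };
}

// Usage: build the config, then pass it to axios($, config) inside run()
const jobRequest = buildScrapingJobRequest("YOUR_API_TOKEN", 123);
console.log(jobRequest.url);
```

Keeping the request construction in a plain function makes it possible to check the payload without calling the API.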
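// FireCrawl: start a crawl of a site and its sub-pages
import { axios } from "@pipedream/platform"
export default defineComponent({
  props: {
    // Connected FireCrawl account; its API key is exposed via $auth
    firecrawl: {
      type: "app",
      app: "firecrawl",
    }
  },
  async run({steps, $}) {
    // Request body: the URL where the crawl starts
    const data = {
      "url": "https://pipedream.com",
    }
    // The API key is sent as a Bearer token in the Authorization header
    return await axios($, {
      method: "post",
      url: `https://api.firecrawl.dev/v0/crawl`,
      headers: {
        Authorization: `Bearer ${this.firecrawl.$auth.api_key}`,
      },
      data,
    })
  },
})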
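The FireCrawl example above hard-codes the crawl target. A minimal sketch of parameterizing it follows, assuming the same endpoint and Bearer authentication shown above. The helper name `buildCrawlRequest` is hypothetical, and the restriction to `http` and `https` URLs is an illustrative choice rather than a documented requirement.

```javascript
// Hypothetical helper: builds the crawl request config for an arbitrary start URL.
// Endpoint and header shape are copied from the example above.
function buildCrawlRequest(apiKey, targetUrl) {
  if (!apiKey) throw new Error("apiKey is required");
  // new URL() throws a TypeError when targetUrl is not a valid absolute URL
  const parsed = new URL(targetUrl);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("targetUrl must use http or https");
  }
  return {
    method: "post",
    url: "https://api.firecrawl.dev/v0/crawl",
    headers: { Authorization: `Bearer ${apiKey}` },
    // parsed.href is the normalized form of the URL
    data: { url: parsed.href },
  };
}

// Usage: build the config, then pass it to axios($, config) inside run()
const crawlRequest = buildCrawlRequest("YOUR_API_KEY", "https://pipedream.com");
console.log(crawlRequest.data.url);
```

Validating the URL before the request is sent lets a malformed input fail inside the workflow step instead of surfacing as an API error.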