with Automatic Data Extraction and ZenRows?
Extract data from a specified URL.
Scrape the HTML of a URL with CSS selectors.
The Automatic Data Extraction API by Zyte extracts structured data from web pages. In a Pipedream workflow, it automates web data collection for tasks such as market research, price monitoring, or lead generation. Pipedream can trigger the workflow on new inputs, process and store the extracted data, and pass it on to other apps.
import { axios } from "@pipedream/platform"

export default defineComponent({
  props: {
    automatic_data_extraction: {
      type: "app",
      app: "automatic_data_extraction",
    }
  },
  async run({steps, $}) {
    // The request body is a JSON array with one query object per page to extract.
    const data = JSON.stringify([{
      'url': 'http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html',
      'pageType': 'product',
    }]);
    return await axios($, {
      method: "post",
      url: `https://autoextract.scrapinghub.com/v1/extract`,
      headers: {
        "Content-Type": `application/json`,
      },
      // The API key is sent as the Basic auth username with an empty password.
      auth: {
        username: `${this.automatic_data_extraction.$auth.api_key}`,
        password: ``,
      },
      data,
    })
  },
})
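The component above sends a single query. As a minimal sketch, assuming the endpoint accepts several query objects in the same array (the example body is already an array), a small helper could build the body for a list of URLs. The name `buildExtractQueries` is hypothetical and is not part of Pipedream or the Zyte API.

```javascript
// Hypothetical helper (not part of any SDK): builds the JSON body for the
// extract endpoint from a list of URLs that share one pageType.
function buildExtractQueries(urls, pageType = "product") {
  if (!Array.isArray(urls) || urls.length === 0) {
    throw new Error("urls must be a non-empty array");
  }
  // One { url, pageType } object per page, mirroring the single-query example.
  return JSON.stringify(urls.map((url) => ({ url, pageType })));
}
```

Inside `run`, the result could replace the hard-coded `data` value, for example `const data = buildExtractQueries([urlA, urlB])`, where `urlA` and `urlB` are placeholders for the pages you want to extract.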
The ZenRows API specializes in web scraping and handles obstacles such as CAPTCHAs, JavaScript rendering, and proxy rotation so that data extraction succeeds. In Pipedream, you can pair the ZenRows API with other services to build automated workflows that respond to events, process and analyze scraped data, or trigger actions based on the data collected. Whether you need to monitor changes on web pages, aggregate content for analysis, or feed scraped data into other applications, the ZenRows integration on Pipedream simplifies these tasks.
import { axios } from "@pipedream/platform"

export default defineComponent({
  props: {
    zenrows: {
      type: "app",
      app: "zenrows",
    }
  },
  async run({steps, $}) {
    // The API key and the target URL are both passed as query parameters.
    return await axios($, {
      url: `https://api.zenrows.com/v1/`,
      params: {
        apikey: `${this.zenrows.$auth.api_key}`,
        url: `https://httpbin.io/anything`,
      },
    })
  },
})
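The request above returns the raw response for the target URL. For scraping with CSS selectors, the following is a rough sketch of how the query parameters might be assembled. It assumes ZenRows accepts a `css_extractor` parameter containing a JSON-encoded map of field names to selectors and a `js_render` flag for JavaScript rendering. Neither parameter appears in the example above, so check both against the ZenRows documentation. The helper name `buildZenRowsParams` is hypothetical.

```javascript
// Hypothetical helper: assembles query parameters for a ZenRows request.
// `css_extractor` and `js_render` are assumed parameter names; verify them
// against the ZenRows API documentation before relying on this.
function buildZenRowsParams(apiKey, targetUrl, selectors = {}, jsRender = false) {
  const params = { apikey: apiKey, url: targetUrl };
  // Only send css_extractor when at least one selector was provided.
  if (Object.keys(selectors).length > 0) {
    params.css_extractor = JSON.stringify(selectors);
  }
  // Request JavaScript rendering only when explicitly asked for.
  if (jsRender) {
    params.js_render = "true";
  }
  return params;
}
```

The returned object could be passed as `params` in the `axios` call, for example `params: buildZenRowsParams(this.zenrows.$auth.api_key, "https://httpbin.io/anything", { title: "h1" })`.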