Scrape a website on a schedule
@dylan
code:
data:privatelast updated:3 years ago
today
Build integrations remarkably fast!
You're viewing a public workflow template.
Sign up to customize, add steps, modify code and more.
Join 800,000+ developers using the Pipedream platform
steps.
trigger
Cron Scheduler
Deploy to configure a custom schedule
This workflow runs on Pipedream's servers and is triggered on a custom schedule.
steps.
nodejs
auth
to use OAuth tokens and API keys in code via theauths object
code
Write any Node.jscodeand use anynpm package. You can alsoexport datafor use in later steps via return or this.key = 'value', pass input data to your code viaparams, and maintain state across executions with$checkpoint.
async (event, steps) => {
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
}
22
const axios = require("axios")
const cheerio = require("cheerio")

async function fetchHTML(url) {
  const { data } = await axios.get(url)
  return cheerio.load(data)
}

const $ = await fetchHTML("https://example.com")

// Print the full HTML
console.log(`Site HTML: ${$.html()}\n\n`)

// Print some specific page content
console.log(`First h1 tag: ${$('h1').text()}`)

// Store the full HTML as a property of $event so we can 
// use it in later steps. See
// https://docs.pipedream.com/notebook/dollar-event/#modifying-event
this.html = $.html()