Extract Web Data at Scale
This use case pairs ScrapeNinja scraping with MongoDB actions such as creating a new document in a collection of your choice or executing an aggregation pipeline on a MongoDB collection (see the sketch at the end of this page).
ScrapeNinja API on Pipedream allows you to craft powerful serverless workflows for web scraping without the hassle of managing proxies or browsers. It's a tool that can extract data from websites, handling JavaScript rendering and anti-bot measures with ease. By integrating ScrapeNinja with Pipedream, you can automate data collection, collate and process the scraped data, and connect it to numerous other services for further analysis, alerting, or storage.
import { axios } from '@pipedream/platform';

export default defineComponent({
  props: {
    scrapeninja: {
      type: "app",
      app: "scrapeninja",
    },
  },
  async run({ steps, $ }) {
    // Call the ScrapeNinja /scrape endpoint via RapidAPI, authenticating with
    // the key stored in the connected ScrapeNinja account.
    return await axios($, {
      method: 'POST',
      url: 'https://scrapeninja.p.rapidapi.com/scrape',
      headers: {
        'content-type': 'application/json',
        'X-RapidAPI-Key': this.scrapeninja.$auth.rapid_api_key,
        'X-RapidAPI-Host': 'scrapeninja.p.rapidapi.com',
      },
      data: {
        // Target page to scrape.
        url: "https://news.ycombinator.com/",
      },
    });
  },
})
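The step above returns ScrapeNinja's raw response. As a minimal sketch of the "collate and process" part of the workflow, the following Node.js code step parses that response with cheerio and extracts story titles and links. The step name scrape_page, the assumption that ScrapeNinja returns the page HTML in a body field, and the .titleline > a selector for Hacker News are all assumptions to adjust for your own workflow and target site.

import * as cheerio from 'cheerio';

export default defineComponent({
  async run({ steps, $ }) {
    // Assumed: the scraping step is named "scrape_page" and ScrapeNinja returns
    // the page HTML in a `body` field — adjust both to match your workflow.
    const html = steps.scrape_page.$return_value?.body;
    if (!html) {
      throw new Error("No HTML found in the ScrapeNinja response");
    }

    const $page = cheerio.load(html);
    const stories = [];

    // The ".titleline > a" selector is an assumption about the current
    // Hacker News markup; change it for other target sites.
    $page(".titleline > a").each((_, el) => {
      stories.push({
        title: $page(el).text(),
        url: $page(el).attr("href"),
        scrapedAt: new Date().toISOString(),
      });
    });

    $.export("stories", stories);
    return stories;
  },
})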
The MongoDB API provides powerful capabilities to interact with a MongoDB database, allowing you to perform CRUD (Create, Read, Update, Delete) operations, manage databases, and execute sophisticated queries. With Pipedream, you can harness these abilities to automate tasks, sync data across various apps, and react to events in real-time. It’s a combo that’s particularly potent for managing data workflows, syncing application states, or triggering actions based on changes to your data.
import mongodb from 'mongodb'

export default defineComponent({
  props: {
    mongodb: {
      type: "app",
      app: "mongodb",
    },
    collection: {
      type: "string",
    },
    filter: {
      type: "object",
    },
  },
  async run({ steps, $ }) {
    const MongoClient = mongodb.MongoClient
    // Build the connection string from the connected MongoDB account's credentials.
    const {
      database,
      hostname,
      username,
      password,
    } = this.mongodb.$auth
    const url = `mongodb+srv://${username}:${password}@${hostname}/test?retryWrites=true&w=majority`
    const client = await MongoClient.connect(url, {
      useNewUrlParser: true,
      useUnifiedTopology: true,
    })
    try {
      // Run the find query against the selected collection and export the results.
      const db = client.db(database)
      const results = await db.collection(this.collection).find(this.filter).toArray();
      $.export('results', results);
    } finally {
      // Always close the connection, even if the query throws.
      await client.close()
    }
  },
})
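To close the loop on "Extract Web Data at Scale", here is a minimal sketch of a final code step that writes the scraped records into MongoDB and summarizes them with an aggregation pipeline, mirroring the actions listed at the top of this page. It assumes a prior step exported the parsed records as steps.parse_stories.$return_value and reuses the same connection-string pattern as the component above; rename the step and tweak the pipeline to fit your data.

import mongodb from 'mongodb'

export default defineComponent({
  props: {
    mongodb: {
      type: "app",
      app: "mongodb",
    },
    collection: {
      type: "string",
    },
  },
  async run({ steps, $ }) {
    // Assumed: a previous step exported the parsed records as
    // steps.parse_stories.$return_value — rename to match your workflow.
    const docs = steps.parse_stories.$return_value;
    if (!Array.isArray(docs) || docs.length === 0) {
      throw new Error("No scraped records to store");
    }

    const { database, hostname, username, password } = this.mongodb.$auth;
    const url = `mongodb+srv://${username}:${password}@${hostname}/test?retryWrites=true&w=majority`;
    const client = await mongodb.MongoClient.connect(url);
    try {
      const coll = client.db(database).collection(this.collection);

      // Create the new documents in bulk.
      const { insertedCount } = await coll.insertMany(docs);

      // Execute an aggregation pipeline: count stored documents per scrape day
      // (the first ten characters of the ISO scrapedAt string).
      const summary = await coll.aggregate([
        { $group: { _id: { $substrBytes: ["$scrapedAt", 0, 10] }, count: { $sum: 1 } } },
        { $sort: { count: -1 } },
      ]).toArray();

      $.export("insertedCount", insertedCount);
      $.export("summary", summary);
    } finally {
      await client.close();
    }
  },
})

Grouping on the first ten characters of scrapedAt yields a per-day document count; swap in whatever pipeline stages your reporting needs.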