An incredibly powerful web scraper.
Write Python and use any of the 350k+ PyPI packages available. Refer to the Pipedream Python docs to learn more.
The ParseHub API allows you to leverage the power of web scraping directly within Pipedream. By integrating ParseHub, you can automate the collection of data from web pages, manipulate and transform it with Pipedream’s built-in code steps or pre-built actions, and connect it to hundreds of other apps. You can extract structured data from any website, run scraping jobs, retrieve results and integrate with other services for data processing, visualization, or storage.
import { axios } from "@pipedream/platform"
export default defineComponent({
  props: {
    parsehub: {
      type: "app",
      app: "parsehub",
    }
  },
  async run({steps, $}) {
    return await axios($, {
      url: `https://www.parsehub.com/api/v2/projects`,
      params: {
        api_key: `${this.parsehub.$auth.api_key}`,
      },
    })
  },
})
Develop, run and deploy your Python code in Pipedream workflows. Integrate seamlessly with no-code steps and connected accounts, use Data Stores, and manipulate files within a workflow.
This includes installing PyPI packages from within your code, without having to manage a requirements.txt file or run pip.
Below is an example of using Python to access data from the trigger of the workflow, and sharing it with subsequent workflow steps:
def handler(pd: "pipedream"):
    # Reference data from previous steps
    print(pd.steps["trigger"]["context"]["id"])
    # Return data for use in future steps
    return {"foo": {"test": True}}
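The two pieces above can be combined: a Python step can call the same ParseHub projects endpoint that the Node.js example uses. The following is a minimal sketch, not an official implementation. It assumes the connected ParseHub account is exposed to the step as pd.inputs["parsehub"]["$auth"]["api_key"], and the helper name build_projects_url is hypothetical. It uses only the standard library so nothing needs to be installed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the Node.js example earlier on this page
PROJECTS_URL = "https://www.parsehub.com/api/v2/projects"


def build_projects_url(api_key):
    """Hypothetical helper: append the API key as a query parameter."""
    return f"{PROJECTS_URL}?{urlencode({'api_key': api_key})}"


def handler(pd: "pipedream"):
    # Assumption: the connected ParseHub account is available under pd.inputs
    api_key = pd.inputs["parsehub"]["$auth"]["api_key"]
    # Request the project list and parse the JSON response body
    with urlopen(build_projects_url(api_key)) as response:
        projects = json.load(response)
    # Return the parsed response for use in future steps
    return projects
```

Keeping the URL construction in its own function makes it possible to check the request shape without calling the API or exposing a real key.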