Never write another web scraper. Diffbot structures information from the web, so you don't have to.
Enrich a person or organization record with partial data input. [See the documentation](https://docs.diffbot.com/reference/enhancepost)
Write Python and use any of the 350k+ PyPI packages available. Refer to the Pipedream Python docs to learn more.
Automatically classify a page and extract data according to its type. See the documentation
The Diffbot API enables you to extract structured data from web pages automatically. It transforms the chaos of the web into usable information through web scraping and natural language processing. On Pipedream, you can use Diffbot to monitor changes on websites, extract article data, or process web pages for specific information. By tapping into Pipedream’s ability to integrate with hundreds of other services, you can create powerful workflows that automate data extraction and act on the data in real time.
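As a rough illustration of the "classify a page and extract data according to its type" idea, the sketch below builds a request URL and trims a response down to a few fields. The endpoint path (`/v3/analyze`), the `token` and `url` query parameters, and the `objects`, `type`, and `title` response fields are assumptions based on Diffbot's public API conventions, so check them against the Diffbot documentation before relying on them. The helper names `build_analyze_url` and `summarize_objects` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for Diffbot's page-classification API; verify in the Diffbot docs.
ANALYZE_ENDPOINT = "https://api.diffbot.com/v3/analyze"


def build_analyze_url(token: str, page_url: str) -> str:
    """Return a GET URL asking Diffbot to classify and extract page_url."""
    # urlencode percent-escapes the target URL so it survives as a query value.
    query = urlencode({"token": token, "url": page_url})
    return f"{ANALYZE_ENDPOINT}?{query}"


def summarize_objects(response: dict) -> list:
    """Keep only the type and title of each extracted object, if present."""
    # Assumption: extracted items arrive in a top-level "objects" list.
    return [
        {"type": obj.get("type"), "title": obj.get("title")}
        for obj in response.get("objects", [])
    ]
```

Keeping URL construction and response trimming in small pure functions makes them easy to check without a network call or an API token.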
import { axios } from "@pipedream/platform"
export default defineComponent({
  props: {
    diffbot: {
      type: "app",
      app: "diffbot",
    }
  },
  async run({steps, $}) {
    return await axios($, {
      url: `https://api.diffbot.com/v4/account`,
      headers: {
        "Accept": `application/json`,
      },
      params: {
        token: `${this.diffbot.$auth.api_token}`,
      },
    })
  },
})
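For readers working in a Python step instead, the same account request might look roughly like the sketch below. It uses only the standard library. The way the token is read (`pd.inputs["diffbot"]["$auth"]["api_token"]`) is an assumption about how Pipedream exposes connected accounts to Python steps, and `build_account_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Same endpoint the Node.js component above calls.
ACCOUNT_URL = "https://api.diffbot.com/v4/account"


def build_account_request(api_token: str) -> Request:
    """Build a GET request with the token as a query parameter."""
    query = urlencode({"token": api_token})
    return Request(
        f"{ACCOUNT_URL}?{query}",
        headers={"Accept": "application/json"},
    )


def handler(pd: "pipedream"):
    # Assumption: the connected Diffbot account is available under pd.inputs.
    token = pd.inputs["diffbot"]["$auth"]["api_token"]
    # Send the request and parse the JSON body for later steps.
    with urlopen(build_account_request(token)) as resp:
        return json.load(resp)
```

Only `handler` touches the network, so the request-building part can be inspected on its own.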
Develop, run and deploy your Python code in Pipedream workflows. Integrate seamlessly between no-code steps, with connected accounts, or integrate Data Stores and manipulate files within a workflow.
This includes installing PyPI packages within your code, without having to manage a requirements.txt file or run pip.
Below is an example of using Python to access data from the trigger of the workflow, and sharing it with subsequent workflow steps:
def handler(pd: "pipedream"):
    # Reference data from previous steps
    print(pd.steps["trigger"]["context"]["id"])
    # Return data for use in future steps
    return {"foo": {"test": True}}
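To try a handler like this outside Pipedream, one option is to pass it a small stand-in object. The sketch below assumes the handler only needs a `steps` mapping. `make_fake_pd` is a hypothetical helper, not part of Pipedream, and the handler is a slight variation that returns the trigger ID instead of printing it.

```python
from types import SimpleNamespace


def handler(pd: "pipedream"):
    # Reference data from previous steps
    event_id = pd.steps["trigger"]["context"]["id"]
    # Return data for use in future steps
    return {"event_id": event_id, "foo": {"test": True}}


def make_fake_pd(trigger_id: str) -> SimpleNamespace:
    """Hypothetical stand-in for the object Pipedream passes to handler."""
    # Only the keys the handler reads are filled in.
    return SimpleNamespace(steps={"trigger": {"context": {"id": trigger_id}}})


if __name__ == "__main__":
    # Local smoke run with a made-up trigger ID.
    print(handler(make_fake_pd("evt_123")))
```

The real object Pipedream supplies carries more than `steps`, so this stub only suits handlers that read step data.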