Octoparse

Easy web scraping for anyone. No code is the best code. Octoparse lets anyone build the reliable web scrapers they need, with no coding required.

Integrate the Octoparse API with the Python API

Set up the Octoparse API trigger to run a workflow that integrates with the Python API. Pipedream's integration platform lets you connect Octoparse and Python quickly. Free for developers.

Run Python Code with the Python API

Write Python and use any of the 350k+ PyPI packages available. Refer to the Pipedream Python docs to learn more.

 

Overview of Octoparse

The Octoparse API allows you to automate the extraction of web data without coding, making it a powerful tool for data-driven workflows. With this API, you can control your scraping tasks, retrieve extracted data, and manage your account programmatically. When combined with Pipedream's serverless execution environment, you can build custom workflows to process, store, or act upon the data fetched by Octoparse. This integration can be a cornerstone for solutions in market research, competitor analysis, price monitoring, or lead generation.

Connect Octoparse

import { axios } from "@pipedream/platform"
export default defineComponent({
  props: {
    // Connected Octoparse account; supplies the OAuth access token used below
    octoparse: {
      type: "app",
      app: "octoparse",
    }
  },
  async run({steps, $}) {
    // Call the taskGroup endpoint, authenticating with the connected account's token
    return await axios($, {
      url: "https://openapi.octoparse.com/taskGroup",
      headers: {
        Authorization: `Bearer ${this.octoparse.$auth.oauth_access_token}`,
      },
    })
  },
})
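
For readers working in a Python step instead, the same request can be sketched with the standard library. This is a minimal sketch under stated assumptions, not an official Octoparse or Pipedream client: the helper name `build_task_group_request` is hypothetical, the URL and Bearer header are copied from the Node.js example above, and the access token is passed in as a plain argument because how it reaches your Python step depends on your setup.

```python
import urllib.request

# URL copied from the Node.js example above.
TASK_GROUP_URL = "https://openapi.octoparse.com/taskGroup"


def build_task_group_request(access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the taskGroup endpoint.

    Hypothetical helper for illustration only.
    """
    if not access_token:
        raise ValueError("access_token must be a non-empty string")
    return urllib.request.Request(
        TASK_GROUP_URL,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


# "example-token" is a placeholder, not a real credential.
request = build_task_group_request("example-token")
print(request.full_url)
print(request.get_header("Authorization"))
```

Building the request separately from sending it keeps the sketch testable offline; passing the result to `urllib.request.urlopen` would perform the actual call.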

Overview of Python

Develop, run, and deploy your Python code in Pipedream workflows. Python steps work alongside no-code steps and connected accounts, and they can use Data Stores and manipulate files within a workflow.

This includes installing PyPI packages within your code, without having to manage a requirements.txt file or run pip.

Below is an example of using Python to access data from the workflow's trigger and share it with subsequent workflow steps:

Connect Python

def handler(pd: "pipedream"):
    # Reference data from previous steps
    print(pd.steps["trigger"]["context"]["id"])
    # Return data for use in future steps
    return {"foo": {"test": True}}
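
The same handler pattern could be used to post-process rows that an earlier Octoparse step returned. The sketch below rests on several assumptions: the step name `octoparse`, the `"$return_value"` key used to read that step's result, and the `title` field are all illustrative and may differ in your workflow. A small stand-in object replaces the `pd` argument so the example can run outside Pipedream.

```python
from types import SimpleNamespace


def handler(pd: "pipedream"):
    # Assumption: an earlier step named "octoparse" returned a list of row dicts.
    rows = pd.steps["octoparse"]["$return_value"]
    # Keep only rows with a non-empty "title" field (hypothetical field name).
    titled = [row for row in rows if row.get("title")]
    # Return data for use in future steps.
    return {"count": len(titled), "rows": titled}


# Local stand-in for the object Pipedream passes to handler(); illustration only.
fake_pd = SimpleNamespace(
    steps={
        "octoparse": {
            "$return_value": [{"title": "A"}, {"title": ""}, {"price": 3}],
        }
    }
)
print(handler(fake_pd))
```

Inside a real workflow only the `handler` function is needed; the stand-in exists purely so the filtering logic can be checked locally.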