Scrape with JS Rendering with ScrapeNinja API on New Requests from HTTP / Webhook API

Pipedream makes it easy to connect APIs for ScrapeNinja, HTTP / Webhook and 3,000+ other apps remarkably fast.

Trigger workflow on

New Requests from the HTTP / Webhook API

Next, do this

Scrape with JS Rendering with the ScrapeNinja API

Getting Started#

This integration creates a workflow with an HTTP / Webhook trigger and a ScrapeNinja action. When you configure and deploy the workflow, it will run on Pipedream's servers 24/7 for free.

Select this integration
Configure the New Requests trigger
1. Optional: Configure Body Only
2. Optional: Configure Response Status Code
3. Optional: Configure Response Content-Type
4. Optional: Configure Response Body
5. Connect your HTTP / Webhook account
Configure the Scrape with JS Rendering action
1. Connect your ScrapeNinja account
2. Configure URL
3. Optional: Configure Wait For Selector
4. Optional: Configure Post Wait Time
5. Optional: Configure Dump Iframe
6. Optional: Configure Wait For Selector Iframe
7. Optional: Configure Extractor Target Iframe
8. Optional: Configure Headers
9. Optional: Configure Retry Number
10. Optional: Configure Geo
11. Optional: Configure Proxy
12. Optional: Configure Timeout
13. Optional: Configure Text Not Expected
14. Optional: Configure Status Not Expected
15. Optional: Configure Block Images
16. Optional: Configure Block Media
17. Optional: Configure Screenshot
18. Optional: Configure Catch Ajax Headers URL Mask
19. Optional: Configure Viewport Width
20. Optional: Configure Viewport Height
21. Optional: Configure Viewport Device Scale Factor
22. Optional: Configure Viewport Has Touch
23. Optional: Configure Viewport Is Mobile
24. Optional: Configure Viewport Is Landscape
25. Optional: Configure Extractor
Deploy the workflow
Send a test event to validate your setup
Turn on the trigger
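
One way to send a test event is from a short Node.js script. The sketch below only builds the request so the payload can be inspected before anything is sent. The endpoint URL is a placeholder for the unique URL Pipedream generates for your trigger, and the `url` field in the body is an assumed name, not something the trigger requires.

```javascript
// Placeholder: replace with the unique URL shown for your HTTP / Webhook trigger.
const ENDPOINT_URL = "https://YOUR-ENDPOINT.m.pipedream.net";

// Build the request without sending it, so the payload can be checked first.
function buildTestRequest(endpointUrl, targetUrl) {
  return {
    url: endpointUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // `url` is an assumed field name; use whatever your workflow reads.
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

const request = buildTestRequest(ENDPOINT_URL, "https://example.com");
console.log(request.options.body);

// To actually send it (Node.js 18+ provides a global fetch):
// const res = await fetch(request.url, request.options);
// console.log(res.status, await res.text());
```

With the trigger's default settings, the response to such a request should carry the configured Response Status Code and Response Body.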

Details#

This integration uses pre-built, source-available components from Pipedream's GitHub repo. These components are developed by Pipedream and the community, and verified and maintained by Pipedream.

To contribute an update to an existing component or create a new component, create a PR on GitHub. If you're new to Pipedream component development, you can start with the quickstarts for trigger and action development, and then review the component API reference.

Trigger#

New Requests on HTTP / Webhook

Description: Get a URL and emit the full HTTP event on every request (including headers and query parameters). You can also configure the HTTP response code, body, and more.

Version: 0.1.1

Key: http-new-requests

HTTP / Webhook Overview#

Build, test, and send HTTP requests without code using your Pipedream workflows. The HTTP / Webhook action is a tool to build HTTP requests with a Postman-like graphical interface.

Point and click HTTP requests

Define the target URL, HTTP verb, headers, query parameters, and payload body without writing custom code.

[Screenshot: Pipedream's HTTP request configuration interface with a GET request to 'https://api.openai.com/v1/models'. The 'Auth' tab is highlighted, and two headers are configured: 'User-Agent' set to 'pipedream/1' and 'Authorization' set to 'Bearer {{openai_api_key}}', so the connected OpenAI account's API key is inserted into the headers automatically.]

Here's an example workflow that uses the HTTP / Webhook action to send an authenticated API request to OpenAI.

Focus on integrating, not authenticating

This action can also use your connected accounts with third-party APIs. Selecting an integrated app will automatically update the request’s headers to authenticate with the app properly, and even inject your token dynamically.

Pipedream integrates with thousands of APIs. If you can’t find an integration for the service you need, store your credentials in Environment Variables and reference them in your request headers to authenticate.
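
As a rough illustration of the environment-variable approach in a Node.js step, the sketch below builds request headers from a variable. The variable name `MY_SERVICE_API_KEY` and the `Bearer` scheme are assumptions; check which header and scheme the target API expects.

```javascript
// Build request headers from an environment variable.
// MY_SERVICE_API_KEY is a hypothetical variable name.
function buildAuthHeaders(env, variableName = "MY_SERVICE_API_KEY") {
  const token = env[variableName];
  if (!token) {
    throw new Error(`Environment variable ${variableName} is not set`);
  }
  return {
    "User-Agent": "pipedream/1",
    // Assumed scheme: many APIs use Bearer tokens, but not all do.
    "Authorization": `Bearer ${token}`,
  };
}

// In a real step you would pass process.env; a literal is used here for illustration.
const headers = buildAuthHeaders({ MY_SERVICE_API_KEY: "example-token" });
console.log(headers.Authorization);
```

Failing fast when the variable is missing tends to be easier to debug than sending an unauthenticated request and interpreting the API's error.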

Compatible with no-code actions, Node.js, and Python

The HTTP / Webhook action exports HTTP response data for use in subsequent workflow steps, enabling easy data transformation, further API calls, database storage, and more.

Response data is available for both coded (Node.js, Python) and no-code steps within your workflow.

Trigger Code#

import http from "../../http.app.mjs";

// Core HTTP component
export default {
  key: "http-new-requests",
  name: "New Requests",
  description: "Get a URL and emit the full HTTP event on every request (including headers and query parameters). You can also configure the HTTP response code, body, and more.",
  version: "0.1.1",
  type: "source",
  props: {
    httpInterface: {
      type: "$.interface.http",
      customResponse: true,
    },
    emitBodyOnly: {
      type: "boolean",
      label: "Body Only",
      description: "This source emits an event representing the full HTTP request by default. Select `true` to emit the body only.",
      optional: true,
      default: false,
    },
    resStatusCode: {
      type: "string",
      label: "Response Status Code",
      description: "The status code to return in the HTTP response",
      optional: true,
      default: "200",
    },
    resContentType: {
      type: "string",
      label: "Response Content-Type",
      description: "The `Content-Type` of the body returned in the HTTP response",
      optional: true,
      default: "application/json",
    },
    resBody: {
      type: "string",
      label: "Response Body",
      description: "The body to return in the HTTP response",
      optional: true,
      default: "{ \"success\": true }",
    },
    http,
  },
  async run(event) {
    const summary = `${event.method} ${event.path}`;

    this.httpInterface.respond({
      status: this.resStatusCode,
      body: this.resBody,
      headers: {
        "content-type": this.resContentType,
      },
    });

    if (this.emitBodyOnly) {
      this.$emit(event.body, {
        summary,
      });
    } else {
      this.$emit(event, {
        summary,
      });
    }
  },
};
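
To make the effect of the `emitBodyOnly` and response props concrete, here is a small stand-alone sketch that mirrors the logic of `run()` above using plain objects instead of Pipedream's runtime. The `simulateTrigger` helper and the shape of `sampleEvent` are illustrative assumptions, not part of the component.

```javascript
// Stand-in for the source's run() logic. `props` mirrors the prop names
// in the component above; nothing here talks to Pipedream.
function simulateTrigger(props, event) {
  const summary = `${event.method} ${event.path}`;
  const response = {
    status: props.resStatusCode,
    body: props.resBody,
    headers: { "content-type": props.resContentType },
  };
  // Body Only emits just the request body; otherwise the full event is emitted.
  const emitted = props.emitBodyOnly ? event.body : event;
  return { response, emitted, summary };
}

// Default prop values, copied from the component definition.
const defaults = {
  emitBodyOnly: false,
  resStatusCode: "200",
  resContentType: "application/json",
  resBody: "{ \"success\": true }",
};

// A made-up incoming request; the real event may carry more fields.
const sampleEvent = {
  method: "POST",
  path: "/",
  headers: { "content-type": "application/json" },
  query: {},
  body: { url: "https://example.com" },
};

const full = simulateTrigger(defaults, sampleEvent);
const bodyOnly = simulateTrigger({ ...defaults, emitBodyOnly: true }, sampleEvent);
console.log(full.summary); // POST /
console.log(bodyOnly.emitted);
```

Downstream steps therefore need to know which mode is active: with Body Only enabled, the payload is the event itself rather than a `body` property on it.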

Trigger Configuration#

This component may be configured based on the props defined in the component code. Pipedream automatically prompts for input values in the UI and CLI.

Label	Prop	Type	Description
N/A	`httpInterface`	`$.interface.http`	This component uses `$.interface.http` to generate a unique URL when the component is first instantiated. Each request to the URL will trigger the `run()` method of the component.
Body Only	`emitBodyOnly`	`boolean`	This source emits an event representing the full HTTP request by default. Select `true` to emit the body only.
Response Status Code	`resStatusCode`	`string`	The status code to return in the HTTP response
Response Content-Type	`resContentType`	`string`	The `Content-Type` of the body returned in the HTTP response
Response Body	`resBody`	`string`	The body to return in the HTTP response
HTTP / Webhook	`http`	`app`	This component uses the HTTP / Webhook app.

Trigger Authentication#

The HTTP / Webhook API does not require authentication.

About HTTP / Webhook#

Get a unique URL where you can send HTTP or webhook requests

Action#

Scrape with JS Rendering on ScrapeNinja

Description: Uses the ScrapeNinja real Chrome browser engine to scrape pages that require JS rendering. [See the documentation](https://scrapeninja.net/docs/api-reference/scrape-js/)

Version: 0.0.2

Key: scrapeninja-scrape-with-js-rendering

ScrapeNinja Overview#

ScrapeNinja API on Pipedream allows you to craft powerful serverless workflows for web scraping without the hassle of managing proxies or browsers. It's a tool that can extract data from websites, handling JavaScript rendering and anti-bot measures with ease. By integrating ScrapeNinja with Pipedream, you can automate data collection, collate and process the scraped data, and connect it to numerous other services for further analysis, alerting, or storage.

Action Code#

import { ConfigurationError } from "@pipedream/platform";
import {
  clearObj,
  parseError, parseObject,
} from "../../common/utils.mjs";
import scrapeninja from "../../scrapeninja.app.mjs";

export default {
  key: "scrapeninja-scrape-with-js-rendering",
  name: "Scrape with JS Rendering",
  description: "Uses the ScrapeNinja real Chrome browser engine to scrape pages that require JS rendering. [See the documentation](https://scrapeninja.net/docs/api-reference/scrape-js/)",
  version: "0.0.2",
  annotations: {
    destructiveHint: false,
    openWorldHint: true,
    readOnlyHint: false,
  },
  type: "action",
  props: {
    scrapeninja,
    url: {
      propDefinition: [
        scrapeninja,
        "url",
      ],
    },
    waitForSelector: {
      propDefinition: [
        scrapeninja,
        "waitForSelector",
      ],
      optional: true,
    },
    postWaitTime: {
      propDefinition: [
        scrapeninja,
        "postWaitTime",
      ],
      optional: true,
    },
    dumpIframe: {
      propDefinition: [
        scrapeninja,
        "dumpIframe",
      ],
      optional: true,
    },
    waitForSelectorIframe: {
      propDefinition: [
        scrapeninja,
        "waitForSelectorIframe",
      ],
      optional: true,
    },
    extractorTargetIframe: {
      propDefinition: [
        scrapeninja,
        "extractorTargetIframe",
      ],
      optional: true,
    },
    headers: {
      propDefinition: [
        scrapeninja,
        "headers",
      ],
      optional: true,
    },
    retryNum: {
      propDefinition: [
        scrapeninja,
        "retryNum",
      ],
      optional: true,
    },
    geo: {
      propDefinition: [
        scrapeninja,
        "geo",
      ],
      optional: true,
    },
    proxy: {
      propDefinition: [
        scrapeninja,
        "proxy",
      ],
      optional: true,
    },
    timeout: {
      propDefinition: [
        scrapeninja,
        "timeout",
      ],
      optional: true,
    },
    textNotExpected: {
      propDefinition: [
        scrapeninja,
        "textNotExpected",
      ],
      optional: true,
    },
    statusNotExpected: {
      propDefinition: [
        scrapeninja,
        "statusNotExpected",
      ],
      optional: true,
    },
    blockImages: {
      propDefinition: [
        scrapeninja,
        "blockImages",
      ],
      optional: true,
    },
    blockMedia: {
      propDefinition: [
        scrapeninja,
        "blockMedia",
      ],
      optional: true,
    },
    screenshot: {
      propDefinition: [
        scrapeninja,
        "screenshot",
      ],
      optional: true,
    },
    catchAjaxHeadersUrlMask: {
      propDefinition: [
        scrapeninja,
        "catchAjaxHeadersUrlMask",
      ],
      optional: true,
    },
    viewportWidth: {
      propDefinition: [
        scrapeninja,
        "viewportWidth",
      ],
      optional: true,
    },
    viewportHeight: {
      propDefinition: [
        scrapeninja,
        "viewportHeight",
      ],
      optional: true,
    },
    viewportDeviceScaleFactor: {
      propDefinition: [
        scrapeninja,
        "viewportDeviceScaleFactor",
      ],
      optional: true,
    },
    viewportHasTouch: {
      propDefinition: [
        scrapeninja,
        "viewportHasTouch",
      ],
      optional: true,
    },
    viewportIsMobile: {
      propDefinition: [
        scrapeninja,
        "viewportIsMobile",
      ],
      optional: true,
    },
    viewportIsLandscape: {
      propDefinition: [
        scrapeninja,
        "viewportIsLandscape",
      ],
      optional: true,
    },
    extractor: {
      propDefinition: [
        scrapeninja,
        "extractor",
      ],
      optional: true,
    },
  },
  async run({ $ }) {
    try {
      const viewport = clearObj({
        width: this.viewportWidth,
        height: this.viewportHeight,
        deviceScaleFactor: this.viewportDeviceScaleFactor,
        hasTouch: this.viewportHasTouch,
        isMobile: this.viewportIsMobile,
        isLandscape: this.viewportIsLandscape,
      });

      const data = clearObj({
        url: this.url,
        waitForSelector: this.waitForSelector,
        postWaitTime: this.postWaitTime,
        dumpIframe: this.dumpIframe,
        waitForSelectorIframe: this.waitForSelectorIframe,
        extractorTargetIframe: this.extractorTargetIframe,
        headers: parseObject(this.headers),
        retryNum: this.retryNum,
        geo: this.geo,
        proxy: this.proxy,
        timeout: this.timeout,
        textNotExpected: parseObject(this.textNotExpected),
        statusNotExpected: parseObject(this.statusNotExpected),
        blockImages: this.blockImages,
        blockMedia: this.blockMedia,
        screenshot: this.screenshot,
        catchAjaxHeadersUrlMask: this.catchAjaxHeadersUrlMask,
        extractor: this.extractor,
      });

      if (Object.entries(viewport).length) {
        data.viewport = viewport;
      }

      const response = await this.scrapeninja.scrapeJs({
        $,
        data,
      });

      $.export("$summary", `Successfully scraped ${this.url} with JS rendering`);
      return response;
    } catch ({ response: { data } }) {
      throw new ConfigurationError(parseError(data));
    }
  },
};
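
The payload assembly in `run()` can be sketched in isolation. The `clearObj` helper lives in `common/utils.mjs`, which is not shown on this page, so `dropEmpty` below is a hypothetical stand-in that assumes it removes keys whose value is `undefined` or `null`. Only a few props are included to keep the sketch short.

```javascript
// Hypothetical stand-in for clearObj (its source is not shown above):
// assumed here to drop keys whose value is undefined or null.
function dropEmpty(obj) {
  return Object.fromEntries(
    Object.entries(obj).filter(([, value]) => value !== undefined && value !== null),
  );
}

// Assemble the request body the way run() does: optional props left blank
// are omitted, and viewport is attached only when it has at least one entry.
function buildScrapeJsPayload(props) {
  const viewport = dropEmpty({
    width: props.viewportWidth,
    height: props.viewportHeight,
    isMobile: props.viewportIsMobile,
  });
  const data = dropEmpty({
    url: props.url,
    waitForSelector: props.waitForSelector,
    timeout: props.timeout,
  });
  if (Object.entries(viewport).length) {
    data.viewport = viewport;
  }
  return data;
}

const minimal = buildScrapeJsPayload({ url: "https://example.com" });
const withViewport = buildScrapeJsPayload({
  url: "https://example.com",
  viewportWidth: 1280,
  viewportIsMobile: false,
});
console.log(JSON.stringify(minimal));
console.log(JSON.stringify(withViewport));
```

Under this assumption an explicit `false` is kept while blank props disappear, so the API only receives the options that were actually set.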

Action Configuration#

This component may be configured based on the props defined in the component code. Pipedream automatically prompts for input values in the UI.

Label	Prop	Type	Description
ScrapeNinja	`scrapeninja`	`app`	This component uses the ScrapeNinja app.
URL	`url`	`string`	The URL to scrape.
Wait For Selector	`waitForSelector`	`string`	CSS selector to wait for before considering the page loaded.
Post Wait Time	`postWaitTime`	`integer`	Wait the specified number of seconds after page load (from 1 to 12 s). Use this only if ScrapeNinja failed to wait for required page elements automatically.
Dump Iframe	`dumpIframe`	`string`	To dump a particular iframe, specify the value of its `name` HTML attribute here. The ScrapeNinja JS renderer will wait for the iframe's DOM elements to appear.
Wait For Selector Iframe	`waitForSelectorIframe`	`string`	If `Dump Iframe` is set, wait for this CSS selector inside the iframe.
Extractor Target Iframe	`extractorTargetIframe`	`boolean`	If `Dump Iframe` is set, run the JS extractor function against the iframe HTML instead of the base page body.
Headers	`headers`	`string[]`	Custom headers to send with the request. By default, regular Chrome browser headers are sent to the target URL.
Retry Number	`retryNum`	`integer`	Number of attempts.
Geo	`geo`	`string`	Geo location for basic proxy pools (you can purchase premium ScrapeNinja proxies for wider country selection and higher proxy quality). Read more about ScrapeNinja proxy setup
Proxy	`proxy`	`string`	Premium or your own proxy URL (overrides `Geo` prop). Read more about ScrapeNinja proxy setup
Timeout	`timeout`	`integer`	Timeout per attempt, in seconds. Each retry can take up to this many seconds.
Text Not Expected	`textNotExpected`	`string[]`	Text which will trigger a retry from another proxy address.
Status Not Expected	`statusNotExpected`	`integer[]`	HTTP response statuses which will trigger a retry from another proxy address.
Block Images	`blockImages`	`boolean`	Block images from loading. This will speed up page loading and reduce bandwidth usage.
Block Media	`blockMedia`	`boolean`	Block media resources (CSS, fonts) from loading. This will speed up page loading and reduce bandwidth usage.
Screenshot	`screenshot`	`boolean`	Take a screenshot of the page. Pass "false" to increase the speed of the request.
Catch Ajax Headers URL Mask	`catchAjaxHeadersUrlMask`	`string`	Use this to capture an XHR response by passing a URL mask. For example, to catch all requests to https://example.com/api/data.json, pass "api/data.json". The response will then include a new property `.info.catchedAjax` with the XHR response data: { url, method, headers[], body, status, responseHeaders{} }
Viewport Width	`viewportWidth`	`integer`	Width of the viewport.
Viewport Height	`viewportHeight`	`integer`	Height of the viewport.
Viewport Device Scale Factor	`viewportDeviceScaleFactor`	`integer`	Device scale factor for the viewport.
Viewport Has Touch	`viewportHasTouch`	`boolean`	Whether the viewport has touch capabilities.
Viewport Is Mobile	`viewportIsMobile`	`boolean`	Whether the viewport is mobile.
Viewport Is Landscape	`viewportIsLandscape`	`boolean`	Whether the viewport is in landscape mode.
Extractor	`extractor`	`string`	Custom JS function to extract JSON values from scraped HTML. Write and test your own extractor at https://scrapeninja.net/cheerio-sandbox/
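
As an illustration of how a few of these props fit together, the sketch below defines an extractor and a set of example prop values. The `extract(input, cheerio)` signature is an assumption based on ScrapeNinja's cheerio sandbox, so verify it there before relying on it. All prop values are made up, and the time estimate is only a rough upper bound that ignores network and queueing overhead.

```javascript
// Extractor sketch. The (input, cheerio) signature is an assumption;
// confirm it in the ScrapeNinja cheerio sandbox.
function extract(input, cheerio) {
  const $ = cheerio.load(input);
  return {
    title: $("h1").first().text().trim(),
  };
}

// The Extractor prop is a string, so pass the function's source text.
const extractorSource = extract.toString();

// Example prop values for the action (all values are illustrative).
const actionProps = {
  url: "https://example.com",
  waitForSelector: "h1",
  retryNum: 2,
  timeout: 10,
  blockImages: true,
  screenshot: false,
  extractor: extractorSource,
};

// Rough upper bound on request time: attempts × per-attempt timeout, in seconds.
const worstCaseSeconds = actionProps.retryNum * actionProps.timeout;
console.log(worstCaseSeconds); // 20
```

Keeping the extractor as a real function and converting it with `toString()` lets you test it locally before pasting the source into the Extractor field.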

Action Authentication#

ScrapeNinja uses API keys for authentication. When you connect your ScrapeNinja account, Pipedream securely stores the keys so you can easily authenticate to ScrapeNinja APIs in both code and no-code steps.

Using ScrapeNinja in Pipedream

1. Create a RapidAPI account: Begin by signing up for a RapidAPI account.
2. Access your API key:
- Once registered, you can interact with ScrapeNinja using your RapidAPI key.
- Open the ScrapeNinja documentation on RapidAPI and locate your API key, labeled X-RapidAPI-Key.
- Copy this key and paste it into the rapid_api_key field below.
3. Subscribe to the API: Finally, click Subscribe to Test in the RapidAPI console to subscribe to the ScrapeNinja API.
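
For reference, a direct call through RapidAPI would carry the key in a request header. The sketch below only builds the headers. The `X-RapidAPI-Key` name comes from the steps above, while the host value is an assumption that should be replaced with the exact value shown in the RapidAPI console.

```javascript
// Assumed host value; copy the exact one from the RapidAPI console.
const ASSUMED_RAPIDAPI_HOST = "scrapeninja.p.rapidapi.com";

// Build headers for a direct call to ScrapeNinja through RapidAPI.
function buildRapidApiHeaders(rapidApiKey, host = ASSUMED_RAPIDAPI_HOST) {
  if (!rapidApiKey) {
    throw new Error("A RapidAPI key is required");
  }
  return {
    "Content-Type": "application/json",
    "X-RapidAPI-Key": rapidApiKey,
    "X-RapidAPI-Host": host,
  };
}

// Placeholder key for illustration only.
const rapidApiHeaders = buildRapidApiHeaders("YOUR_RAPIDAPI_KEY");
console.log(Object.keys(rapidApiHeaders));
```

Inside Pipedream you do not need to build these headers yourself; the connected ScrapeNinja account supplies the key to the action.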

About ScrapeNinja#

Extract Web Data at Scale

More Ways to Connect ScrapeNinja + HTTP / Webhook#

Other Popular Integrations#

Scrape with JS Rendering with ScrapeNinja API on New Requests (Payload Only) from HTTP / Webhook API

Scrape without JS with ScrapeNinja API on New Requests (Payload Only) from HTTP / Webhook API

Scrape without JS with ScrapeNinja API on New Requests from HTTP / Webhook API

Scrape with JS Rendering with ScrapeNinja API on New event when the content of the URL changes from HTTP / Webhook API

Scrape without JS with ScrapeNinja API on New event when the content of the URL changes from HTTP / Webhook API

Popular Triggers#

New Requests from the HTTP / Webhook API

Get a URL and emit the full HTTP event on every request (including headers and query parameters). You can also configure the HTTP response code, body, and more.

New Requests (Payload Only) from the HTTP / Webhook API

Get a URL and emit the HTTP body as an event on every request

New event when the content of the URL changes from the HTTP / Webhook API

Emit new event when the content of the URL changes.
