General questions on concurrency / throttling

This topic was automatically generated from Slack. You can find the original thread here.

Daniel Chua : Hello – I have an Airtable cron trigger that checks in every 30 mins and provides record updates as emitted events. These events then trigger a workflow.
I seem to be facing some issues with concurrency and throttling in making this work. What I’m trying to ensure is that when, say, 500 rows change in that 30-minute window, the workflow is triggered 500 times. To do that, would I be correct in setting Throttle to Unlimited and Limit Concurrency to, say, 50? I’m on the Professional plan, so I should have a 10,000 queue limit.
Separately, am I correct that the 10,000 queue limit applies to the number of emitted events – i.e. the number of Airtable rows created/updated/deleted in each 30-min cron cycle? It does not apply to the “batches of 50” concurrent events being processed, right?
Thanks in advance!

Dylan Sather (Pipedream) : just to confirm, are you trying to set a concurrency limit because you’re dealing with API rate limits in a downstream service?

Since you’re on the Pro plan, you can increase any workflow’s queue size to 10,000, but keep in mind that the default is still 100. You can increase the queue size in your workflow’s Settings tab.

Each emitted event from an event source counts towards that limit, so in your example, 500 events would get added to the queue and processed according to the rules you’ve set in the Throttling / Concurrency section. In your case, a max of 50 workers will process those 500 events at any given time. As soon as a worker (a workflow processing a single event) finishes execution, we’ll pop another event off the queue and start processing it.
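To make that queue/worker model concrete, here’s a minimal TypeScript sketch (illustrative only, not Pipedream’s actual implementation) of 50 workers draining a shared queue of 500 events, each worker popping the next event as soon as it finishes its current one:

```typescript
// Illustrative sketch of the queue/worker model described above.

type Event = { id: number };

async function processEvent(event: Event): Promise<void> {
  // Stand-in for one workflow execution handling a single event.
  await new Promise((resolve) => setTimeout(resolve, 500));
  console.log(`processed event ${event.id}`);
}

async function drainQueue(queue: Event[], concurrency: number): Promise<void> {
  // Spawn `concurrency` workers; each worker pops the next event off the
  // shared queue as soon as it finishes its current one.
  const workers = Array.from({ length: concurrency }, async () => {
    for (let event = queue.shift(); event; event = queue.shift()) {
      await processEvent(event);
    }
  });
  await Promise.all(workers);
}

// 500 queued events, at most 50 in flight at any given time.
const queue = Array.from({ length: 500 }, (_, i) => ({ id: i }));
drainQueue(queue, 50);
```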

If you’re working with a downstream rate limit of e.g. 50 requests per second, I’d recommend setting the Throttling accordingly (50 events per second). A concurrency limit of 50 just means that we’ll process at most 50 events at any given time (if each event took only 0.5 seconds to process, we’d process roughly 100 events per second). Throttling affords slightly better control when you’re trying to process events at a particular rate, if that makes sense.
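To put numbers on that: a concurrency limit of 50 with 0.5 s per event works out to roughly 50 / 0.5 = 100 events started per second, whereas a throttle of 50 per second caps the start rate directly, no matter how quickly events finish. A minimal sketch of that kind of window-based throttling (again illustrative, not Pipedream’s internals):

```typescript
// Illustrative throttle: start at most `ratePerSecond` events per second,
// independent of how long each event takes (no concurrency cap at all).
async function throttleStarts<T>(
  events: T[],
  ratePerSecond: number,
  handle: (event: T) => Promise<void>,
): Promise<void> {
  const inFlight: Promise<void>[] = [];
  for (let i = 0; i < events.length; i += ratePerSecond) {
    for (const event of events.slice(i, i + ratePerSecond)) {
      inFlight.push(handle(event)); // start this window's batch immediately
    }
    if (i + ratePerSecond < events.length) {
      // Wait out the one-second window before starting the next batch.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
  await Promise.all(inFlight);
}
```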

Daniel Chua : I see, thanks for that in-depth explanation! But for throttling, is the limit only 100 events in the queue at any one time, even on the Professional plan? What seems to be happening is that when 500 events are dropped into the queue, only the first 100 are eventually processed and the rest are lost.

Dylan Sather (Pipedream) : What is the queue size in your workflow’s Settings set to? Did you try increasing that?

Daniel Chua : I am only able to set a queue size under Limit concurrency. Under Throttle workflow execution, there is no queue size option.

Am I supposed to turn on both the Limit concurrency and Throttle workflow execution options? That is to say, does the Queue Size option under Limit concurrency actually affect the Throttle workflow execution settings too?

Dylan Sather (Pipedream) : I understand why the indentation here might have been confusing, but the queue size applies to all events and isn’t coupled to the concurrency section. Setting it to 10,000 will let you queue up to 10,000 events, regardless of whether concurrency limiting is enabled.

Daniel Chua : Thanks for the clarification!

Apologies for resurrecting this topic – just looking for some additional clarity.

So, if I have a workflow that pings a downstream API that limits me to 100 calls every 10 seconds, is this the configuration I want?

[screenshot of the workflow’s throttling and concurrency settings]

And it would not matter if the workflow made one API call or two per execution – these settings would force Pipedream to run at most 10 executions per second, pushing additional events into the queue?

I just want to be sure that two different workflows – one that hits the API once, and a second that hits the API four times, say – are both correctly throttled.

Hi @shawn-wm

That’s correct. You’ve throttled the workflow’s execution to at most 100 events every 10 seconds, and you’ve additionally limited the number of concurrent workflow executions to one.

The events queue will still accept events, but the actual processing of them has been limited by these settings.

If the rate limiting isn’t at a per-second granularity, you could remove the concurrency limitation and the workflow would still only process 10 events per second.

With a burst of traffic, that would allow up to 10 instances of your workflow to spawn and execute at once, but never 11.

I hope this helps!
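One nuance worth spelling out here: the throttle counts workflow executions, not downstream API calls, so a workflow that makes four calls per execution burns through the API budget four times as fast. A hypothetical TypeScript helper (not a Pipedream API) to derive a safe execution throttle from the API’s budget:

```typescript
// Hypothetical helper (not a Pipedream API): derive a safe per-second
// execution throttle from the downstream API's budget and the number of
// API calls a single execution of the workflow makes.
function maxExecutionsPerSecond(
  callsAllowed: number,      // e.g. 100 calls...
  windowSeconds: number,     // ...every 10 seconds
  callsPerExecution: number, // API calls made by one workflow execution
): number {
  return Math.floor(callsAllowed / windowSeconds / callsPerExecution);
}

// 100 calls / 10 s, one call per execution:  throttle to 10 executions/s.
console.log(maxExecutionsPerSecond(100, 10, 1)); // 10
// Same budget, four calls per execution:     throttle to 2 executions/s.
console.log(maxExecutionsPerSecond(100, 10, 4)); // 2
```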

It does

The challenge now is that individual workflows can be limited to concurrent executions, loading events into a queue when the threshold is breached…

…but if I have different workflows that all ping the same rate-limited API, I don’t really have a good way of treating them all in one global queue.

I’m imagining a “folder”-like structure in the Pipedream interface, in which I store (let’s say) 12 workflows. At the folder level, I’d see a setting similar to the screenshot in the earlier post that groups all 12 workflows into a single concurrency queue, telling Pipedream: “the rate limiting set at this folder applies to all the workflows in this folder, so if WF01 and WF02 are both called, WF02 waits until WF01 finishes (or its individual actions satisfy the queue).”
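(Editorial note: a folder-level setting like this isn’t described anywhere in the thread. One common workaround is to have every workflow that calls the shared API first claim a slot from a counter in an external store. A minimal sketch, assuming a Redis instance reachable from all 12 workflows and the ioredis client – any store with atomic increments would do:)

```typescript
// Hedged sketch of a cross-workflow ("global") rate limit via a shared
// counter in Redis. Assumes the ioredis client and a REDIS_URL env var;
// neither is part of the thread above.
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

// Fixed-window limiter: at most `limit` calls per `windowSeconds`, shared
// by every workflow that uses the same `key`.
async function claimSlot(
  key: string,
  limit: number,
  windowSeconds: number,
): Promise<boolean> {
  // Bucket all calls in the current window under one counter key.
  const window = Math.floor(Date.now() / (windowSeconds * 1000));
  const windowKey = `${key}:${window}`;
  const count = await redis.incr(windowKey); // atomic across workflows
  if (count === 1) {
    await redis.expire(windowKey, windowSeconds * 2); // clean up old windows
  }
  return count <= limit;
}

// In each workflow, before hitting the shared API:
// if (!(await claimSlot("shared-api", 100, 10))) { /* retry or re-queue */ }
```

A workflow that fails to claim a slot can retry after a delay or re-emit the event, which roughly approximates the global queue described above.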