How to Optimize Heavy Workflow Strategy with Large Data Sources and API Limitations?

This topic was automatically generated from Slack. You can find the original thread here.

Hello! I need advice on a strategy for processing a heavy workflow. For context, I’m trying to build a workflow that returns summary data based on a set of item IDs passed into it. The workflow then fetches the data from a source (ClickUp) and processes it before returning the output.

A few challenges:
• Each item ID passed as input needs additional info fetched from the sources (e.g. linked by Property ID)
• There are 2 sources I need to fetch from, e.g. Item & Property
• The source APIs can’t filter by ID, so I need to fetch the entire source database, which can be huge (>10,000 records)
• From there, the workflow finds the matching data from the sources and fills it into the items.
• Returning the items as output takes quite long on the first run (77s).

Some workarounds I have tried and am using now:
• Use a Data Store to cache the output, in case the same request comes in again within a short time
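
The caching workaround above could be sketched roughly as follows. This is only an illustration under stated assumptions: a plain dict stands in for the Data Store, and the names `cache_key`, `get_cached`, `put_cached` and the 5-minute TTL are hypothetical, not anything taken from the workflow itself. The one design point is that the key is built from the sorted, de-duplicated item IDs, so the same set of IDs hits the same entry regardless of order.

```python
import hashlib
import json
import time

# Assumed freshness window; tune to how stale a summary is allowed to be.
CACHE_TTL_SECONDS = 300


def cache_key(item_ids):
    """Build a stable key: same set of IDs gives the same key, in any order."""
    canonical = json.dumps(sorted(set(item_ids)))
    return "summary:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def get_cached(store, item_ids, now=None):
    """Return the cached summary, or None if it is missing or older than the TTL."""
    now = time.time() if now is None else now
    entry = store.get(cache_key(item_ids))
    if entry is None or now - entry["saved_at"] > CACHE_TTL_SECONDS:
        return None
    return entry["summary"]


def put_cached(store, item_ids, summary, now=None):
    """Save the summary together with the time it was stored."""
    now = time.time() if now is None else now
    store[cache_key(item_ids)] = {"saved_at": now, "summary": summary}
```

Here `store` is any dict-like object; in a real workflow it would be replaced by whatever key-value storage the platform provides.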

Workflow:
• Trigger (HTTP) - receives the item ID input
• Data Store - checks for a valid cache entry; if one exists, returns it and ends the workflow
• Get ClickUp List Data (1) - for the given item ID input
• Get ClickUp List Data (2) - the entire Property data source
• Combine & Respond - fills (1) with additional info from (2), then responds (caching the result to the Data Store at the same time)
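
For the Combine step, one common approach (not necessarily what the workflow does today) is to index the Property records by ID once and then look each item up in that index, instead of searching the full list for every item. The sketch below assumes each record is a dict with an `id` field and that items carry a `property_id` link field; both field names and the `fill_items` helper are hypothetical.

```python
def fill_items(items, properties, link_field="property_id"):
    """Attach each item's linked property record using a dict index.

    Building the index is one pass over the properties; each item lookup
    is then a constant-time dict access rather than a scan of the list.
    """
    property_by_id = {prop["id"]: prop for prop in properties}
    filled = []
    for item in items:
        linked = property_by_id.get(item.get(link_field))
        # "property" is None when the item has no link or no match was found.
        filled.append({**item, "property": linked})
    return filled
```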

I’m a bit worried about how well this will scale; the source records will keep growing from here, probably to >20,000 records soon. As it is, even the workflow editor slows to a crawl when debugging and crashes the browser tab (probably running out of memory).
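
One way to keep memory bounded as the source grows is to process it page by page, keeping only the records whose IDs are actually needed and stopping once all of them are found. The sketch below is a rough illustration, not a ClickUp client: `fetch_page` is a hypothetical callable that returns the list of record dicts for a given page number and an empty list once the source is exhausted, and `collect_wanted` is a made-up helper name.

```python
def collect_wanted(fetch_page, wanted_ids, max_pages=1000):
    """Page through a source, keeping only records whose ID is wanted.

    Returns (found, remaining): a dict of the matching records by ID and
    the set of wanted IDs that were never seen.
    """
    remaining = set(wanted_ids)
    found = {}
    for page in range(max_pages):
        if not remaining:
            break  # everything located; skip the rest of the source
        records = fetch_page(page)
        if not records:
            break  # source exhausted
        for record in records:
            if record["id"] in remaining:
                found[record["id"]] = record
                remaining.discard(record["id"])
    return found, remaining
```

Only the matching records are held at any time, so a step like this would export a few records to the next step rather than the whole source.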

I appreciate any advice on how to approach this :pray:

Hi, I see that the ClickUp Get Tasks API has a Custom Task IDs parameter, which Pipedream could add to our existing ClickUp Get Tasks action to help you. May I ask if all of your tasks have a Custom ID?

Nope, we don’t use custom task IDs; the field is usually null.


Then I would suggest adding a custom field to your tasks so that you can query them effectively. The custom field could hold the same value as the id field.

This is a limitation of ClickUp, which Pipedream has no control over.

Yup, I understand the ClickUp limitation, hence the question here. OK, I will look into the custom field and see if I can pull the task ID into it using a formula.

The custom_task_ids parameter is just a boolean, so it can’t help much, but I found an undocumented way to filter by task IDs in ClickUp, in case anyone else is looking into this as well. :expressionless: