Is it Reasonable to Set the Workflow Memory to 25-30% More than the Maximum Memory Usage Identified from Monitoring?

This topic was automatically generated from Slack. You can find the original thread here.

I have built memory usage monitoring into my Node.js and Python steps, trying to identify a reasonable max amount of memory to set. I’m basically checking the memory usage a few times during the step, hanging onto the highest values for external, heap used, and rss, and then recording those to my telemetry table. I can’t do this with the Snowflake Execute SQL Query actions, of course. Pi told me that if I set the workflow memory to 25-30% more than that it should be sufficient. Is that a reasonable method? In my last test, the highest total I got was 1534MB, so would 2048 be reasonable (that’s actually just over 33% buffer)?
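(For reference, here is a minimal sketch of that kind of sampling in a Pipedream Node.js step. The sampleMemory helper, the peak object, and where the samples are taken are placeholders for illustration, not anything Pipedream provides.)

export default defineComponent({
  async run({ steps, $ }) {
    const peak = { rss: 0, heapUsed: 0, external: 0 }

    // Keep the highest values seen so far; process.memoryUsage() reports bytes
    const sampleMemory = () => {
      const m = process.memoryUsage()
      peak.rss = Math.max(peak.rss, m.rss)
      peak.heapUsed = Math.max(peak.heapUsed, m.heapUsed)
      peak.external = Math.max(peak.external, m.external)
    }

    sampleMemory()
    // ... first chunk of the step's real work ...
    sampleMemory()
    // ... rest of the work ...
    sampleMemory()

    // Convert to MB before recording to the telemetry table
    const toMB = (bytes) => Math.round(bytes / 1048576)
    const result = {
      rssMB: toMB(peak.rss),
      heapUsedMB: toMB(peak.heapUsed),
      externalMB: toMB(peak.external),
    }
    console.log(result)
    return result
  },
})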

You’re way ahead of most people already.

All I’ve heard here is “if your workflow runs out of memory, increase it”.

You could always add an extra step after the final one just to output the final numbers (that would show you roughly how much memory the “Snowflake Execute SQL Query” step is using).

It doesn’t seem to go up over the course of the workflow, unless I’ve done something wrong. I have two JS steps (one gets the data from the external system, the next converts the data to CSV(s) and writes it to /tmp), followed by a Python step that uploads the file to Snowflake using the Snowflake Python connector and the PUT command. The peak usage for those steps was 1296MB, 1727MB, and 572MB. I suppose I could add another step just to look at memory at the very end and log it to the console, but it seems as if that’s going to be pretty low.
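(Roughly, the CSV step might look like the sketch below. The fetch_data step name is a placeholder for whatever the previous step is actually called, and a real step would use a proper CSV library rather than this naive conversion.)

import fs from "fs"
import path from "path"

export default defineComponent({
  async run({ steps, $ }) {
    // Placeholder: rows returned by the previous step that fetched the external data
    const rows = steps.fetch_data.$return_value ?? []

    // Naive CSV conversion; real code should escape commas, quotes, and newlines
    const header = Object.keys(rows[0] ?? {})
    const lines = [header.join(",")].concat(
      rows.map((r) => header.map((k) => String(r[k] ?? "")).join(","))
    )

    const filePath = path.join("/tmp", "export.csv")
    fs.writeFileSync(filePath, lines.join("\n"))

    // Log memory after the heavy work so the peak shows up in the step's logs
    console.log(process.memoryUsage())
    return filePath
  },
})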

Hmmm, interesting.

I don’t actually know how memory management works since there is no data available out of the box.

I always assumed it goes up throughout the workflow, but it’s interesting that it can also go down… :thinking_face:

In general, more steps = more memory usage… so I’m curious how that relates to your findings. :face_with_monocle:

If memory can go down during the workflow, then adding more steps should not matter…

Unless there are two levels of memory: each step has its own, but the workflow as a container of steps also has an overall memory for the entire thing.

And maybe even shared memory for multiple consecutive runs, just like Lambda (which is the underlying engine).

Would be nice to get more info from Pipedream, but they have never been very helpful/transparent in this area.

Yeah, I ran this as the final step. When I tested it by itself, it came to 66MB. Interesting… when I ran it at the end of the full workflow instead of testing it on its own, it came to 1673MB.

export default defineComponent({
  async run({ steps, $ }) {
    // Log the current memory usage (values are in bytes) so it shows up in the step's logs
    console.log(process.memoryUsage())
    return steps.trigger.event
  },
})

Yeah, I think the builder handles memory differently than deployed workflows.

It seems to flush/reset itself often, because we often lose files in /tmp while testing (but never for deployed workflows).

True.

I need to see how it does when it is running on its own.

It would certainly be helpful if we could still see all the telemetry up to the point where the memory ran out, so we could at least see which step it was on…

Yeah, right now there’s so much guesswork involved. :disappointed:
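Going back to the point about seeing where memory ran out: one partial workaround (just a sketch, not something Pipedream provides out of the box) is to log a memory checkpoint at the top of every Node.js step, so if a run is killed for exceeding its limit, the last checkpoint that made it into the logs at least narrows down which step it died in. The logMemoryCheckpoint helper and the "csv_step" name below are hypothetical.

// Hypothetical helper, pasted into each Node.js step (the step name is just a label)
const logMemoryCheckpoint = (stepName) => {
  const m = process.memoryUsage()
  const mb = (bytes) => Math.round(bytes / 1048576)
  console.log(`[memory] ${stepName}: rss=${mb(m.rss)}MB heapUsed=${mb(m.heapUsed)}MB external=${mb(m.external)}MB`)
}

export default defineComponent({
  async run({ steps, $ }) {
    logMemoryCheckpoint("csv_step")
    // ... the step's actual work goes here ...
    return steps.trigger.event
  },
})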