Small percentage of workflow runs failing (502), but they succeed without a problem when rerun

I have a workflow that uses 512MB of memory with a timeout of 30 seconds. None of my workflows take longer than 10 seconds to run to completion, and none of them run out of memory. However, I am getting these random 502s on the last step of my workflow, which inserts a row into a PostgreSQL database via an API POST. When I rerun the action it inserts into the database without any issue, so why did it fail the first time?
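Since a rerun succeeds, the 502 looks transient, and a common workaround is to retry the POST with exponential backoff before letting the step fail. Here is a minimal sketch of that idea; the URL, payload, and retry counts are placeholders, and the real workflow's HTTP client may differ:

```python
# Hypothetical retry wrapper for the insert POST; retries only on 5xx.
import time
import urllib.error
import urllib.request

def should_retry(status, attempt, max_attempts=3):
    """Retry only on 5xx responses, and only while attempts remain."""
    return 500 <= status < 600 and attempt < max_attempts - 1

def backoff_seconds(attempt, base=1.0):
    """Exponential backoff: 1s, 2s, 4s, ..."""
    return base * (2 ** attempt)

def post_with_retry(url, body, max_attempts=3):
    """POST a JSON body, retrying transient 5xx errors with backoff."""
    for attempt in range(max_attempts):
        try:
            req = urllib.request.Request(
                url,
                data=body,
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.status
        except urllib.error.HTTPError as exc:
            if not should_retry(exc.code, attempt, max_attempts):
                raise
            time.sleep(backoff_seconds(attempt))
```

If the insert is not idempotent, a retry could double-write on a late success, so a unique key or upsert on the database side is worth pairing with this.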

Hi @dylburger, why is it that I cannot tag my own posts? I can only see 5 tags, the search brings up nothing, and I cannot create any.

Hi @slader

Are you hosting your own Postgres database on something like AWS or Google Cloud, or are you using a third-party service like Supabase?

I wonder if it’s possible you’re hitting a connection limit on your instance. How many events per minute is your workflow processing?

Hi Pierce, thanks for your response. Our instance is PostgreSQL on AWS RDS. Our Pipedream workflow is used in production, so it runs a few thousand times a day. The database we are writing to has been accepting multiple inserts per minute for a few years now, back when our workflow/ETL steps were handled in AWS Lambda, so the database instance is capable of the workload it's getting from Pipedream. I wonder if it could be a limit in Pipedream?

Hi @slader ,

Thanks for the context. The first place to check is within the CloudWatch area of your Postgres instance in AWS.

The specific metric to check, I believe, is DatabaseConnections. Make sure that the number of simultaneous connections doesn't exceed what's available for your instance type. You might see a spike in connections around the time you hit these 502 errors.
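For reference, pulling that metric programmatically looks roughly like this with boto3. The sketch below only builds the `get_metric_statistics` parameters (the instance identifier and time window are placeholders); the actual API call, which needs AWS credentials, is shown in a comment:

```python
# Hypothetical helper that builds the CloudWatch query for the RDS
# DatabaseConnections metric; "my-rds-instance" is a placeholder.
import datetime

def connection_metric_query(db_instance_id, start, end, period=60):
    """Parameters for CloudWatch get_metric_statistics on DatabaseConnections."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "DatabaseConnections",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id},
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": period,           # one datapoint per minute
        "Statistics": ["Maximum"],  # peak simultaneous connections per period
    }

# Usage (requires AWS credentials; not run here):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   end = datetime.datetime.utcnow()
#   start = end - datetime.timedelta(hours=1)
#   resp = cw.get_metric_statistics(
#       **connection_metric_query("my-rds-instance", start, end))
#   for p in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
#       print(p["Timestamp"], p["Maximum"])
```

Comparing the `Maximum` datapoints around the timestamps of the failed runs against the instance's connection limit should confirm or rule out this theory.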

Are you using a Lambda to expose a REST API endpoint to insert rows into your database?

If so, the error logs there might help diagnose the issue as well.