Why is Python Code Failing in Production but Succeeding in Manual Tests, and How Can It Be Debugged?

I'm running into a number of event failures on the Python code step with the error:

Error: Could not execute step
    at Object.execute (file:///var/task/nano_worker.mjs:166:28)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Runner.processCell (file:///var/task/lambda_handler.mjs:773:16)
    at async Runner.runUserCode (file:///var/task/lambda_handler.mjs:914:9)
    at async Runner.run (file:///var/task/lambda_handler.mjs:745:5)
    at async Runtime.handler (file:///var/task/lambda_handler.mjs:967:22)

in production. However, when I rebuild from the event and manually test that step, it always succeeds. The execution time is also well within the limits I set. Any idea what’s going on, and how I can debug it if it’s something I can fix?

Hi, could you please go to your workflow editor, make a trivial change to each Python step (e.g. add a comment), then make a new deploy and see if the error is resolved?
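
For illustration, here is a rough sketch of what such a trivial change could look like. The function name `handler`, its `event` argument, and the step body are hypothetical placeholders, not the platform's actual step signature, so keep whatever entry point your step already uses. The try/except is an optional extra that is not part of the suggestion above: it only helps if the failure is raised inside the step's own code, which the stack trace does not confirm.

```python
import traceback

# Trivial, behavior-neutral edit: a new comment like this one is the only
# change the suggestion above asks for before redeploying.


def handler(event):
    # Hypothetical step body; replace with the step's real logic.
    try:
        return {"ok": True, "keys": sorted(event.keys())}
    except Exception:
        # Print the full traceback so it appears in the step's logs,
        # then re-raise so the step still fails visibly.
        traceback.print_exc()
        raise
```

If the error persists after the redeploy and nothing is printed by the except branch, the failure is probably happening outside the step's own Python code.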