This topic was automatically generated from Slack. You can find the original thread here.
Hey folks, I’m wondering whether I can use Pipedream for ETL/ELT (Extract, Transform, Load) jobs. What are the pros and cons, and is it a good approach? Or should I move to a dedicated solution such as AWS Glue, Azure Data Factory, etc.?
Hi, it really depends on your use case. Pipedream limits execution time to 10 minutes for the entire workflow, so a long-running process won’t fit in a single run.
One feature that’s useful is the ability to suspend & resume workflows, which allows Pipedream to orchestrate processes that last much longer than 10 minutes.
So basically a simple example workflow could be:
• Extract some data from an API & upload to S3
• Run a job in Databricks to process the data. Wait for callback to continue.
• Run a job in Snowflake to do something else with the data. Wait for callback to continue.
• Trigger a refresh of the data in Tableau/Power BI.
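To make the flow above a bit more concrete, here is a rough Python sketch of the suspend-and-resume pattern. It is only a simulation and does not call Pipedream's API: `Step`, `Workflow`, `build_example`, the step names, and the S3 key are all made up for illustration, and the lambdas stand in for the real calls to S3, Databricks, Snowflake, and Tableau/Power BI. The idea is that the workflow runs until it hits a step that waits for a callback, hands out a resume token, and picks up from the next step when the external job calls back with that token.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One unit of work (hypothetical type, not a Pipedream construct)."""
    name: str
    run: Callable[[Dict], None]        # starts the work and records state in the context
    waits_for_callback: bool = False   # True if an external job must call back to continue


@dataclass
class Workflow:
    steps: List[Step]
    context: Dict = field(default_factory=dict)
    position: int = 0
    resume_token: Optional[str] = None
    log: List[str] = field(default_factory=list)

    def advance(self) -> Optional[str]:
        """Run steps until one suspends or the workflow finishes.

        Returns a resume token while suspended, or None once every step has run.
        """
        while self.position < len(self.steps):
            step = self.steps[self.position]
            step.run(self.context)
            self.log.append(step.name)
            self.position += 1
            if step.waits_for_callback:
                # Suspend: nothing runs until the external job calls back.
                self.resume_token = uuid.uuid4().hex
                return self.resume_token
        self.resume_token = None
        return None

    def resume(self, token: str, payload: Dict) -> Optional[str]:
        """Handle a callback: merge its payload and continue with the next step."""
        if self.resume_token is None or token != self.resume_token:
            raise ValueError("unknown or stale resume token")
        self.context.update(payload)
        return self.advance()


def build_example() -> Workflow:
    # Placeholder steps mirroring the four bullets above.
    return Workflow(steps=[
        Step("extract_to_s3", lambda ctx: ctx.update(s3_key="raw/data.json")),
        Step("databricks_job", lambda ctx: ctx.update(databricks="submitted"),
             waits_for_callback=True),
        Step("snowflake_job", lambda ctx: ctx.update(snowflake="submitted"),
             waits_for_callback=True),
        Step("refresh_dashboard", lambda ctx: ctx.update(refreshed=True)),
    ])
```

Usage would look something like `wf = build_example(); token = wf.advance()`, then `wf.resume(token, {"databricks": "done"})` when the Databricks job calls back, and the same again for Snowflake. In a real setup the token would typically be a resume URL that the external job requests when it finishes.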