How to Centralize and Manage Shared Data in a Pipedream Workflow?

This topic was automatically generated from Slack. You can find the original thread here.

I’m building a Pipedream workflow where each step processes data and passes it to the next.

Example:

  1. Step 1: Remove headlines from the data.
  2. Step 2: Process the updated data from Step 1.
  3. Step 3: Work on the data updated in Step 2.
  4. And so on.

BUT I would like to centralize this data so I can:

  • Modify the step order easily.
  • Access/update shared data independently.

Is there a built-in way to manage shared data in a workflow (that is, not data stores)?

Thanks! :blush:

Hi, I’ll pass this question to our bot Pi to see if it helps

To manage shared data in a Pipedream workflow, you can use Pipedream’s built-in key-value data stores. These data stores allow you to store and retrieve data across workflow executions and different workflows, making it easier to centralize and manage shared data. Here’s how you can use data stores in your workflow:

  1. Store data: You can store data in a data store at any point in your workflow. This allows you to save the output of one step and access it in another step, even if the steps are reordered.
  2. Retrieve data: You can retrieve data from the data store whenever needed. This makes it easy to access shared data across different steps or workflows.
  3. Modify step order: Since the data is centralized in the data store, you can easily modify the order of steps without worrying about data dependencies between them.

Here’s an example of how you might use a data store in a Node.js code step:

```
export default defineComponent({
  props: {
    data: { type: "data_store" },
  },
  async run({ steps, $ }) {
    // Store data
    await this.data.set("step1Output", "Processed data from Step 1");

    // Retrieve data
    const step1Data = await this.data.get("step1Output");
    console.log(step1Data);

    // Modify data
    await this.data.set("step1Output", "Updated data from Step 1");
  },
});
```
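Since your goal is passing data between steps, the same pattern splits across steps: each step that touches the shared data attaches the same data store as a prop, reads the value, and writes the updated version back. A minimal sketch of one such middle step, reusing the step1Output key from the example above:

```
export default defineComponent({
  props: { data: { type: "data_store" } },
  async run({ steps, $ }) {
    // Read the shared value written by an earlier step
    const shared = await this.data.get("step1Output");
    // Write the updated version back for whichever step runs next
    await this.data.set("step1Output", `${shared} (updated by this step)`);
  },
});
```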

I don’t want to use data stores

I think the most suitable built-in feature for your use case is the data store

It makes no sense to store data there for a single run → it just fills up my data store

Yep, then I guess there’s no other built-in solution better than the data store

> It makes no sense to store data there for a single run → it just fills up my data store

You can just remove the record from the data store as the last step in your workflow
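For example, a minimal cleanup step, assuming the step1Output key from the earlier example:

```
export default defineComponent({
  props: { data: { type: "data_store" } },
  async run({ steps, $ }) {
    // Delete the temporary record so it doesn't accumulate across runs
    await this.data.delete("step1Output");
  },
});
```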

Fair enough → I will try :slightly_smiling_face:

But this should be a common use case. Maybe a feature request for the future

And then I’d have to save it in each step

The credits and time will increase many times over

Just to clarify, using multiple steps will not significantly affect the processing time (or credits), as Pipedream runs your whole workflow as a single computation unit.

But I need to load + update the data store in each step, right?

Yeah, but it’s not guaranteed that the credits (and time) will increase multiple times.

On another note, you could just structure your workflow in a way that doesn’t need a data store at all. Maybe that will solve your whole problem from the beginning.

You could just use the /tmp folder.

Just read/write a /tmp/data.json file there in each step.
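For example, a minimal sketch of one such step, assuming an earlier step has already written /tmp/data.json:

```
import fs from "fs";

export default defineComponent({
  async run({ steps, $ }) {
    const path = "/tmp/data.json";
    // Load the shared data written by an earlier step
    const data = JSON.parse(fs.readFileSync(path, "utf8"));
    // ...modify `data` however this step needs...
    // Write it back for the next step to pick up
    fs.writeFileSync(path, JSON.stringify(data));
    return data;
  },
});
```

Since /tmp is shared across steps within a single execution, this works like a scratch file without filling up a data store, and the steps can be reordered freely as long as each one reads and writes the same file.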

But I can see how a “shared memory” feature could be useful to have across all of the steps.