How to Centralize and Manage Shared Data in a Pipedream Workflow?

This topic was automatically generated from Slack. You can find the original thread here.

I’m building a Pipedream workflow where each step processes data and passes it to the next.

Example:

  1. Step 1: Remove headlines from the data.
  2. Step 2: Process the updated data from Step 1.
  3. Step 3: Work on the data updated in Step 2.
  4. And so on.

BUT I would like to centralize this data so I can:

  • Modify the step order easily.
  • Access/update shared data independently.

Is there a built-in way to manage shared data in a workflow (that is, not data stores)?

Thanks! :blush:

Hi, I’ll pass this question to our bot Pi to see if it helps

To manage shared data in a Pipedream workflow, you can use Pipedream’s built-in key-value data stores. These data stores allow you to store and retrieve data across workflow executions and different workflows, making it easier to centralize and manage shared data. Here’s how you can use data stores in your workflow:

  1. Store data: You can store data in a data store at any point in your workflow. This allows you to save the output of one step and access it in another step, even if the steps are reordered.
  2. Retrieve data: You can retrieve data from the data store whenever needed. This makes it easy to access shared data across different steps or workflows.
  3. Modify step order: Since the data is centralized in the data store, you can easily modify the order of steps without worrying about data dependencies between them.

Here’s an example of how you might use a data store in a Node.js code step:

```
export default defineComponent({
  props: {
    data: { type: "data_store" },
  },
  async run({ steps, $ }) {
    // Store data
    await this.data.set("step1Output", "Processed data from Step 1");

    // Retrieve data
    const step1Data = await this.data.get("step1Output");
    console.log(step1Data);

    // Modify data
    await this.data.set("step1Output", "Updated data from Step 1");
  },
});
```
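Since your goal is passing data between steps, the same pattern splits across steps: each step that touches the shared data attaches the same data store as a prop, reads the value, and writes the updated version back. A minimal sketch of one such middle step, reusing the step1Output key from the example above:

```
export default defineComponent({
  props: { data: { type: "data_store" } },
  async run({ steps, $ }) {
    // Read the shared value written by an earlier step
    const shared = await this.data.get("step1Output");
    // Write the updated version back for whichever step runs next
    await this.data.set("step1Output", `${shared} (updated by this step)`);
  },
});
```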

I don’t want to use data stores

I think the most suitable built-in feature for your use case is the data store

It makes no sense to store data there for a single run → it just fills up my data store

Yep, then I guess there’s no other built-in solution better than the data store

> It makes no sense to store data there for a single run → it just fills up my data store

You can just remove the record from the data store as the last step in your workflow
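For example, a minimal cleanup step, assuming the step1Output key from the earlier example:

```
export default defineComponent({
  props: { data: { type: "data_store" } },
  async run({ steps, $ }) {
    // Delete the temporary record so it doesn't accumulate across runs
    await this.data.delete("step1Output");
  },
});
```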

Fair enough → I will try :slightly_smiling_face:

But this should be a common use case. Maybe a feature request for the future

And then I’d have to save it in each step

The credits and time will increase many times over

Just to clarify, using multiple steps will not significantly affect the processing time (or credits), as Pipedream runs your whole workflow as a single computation unit.

But I need to load + update the data store in each step, right?

Yeah, but it’s not guaranteed that the credits (and time) will increase multiple times.

On another note, you could just structure your workflow in a way that doesn’t need a data store at all. Maybe that will solve your whole problem from the beginning.

You could just use the /tmp folder.

Just read/write a /tmp/data.json file there in each step.
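For example, a minimal sketch of one such step, assuming an earlier step has already written /tmp/data.json:

```
import fs from "fs";

export default defineComponent({
  async run({ steps, $ }) {
    const path = "/tmp/data.json";
    // Load the shared data written by an earlier step
    const data = JSON.parse(fs.readFileSync(path, "utf8"));
    // ...modify `data` however this step needs...
    // Write it back for the next step to pick up
    fs.writeFileSync(path, JSON.stringify(data));
    return data;
  },
});
```

Since /tmp is shared across steps within a single execution, this works like a scratch file without filling up a data store, and the steps can be reordered freely as long as each one reads and writes the same file.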

But I can see how a “shared memory” feature could be useful to have across all of the steps.