This topic was automatically generated from Slack. You can find the original thread here.
Hi - I’m trying out Data Store, using Python. I recognize it’s in preview, so I’m not sure what’s known or expected, but I thought I’d provide feedback nonetheless.
I first read the overall Data Store documentation. At the bottom of that entry, it indicated Data Store was only available in Node.js. Data Stores
However, when I clicked through to Code > Python, I found a “how to use Data Stores” topic. I was delighted! Using Data Stores. All the examples are there for Python, and I thought you did a nice job outlining things. But again, the bottom of the page indicated it was only available in Node.js. So I was confused, since all the documentation above that statement sure looked like the feature existed.
So I went ahead and coded up my example, and after fixing some errors of mine, the code block in my flow ran fine. So it’s obviously available in Python. I got confused, though. I went to the data store screen and saw what I attached. I thought maybe I would see some data or an entry in the window that expands under the data store. I clicked around and ran some more code. Head scratching. After some time I clicked on the actual name of the data store and then saw the data that had been recorded. So this wasn’t obvious, to me at least, at first. I’m not sure what the window under the data store is for.
Finally, I thought maybe I would “return” the data store. Though it isn’t needed for another step (it’s a data store), I wanted to at least see the output for debugging. Doing `return data_store` didn’t actually create an Export entry. I also tried to simply print it to the log with `print(data_store)`, and that creates an object entry like `<pipedream.pipedream.Store object at 0x7ff163aeaf10>`. So, not a nicely formatted view of what was stored. Perhaps your documentation could offer an example of how to get some output in the log or return.
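Since the `Store` object’s repr isn’t readable, one workaround is to copy the keys you care about into a plain dict and print or return that. This is only a sketch: the `StoreStub` class below is a dict-backed stand-in for the real store (which supports `[]` reads and writes in Python steps), and `snapshot` and the key names are hypothetical.

```python
# Sketch only: StoreStub is a dict-backed stand-in for Pipedream's data
# store, which behaves like a key/value mapping in Python code steps.
class StoreStub:
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value


def snapshot(data_store, keys):
    """Copy the named keys out of the store into a plain dict so the
    result is JSON-serializable and prints (or exports) readably."""
    return {k: data_store[k] for k in keys}


store = StoreStub()
store["request"] = {"ackEmailEnable": True}
print(snapshot(store, ["request"]))  # {'request': {'ackEmailEnable': True}}
```

In a real step you would call `snapshot(pd.inputs["data_store"], [...])` and `return` the result to see it in the step’s exports.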
So far I’m glad I can use Data Store to create more cohesive storage for my workflow. Thank you for working on this feature; hopefully this feedback is helpful in furthering your design. My particular use case right now is data storage for use during the workflow, not something that has to persist across workflows. So a variation on the data store would be a temporary store that exists just while an instance of a workflow is running. Again, a single spot, so I don’t have to reference steps; while developing I’ve changed the names and order of steps, and that forces me to change things in code blocks.
One other final input on this: I tried to delete a data store. After typing the name in the modal, the button activates; I click it and nothing seems to happen. A brief busy signal on the button, but then nothing. The modal is still there. The data store is still there.
I first read the overall Data Store documentation. At the bottom of this entry it indicated Data Store was only available in Node.js. Data Stores - Pipedream
Fixed
After some time I clicked on the actual name of the data store and then saw the data that was recorded. So, this wasn’t obvious, to me at least, at first.
Makes sense why that’s confusing, sharing the feedback with our team
doing “return data_store” didn’t actually create an Export entry
Yes, it would be nice if the representation were better here. I’m noting that, and in the meantime, I’m adding code examples (Node.js + Python) for each of the existing Data store actions. I agree more examples would be better
Maybe another question on data stores, or a clarification. For an entry in the Data Store, I stored a dictionary. It gets rendered like the attached screenshot. In Python I was thinking I could do something like `data_store["request"]["ackEmailEnable"]`. Though I’m not getting an error, it seems not to be storing a new value in another code block in my workflow. I may have to rethink this and just store flat key/values instead of nested key/values.
We should store nested properties correctly. What data were you trying to store in that nested key? You can only store JSON-serializable data (e.g. strings and numbers, nested objects, etc., but not complex objects).
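If you’re unsure whether a value qualifies as JSON-serializable, a quick check with the standard library’s `json` module is one way to find out before writing it to the store (the helper name here is just for illustration):

```python
import json


def is_json_serializable(value):
    """Return True if the value can be round-tripped through JSON,
    i.e. it is safe to put in the data store."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


print(is_json_serializable({"request": {"ackEmailEnable": True}}))  # True
print(is_json_serializable(object()))                               # False
```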
data_store["request"]["ackEmailEnable"] unfortunately won’t work, since request is the key (i.e. keys of the data store itself are at the top level), and assigning into the nested dict doesn’t write the change back to the store. So if you want to update a nested key, you’ll need to do something like this:
```python
def handler(pd: "pipedream"):
    data_store = pd.inputs["data_store"]
    # Read the whole nested object out, mutate it, then write it back
    data = data_store["hello"]
    data["foo2"] = True
    data_store["hello"] = data
```
So a variation on data store would be a temporary data store that exists just while an instance of a workflow is running
Have you seen how to export data from a step? For large data, you can also save data to the /tmp directory.
Then yes, I have done that. My comment, though, is that when I reference this in other steps, I have to call out the step name, as noted: pd.steps["code"]["pokemon"]
So let’s say I have three steps. While designing my workflow, I code step 2 to reference step 1’s data. Similarly, I code step 3 to reference step 2’s data. But then I decide to rearrange: I want to do step 1, then step 3, and then step 2. Well, this sort of breaks, as step 3 can’t really reference step 2. So I was thinking of a temporary key/value store that exists for each instance of a workflow.

In my case, I’m just building up a larger data structure step by step and then taking some action on it in the final step. So my ordering of steps isn’t strictly dependent in the grand scheme, but instead of daisy-chaining this data through each step, I’m now building it up in a data store; then I’ll take the action in my last step and clear it to get ready for another run.

Though, come to think of it, what about concurrency of data stores across workflows that may be triggered milliseconds apart? Hmm… the data store method may not work now that I think of it that way, as I may have to set some flag so it doesn’t get overwritten when one workflow instance is partway through as another starts.
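One possible mitigation for the concurrency concern raised above (a sketch, not an official pattern): namespace every key with a per-run identifier so concurrent executions never collide. The `run_id` would be generated in the first step and passed to later steps via step exports; the plain dict below just stands in for the shared data store, and `namespaced` is a hypothetical helper.

```python
import uuid

# Generate a unique id for this workflow run (in Pipedream this would
# happen in the first step and be exported for later steps to reuse).
run_id = str(uuid.uuid4())


def namespaced(key: str, run_id: str) -> str:
    """Prefix a store key with the run id so concurrent runs don't clash."""
    return f"{run_id}:{key}"


store = {}  # stand-in for the shared data store
store[namespaced("pokemon", run_id)] = {"name": "pikachu"}
print(list(store))
```

The final step of the run can then delete only the keys carrying its own prefix, leaving other in-flight runs untouched.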