This topic was automatically generated from Slack. You can find the original thread here.
We’re having a really hard time uploading 100k records into a datastore. It’s a list of zip codes. Can someone please lend us a hand in the most effective way to get this uploaded? @U04EQQMEAMS
when we try to upload the records in bulk or broken up into chunks, it will often crash partway through, and then we can’t save state very easily. From what I can tell from the documentation, there’s no way to upload the file directly. The file is only 6 MB, so we’re not crazy
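One way to cope with crashes partway through (a sketch, not an official Pipedream feature): since the data store behaves like a dict, you can persist a progress cursor under a reserved key so a re-run resumes where the last one died. The `_upload_cursor` key name and the chunk size here are assumptions; pick whatever fits your data.

```python
def upload_with_resume(data_store, records, chunk_size=500):
    """Write records in chunks, tracking progress under a reserved
    cursor key (hypothetical name) so a crashed run can resume."""
    try:
        start = data_store["_upload_cursor"]  # resume point from a prior run
    except KeyError:
        start = 0  # fresh run
    for i in range(start, len(records), chunk_size):
        for item in records[i:i + chunk_size]:
            data_store[item["key"]] = item["value"]
        # Persist progress after each chunk, so a crash loses at most one chunk
        data_store["_upload_cursor"] = i + chunk_size
    data_store["_upload_cursor"] = 0  # reset so the next upload starts clean
```

Re-running the same step after a crash then skips everything before the cursor instead of starting over.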
Hi, could you share the code that updates the data store and the specific error you’re getting?
You’re correct that there’s not a native file upload for data stores but I agree that would be great to see. At the very least I think you should be able to clone a workflow to do it for common file formats like CSVs and JSON.
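Until a native file upload exists, a workflow step can do the parsing itself and loop the rows into the store. A minimal sketch for a CSV, assuming a header row with hypothetical `zip` and `value` columns (swap in the real column names from your file):

```python
import csv
import io

def load_csv_into_store(data_store, csv_text, key_field="zip", value_field="value"):
    """Parse CSV text and write each row into the dict-like data store.
    The column names are assumptions; adjust them to the file's header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    count = 0
    for row in reader:
        data_store[row[key_field]] = row[value_field]
        count += 1
    return count
```

The `csv_text` could come from an HTTP trigger body or a file downloaded in a prior step.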
this is our code:

```python
def handler(pd: "pipedream"):
    # Access the data store under pd.inputs
    data_store = pd.inputs["data_store"]

    for item in pd.steps["trigger"]["event"]["body"]:
        # Store a value under a key
        data_store[item["key"]] = item["value"]
        # Retrieve the value and print it to the step's Logs
        print(data_store[item["key"]])
```
This is the error:

```
Traceback (most recent call last):
  File "/nano-py/pipedream/worker.py", line 118, in execute
    user_retval = handler(pd)
  File "/pipedream/dist/code/58ecef3d76943ec5c6bca4ce78a6d08903b900488911d6945eebe52b7d03784b/code.py", line 10, in handler
    print(data_store[item["key"]])
  File "/nano-py/pipedream/pipedream.py", line 237, in __getitem__
    raise KeyError
KeyError
```
We of course could throw in try/except, but I think the question is more about why we need that for something that seems so simple?
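For context on why the exception shows up at all: the data store mirrors plain dict semantics, where bracket indexing raises `KeyError` on a missing key, while `.get` returns a default instead. A tiny illustration with an ordinary dict standing in for a record:

```python
item = {"value": 42}  # a record that is missing its "key" field

# Bracket indexing raises on a missing field
try:
    item["key"]
    raised = False
except KeyError:
    raised = True

# .get returns a default instead of raising
key = item.get("key", "<missing>")
```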
I understand that you’re facing issues with uploading 100k records into a datastore and encountering a KeyError. Based on the code and error you’ve provided, the likely culprit is an item that is missing the expected field. To troubleshoot, I would recommend adding some error handling and logging so you can identify the specific item causing the issue. Here’s an example of how you can modify your code:

```python
def handler(pd: "pipedream"):
    data_store = pd.inputs["data_store"]
    for item in pd.steps["trigger"]["event"]["body"]:
        try:
            data_store[item["key"]] = item["value"]
            print(data_store[item["key"]])
        except KeyError:
            print(f"Error processing item: {item}")
```
You may have already spotted this, but if you’re hitting a KeyError, there’s probably no key field in the item you’re iterating over, so you can handle its absence. @U05A7KLNALC How do I best handle the absence of the key field in item?
To handle the absence of the key field in item, you can use a conditional check to ensure the field exists before processing the item. Here’s an example of how you can modify your code to handle this case:

```python
def handler(pd: "pipedream"):
    data_store = pd.inputs["data_store"]
    for item in pd.steps["trigger"]["event"]["body"]:
        if "key" in item and "value" in item:
            data_store[item["key"]] = item["value"]
            print(data_store[item["key"]])
        else:
            print(f"Error processing item: {item}, missing 'key' or 'value' field")
```

This code checks that both the key and value fields are present in the item before processing it. If either field is missing, it prints an error message with the problematic item.
I don’t know — did you happen to select a different event from the trigger step?
The KeyError from that specific code should have thrown because of the absence of a key field in item and doesn’t look like it’s an error with our data store API. But if you observe something similar and can confirm that the item in question indeed does have a key, but we’re still throwing this error, definitely let me know.
Thanks. I do see an error from our load balancer updating that record on that one attempt. But I don’t see any issue with the data store handling that load / no errors on our end. But if you see it again with large updates, please let me know.