Can the Files API Perform Concurrent Updates Successfully?

This topic was automatically generated from Slack. You can find the original thread here.

Hi all,

Does the Files API have the ability to do concurrent updates, i.e. if a file is updated in two workflows at the exact same time, will both updates be successful? I have a workflow that does a lot of file updates, and some updates are missing.

Hi Hithesh, good question. Yes, the Files API should be able to handle concurrent uploads; it uses S3 pre-signed URLs to perform the uploads.

Is it possible that you’re using the same file name across these workflows? That might cause an overwriting issue.

Hi Pierce,
Thanks for the info. Do you mean the local file name that is used when updating a file from the File Store?

I mean the name passed to the $.files.open function:

const fileName = 'myfile.txt'

await $.files.open(fileName).fromUrl("https://image.com/example.jpeg")

In that example, if multiple workflows are all opening and writing to myfile.txt, then that file would be overwritten because it’s the same file path within the File Store.

Just make sure you’re using unique file names, ideally per execution and per workflow.
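
For example, here’s a minimal sketch of one way to do that, assuming a timestamp-plus-random suffix is acceptable (steps.trigger.context.id, available on most Pipedream triggers, would also work):

// Sketch: derive a unique file name per execution instead of a fixed one.
const suffix = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`
const fileName = `myfile-${suffix}.txt`

await $.files.open(fileName).fromUrl("https://image.com/example.jpeg")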

My use case involves appending to the same file. Would this be an issue?

Yes, that could potentially be an issue.

The File Stores API doesn’t support an append operation, so your file will be overwritten with the latest data.

Or, if you’re retrieving the file and then adding data to it within the step, it’s still subject to a race condition with other workflows modifying that same file.
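
To make the race concrete, here’s a rough sketch of that read-modify-write pattern (assuming the File Store’s toBuffer() and fromFile() methods described in the Pipedream File Stores docs; shared-log.txt and the trigger event payload are placeholders). If two executions run this at the same time, both read the same original contents, and whichever write lands last silently drops the other’s data:

import fs from "fs"

export default defineComponent({
  async run({ steps, $ }) {
    const file = $.files.open("shared-log.txt")

    // 1. Read the current contents of the shared file.
    const existing = (await file.toBuffer()).toString()

    // 2. Append the new data in memory.
    const newLine = JSON.stringify(steps.trigger.event)
    const updated = existing + newLine + "\n"

    // 3. Write the combined contents back via /tmp.
    //    Any other execution that read the file between steps 1 and 3
    //    will have its data overwritten by this write (or vice versa).
    fs.writeFileSync("/tmp/shared-log.txt", updated)
    await file.fromFile("/tmp/shared-log.txt")
  },
})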

Is this supposed to act as a log?

Ya, I think this is what could be going on. I am basically reading the content of the file, adding the new content, and then writing this content back to the File Store.

It’s being used as a data processing file that is then sent to another app. The thing is, there are a lot of transactions in a short space of time.

Yes, that definitely sounds like it might be the cause.

You might have better luck using AWS CloudWatch as a logger, then querying the data over the API to create the file to push to this other app.

import AWS from 'aws-sdk'

export default defineComponent({
  props: {
    aws: {
      type: "app",
      app: "aws",
    }
  },
  async run({steps, $}) {
    const { accessKeyId, secretAccessKey } = this.aws.$auth
    
    /** Now, pass the accessKeyId and secretAccessKey to the constructor for your desired service. For example: **/
    
    const cloudwatchlogs = new AWS.CloudWatchLogs({
      accessKeyId, 
      secretAccessKey,
      region: 'us-east-1',
    })

    const logEvents = steps["your-step"].$return_value // or the JSON data you'd like to log; each entry must be an object like { message, timestamp }

    return new Promise((resolve, reject) => {
      const params = {
        logEvents,
        logGroupName: 'your-app/log-group-name', /** required **/
        logStreamName: 'your-app/log-stream-name', /** required **/
      };
      cloudwatchlogs.putLogEvents(params, function(err, data) {
        if (err) {
          console.error(err, err.stack)
          reject(err)
        } else {
          $.export('loggedEvents', logEvents)
          resolve(data);
        }
      });
    }) 
  },
})
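
And for the “querying the data over the API” part, here’s a rough sketch along the same lines (the log group/stream names and the /tmp path are placeholders; getLogEvents returns one page of events, so a real implementation would loop on nextForwardToken):

import AWS from 'aws-sdk'
import fs from 'fs'

export default defineComponent({
  props: {
    aws: {
      type: "app",
      app: "aws",
    }
  },
  async run({steps, $}) {
    const { accessKeyId, secretAccessKey } = this.aws.$auth

    const cloudwatchlogs = new AWS.CloudWatchLogs({
      accessKeyId,
      secretAccessKey,
      region: 'us-east-1',
    })

    // Pull the logged events back out of the stream (oldest first).
    const { events } = await cloudwatchlogs.getLogEvents({
      logGroupName: 'your-app/log-group-name',
      logStreamName: 'your-app/log-stream-name',
      startFromHead: true,
    }).promise()

    // Assemble the file to send to the other app.
    const contents = events.map((e) => e.message).join('\n')
    fs.writeFileSync('/tmp/export.txt', contents)

    return contents
  },
})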

Got it. Thanks. If I use one workflow and limit concurrency and/or execution, do you think it might avoid the race condition?

Yes, potentially that’s another way. If you limit all writing to this file to one workflow and control that workflow’s concurrency, that could eliminate the race condition.

Cool. I’ll give that a go. Thanks for your help.

Sure thing :slightly_smiling_face: