How to Create a Transcript from a Large Video File?

user-1 · December 15, 2023, 3:38pm

This topic was automatically generated from Slack. You can find the original thread here.

Hi all, I am new here. I need to create a transcript from a large (525 MB) video. As expected, the file is too large. Is there any way around this, please?

user-1 · December 15, 2023, 3:38pm

That’s not too large for Pipedream to handle.

user-1 · December 15, 2023, 3:38pm

Maybe it’s too large for the transcription service you’re using?

user-1 · December 15, 2023, 3:38pm

I am using Whisper

user-1 · December 15, 2023, 3:38pm

Then that would be a question for Whisper. :man-shrugging:

user-1 · December 15, 2023, 3:38pm

Maybe you could strip the visual from your video file, and turn it into an audio file instead. That should make it massively smaller.

user-1 · December 15, 2023, 3:38pm

There’s probably a Node.js or Python library to do that.

user-1 · December 15, 2023, 3:38pm

I have no idea what that is - I am not a developer. I work in finance and am looking to automate meeting notes & follow-ups.

user-1 · December 15, 2023, 3:38pm

Maybe Pipedream could implement that as a utility

Just take a video file/path as input, and return an audio file/path as output.
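For anyone curious what such a utility could look like, here is a minimal Python sketch, not an existing Pipedream feature. It assumes the `ffmpeg` command-line tool is installed and on the PATH. The function names `build_ffmpeg_command` and `video_to_audio` are hypothetical, and the mono / 16 kHz / 64k settings are assumptions aimed at speech recordings, not requirements of any transcription service.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, audio_path: str) -> list[str]:
    """Build an ffmpeg command that drops the video stream and keeps the audio."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", video_path,
        "-vn",           # discard the video stream
        "-ac", "1",      # downmix to mono (assumption: speech-only recording)
        "-ar", "16000",  # resample to 16 kHz (assumption: enough for speech)
        "-b:a", "64k",   # audio bitrate (assumption: tune for quality vs. size)
        audio_path,
    ]


def video_to_audio(video_path: str) -> str:
    """Take a video path as input and return the path of the extracted audio."""
    audio_path = str(Path(video_path).with_suffix(".mp3"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```

Calling `video_to_audio("meeting.mp4")` would write `meeting.mp3` next to the original file and return that path. Audio at a low bitrate is usually a small fraction of the video's size, though the exact saving depends on the recording.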

user-1 · December 15, 2023, 3:38pm

Sounds like it might be useful for many people…

user-1 · December 15, 2023, 3:38pm

Not a super huge fan of ffmpeg (it can often be tricky to install) but this library might do the trick: https://www.npmjs.com/package/ffmpeg-extract-audio

user-1 · December 15, 2023, 3:38pm

Could you show us exactly what Pipedream action / code you’re using, and a screenshot of the error?

Our “Create Transcription” action automatically splits the file into chunks and sends each chunk to OpenAI to avoid hitting Whisper’s size limits. I’m curious if you’re using that and still seeing an error, or perhaps using different code?
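To illustrate the general idea of chunking (this is a rough sketch, not the action's actual code), the Python below plans equal-length time slices so that each slice should stay under an upload limit. The 25 MB value is an assumption to check against OpenAI's current documentation, `plan_chunks` and its `safety` margin are hypothetical, and the estimate assumes a roughly constant bitrate.

```python
import math

# Assumed per-request upload limit; verify against the service's current docs.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def plan_chunks(
    duration_s: float,
    file_bytes: int,
    max_bytes: int = MAX_UPLOAD_BYTES,
    safety: float = 0.9,
) -> list[tuple[float, float]]:
    """Return (start_seconds, length_seconds) slices sized to fit under max_bytes.

    Assumes the bitrate is roughly constant, so bytes scale with duration.
    """
    if duration_s <= 0 or file_bytes <= 0:
        raise ValueError("duration_s and file_bytes must be positive")
    budget = max_bytes * safety  # leave headroom for bitrate variation
    n_chunks = max(1, math.ceil(file_bytes / budget))
    chunk_len = duration_s / n_chunks
    return [(i * chunk_len, chunk_len) for i in range(n_chunks)]
```

Each `(start, length)` pair could then be cut out with ffmpeg's `-ss` and `-t` options and sent for transcription separately, with the resulting text joined in order.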

user-1 · December 15, 2023, 3:38pm

Hi Dylan, sure!

user-1 · December 15, 2023, 3:38pm

The file is 521.1 MB in my Google Drive. As far as I understand, I haven’t messed with any of the code.

user-1 · December 15, 2023, 3:38pm

Thanks. Could you also do me a favor and share your workflow with Support, then DM me the workflow’s URL? I’d like to take a look at some of the debugging data

user-1 · December 15, 2023, 3:38pm

Thanks! I shipped a small change to the logic for how we split the original file. If you visit your workflow again and edit it, you’ll see a big red Update button at the top-right of the Create Transcription step. Click that to use the latest version, and then test again with that same file. Let me know if that works.

user-1 · December 15, 2023, 3:38pm

Thanks Dylan - trying now!

user-1 · December 15, 2023, 3:38pm

Apologies, testing is taking a while!