I’m watching the video now. It’s absolute fire! And the breakdown of Pipedream along the way is gold. I’ve been brainstorming a few OpenAI workflows and this is serious inspiration. Thanks for sharing, Morgan!
Thanks for the video. I’m watching right now and reading around the Whisper docs/forums. It seems it can’t separate 2 voices. More precisely, it’s so capable of dealing with 2 voices that it doesn’t distinguish them in the transcript. Having the voices labeled separately in the transcript would be very useful (interviews, podcasts, etc.).
OpenAI’s own Whisper can’t do this, but Deepgram’s models can! They claim their new Nova model is even more accurate than Whisper, and they also offer their own hosted version of Whisper that costs less than OpenAI’s (Nova costs even less).
I’m not 100% sure whether their hosted Whisper model can do it, but I know their Nova model can. I was talking with them a bunch before I released this tutorial, since the AI community in general has been struggling to get Whisper to do diarization and produce accurate timestamps for captions.
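For anyone who wants to try it, here’s a minimal sketch of what a diarized Deepgram call could look like. This is just my rough take from their docs, not official code — the `meeting.mp3` filename and the `DEEPGRAM_API_KEY` env var are my own placeholders; the key bit is the `diarize=true` query option, which asks Deepgram to attach speaker labels:

```python
import json
import os
import urllib.parse
import urllib.request

# Deepgram's prerecorded-audio endpoint
DG_URL = "https://api.deepgram.com/v1/listen"

def build_request(model="nova", diarize=True, punctuate=True):
    """Assemble the URL (with query params) and headers for a diarized call."""
    params = {
        "model": model,
        "diarize": str(diarize).lower(),    # Deepgram expects "true"/"false"
        "punctuate": str(punctuate).lower(),
    }
    url = DG_URL + "?" + urllib.parse.urlencode(params)
    headers = {
        "Authorization": "Token " + os.environ.get("DEEPGRAM_API_KEY", ""),
        "Content-Type": "audio/mpeg",       # match your audio file's format
    }
    return url, headers

def transcribe(path):
    """POST a local audio file and return the parsed JSON response."""
    url, headers = build_request()
    with open(path, "rb") as f:
        req = urllib.request.Request(url, data=f.read(), headers=headers,
                                     method="POST")
    with urllib.request.urlopen(req) as resp:
        # with diarize=true, each word in the response carries a speaker index
        return json.load(resp)

if __name__ == "__main__":
    print(transcribe("meeting.mp3"))  # hypothetical local recording
```

You’d then group consecutive words by their speaker index to get a "Speaker 0: … / Speaker 1: …" style transcript.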
Love this use case. Followed Thomas Frank’s video exactly and it was enlightening — so much so that my coworkers want to set up this exact workflow now. The only challenge I’ve run into is that sometimes my meeting recording is over an hour long, and I can’t get the full transcription done because my OpenAI step fails due to lack of tokens. The only way I could get it to work was to use Audacity and really trim down the recording first. Not sure if there is another way to handle this, but thought I would ask.
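One workaround, instead of trimming in Audacity: split the transcript into chunks before the ChatGPT step and summarize each chunk separately, then stitch the results. This is just a sketch of the idea — the chunk size and word-based splitting are my own rough assumptions, not anything from Thomas’s tutorial:

```python
def chunk_transcript(text, max_words=2000):
    """Split a long transcript into word-based chunks.

    max_words is a crude stand-in for real token counting; ~2000 words
    generally keeps each chunk well under a 4k-token context window once
    the summarization prompt is added on top.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# Then run your existing OpenAI summarization step once per chunk, e.g.:
# summaries = [summarize(chunk) for chunk in chunk_transcript(full_transcript)]
# and join the summaries (or summarize the summaries) at the end.
```

Each chunk fits in the model’s context, so the OpenAI step shouldn’t fail on hour-plus recordings anymore.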
Nice work! I attempted a Pipedream transcription workflow by YouTuber Thomas Frank:
But I wanted to use Deepgram and Obsidian instead. Unfortunately I can’t get it to work — the two blog posts from Deepgram are too code-heavy for me. But Pipedream to the rescue, because we can now share our workflows:
I’d be super grateful if you’d consider sharing your working workflow!