blin

Transcribing talks

I wanted to get an LLM summary of the XOXO Festival talk “Lisa Hanawalt, BoJack Horseman - XOXO Festival (2015)”, which turned out to be a perfect opportunity to try transcribing audio with Whisper!

Here is what I did to get a human readable transcript:

  1. Download the video
    • yt-dlp --extract-audio --audio-format mp3 'https://www.youtube.com/watch?v=f6F_CF7Yvo0' -o talk-audio.mp3
  2. Split the audio into 10-minute chunks [1]
    • ffmpeg -i talk-audio.mp3 -f segment -segment_time 600 -c copy talk-audio-10m_chunks-%03d.mp3
  3. Transcribe each audio chunk using Whisper through the Hugging Face UI
  4. Chunk the text using Claude 3.5 Sonnet
    • cat transcript.txt | llm -m claude-3.5-sonnet -s "Split the content of this transcript up into paragraphs with logical breaks. Add newlines between each paragraph." > transcript-chunked.txt
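
Step 3 produces one transcript per audio chunk, while step 4 reads a single transcript.txt, so the pieces need to be joined in between. Here is a minimal sketch of that join. It assumes each chunk's transcript was saved as a text file named after its audio chunk (for example talk-audio-10m_chunks-000.txt); both that naming and the merge_transcripts helper are hypothetical, not something the steps above prescribe.

```shell
# Hypothetical helper: merge per-chunk transcripts into one file.
# Assumes transcripts are named <prefix>NNN.txt, mirroring the %03d
# numbering that ffmpeg gave the audio chunks (an assumption).
merge_transcripts() {
  prefix="$1"   # e.g. talk-audio-10m_chunks-
  out="$2"      # e.g. transcript.txt
  : > "$out"    # start from an empty output file
  # Zero-padded numbers make the glob expand in playback order.
  for f in "$prefix"[0-9][0-9][0-9].txt; do
    [ -e "$f" ] || continue   # glob matched nothing: skip
    cat "$f" >> "$out"
    printf '\n' >> "$out"     # keep chunks separated by a newline
  done
}
```

Usage would look like: merge_transcripts talk-audio-10m_chunks- transcript.txt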

This is already very useful!

Finishing touch:

cat transcript-chunked.txt | llm -m claude-3.5-sonnet -s 'What are the themes in the given transcript of a talk?'

I’m experimenting with using these kinds of summaries to make talks both easier to remember and easier to share with people. Instead of sending someone a video with “Check out this talk, I think you will like it”, I can send a video with “Check out this talk, it is about $themes, I think you will like it”.
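
The “$themes” placeholder maps naturally onto a shell variable. As a rough sketch, the message could be assembled like this; share_message is a hypothetical helper, and it assumes the themes have already been boiled down to a short one-line phrase.

```shell
# Hypothetical helper: build the share message from a URL and a
# short themes phrase (assumed to fit on one line).
share_message() {
  url="$1"
  themes="$2"
  printf 'Check out this talk, it is about %s, I think you will like it: %s\n' \
    "$themes" "$url"
}
```

The themes argument could be filled from the llm command above, for example by capturing its output with themes=$(...), provided the prompt asks for a short phrase rather than a full write-up.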

References

Tidbits

How do I “cite” a YouTube video? The American Psychological Association (APA) recommends a specific format, which I didn’t want to type out by hand, so instead I investigated yt-dlp a bit more:

yt-dlp --skip-download --print "%(channel)s. (%(upload_date>%Y\\, %B %d)s). %(title)s [Video]. YouTube. %(webpage_url)s" 'https://www.youtube.com/watch?v=f6F_CF7Yvo0'


  1. Otherwise the Hugging Face UI failed with an error, which I’m assuming was a timeout. ↩︎