Finding Pauses in Videos: The Problem

This is a problem that introduced me to Fourier analysis and machine learning, with several productive detours along the way.

The company I work for produces training videos, each of which is about 350 minutes long. We needed to package the videos for a new distribution platform that required us to split them into smaller segments of 5 to 15 minutes each.

Fortunately, we had written down time-stamps for when the instructors switched to new topics, and we had those time-stamps stored in an XML file. I also knew that the ffmpeg tool on Linux could segment MP4 videos. So I wrote a Python script to load the time-stamps, split the videos using ffmpeg, and package the videos (and other metadata) for the distributor.
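To make the splitting step concrete, here is a minimal sketch of how such a script might turn time-stamps into ffmpeg commands. It is not the original script: the function names, the `segment_001.mp4` naming scheme, and the choice of `-c copy` (stream copy, no re-encoding) are all assumptions for illustration, and loading the time-stamps from the XML file is left out because its layout isn't shown here.

```python
import subprocess

def segment_bounds(timestamps, duration):
    """Turn cut points (in seconds) into (start, end) pairs covering the video."""
    # Ignore cut points that fall outside the video, then add both ends.
    inner = sorted(t for t in timestamps if 0.0 < t < duration)
    cuts = [0.0] + inner + [float(duration)]
    return list(zip(cuts[:-1], cuts[1:]))

def ffmpeg_command(source, start, end, target):
    """Build an ffmpeg command that copies one time range into a new file."""
    return [
        "ffmpeg", "-i", source,
        "-ss", f"{start:.3f}",   # start of the segment, in seconds
        "-to", f"{end:.3f}",     # end of the segment, in seconds
        "-c", "copy",            # assumption: copy streams without re-encoding
        target,
    ]

def split_video(source, timestamps, duration, run=False):
    """Return the ffmpeg commands for every segment; run them if asked."""
    commands = []
    bounds = segment_bounds(timestamps, duration)
    for index, (start, end) in enumerate(bounds, start=1):
        target = f"segment_{index:03d}.mp4"  # hypothetical naming scheme
        command = ffmpeg_command(source, start, end, target)
        commands.append(command)
        if run:
            subprocess.run(command, check=True)
    return commands
```

Calling `split_video("lesson.mp4", [300, 900], 1200)` returns three commands without running anything; pass `run=True` to actually invoke ffmpeg.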

Everything seemed to work, until I watched the split videos.

As it turned out, the time-stamps were accurate only to within 2 seconds. That’s a very long time when you’re listening to someone speak: enough time to say several words. So the script was splitting the videos mid-word. It sounded terrible.

Unfortunately, there were hundreds of time-stamps to check (the XML file is currently 7,774 lines long). So I wanted to run a program that would check the time-stamps automatically. If the instructor was speaking at a time-stamp, then I’d need to fix it. Otherwise, I’d assume that the time-stamp was okay.

My first thought was: check the amplitude of the sound wave.  If it’s loud, then the instructor is probably talking.  Otherwise, it’s probably a pause.  Here’s the problem, though:

Speaking: [image: speaking sound wave]
Pause: [image: pause sound wave]

Both sound waves have similar amplitude, but one sounds like white noise, and the other is a person speaking.  Many of the videos had a lot of white noise in the background — so much that the pause sound waves often had greater average amplitude than the speaking sound waves.
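The failure is easy to reproduce with synthetic signals. The sketch below is only an illustration, not the code I ran on the videos: the sine tone is a crude stand-in for a quiet voice, the Gaussian noise stands in for a noisy pause, and the names `rms` and `is_probably_speaking` and the 0.25 threshold are made up for the example.

```python
import math
import random

def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_probably_speaking(samples, threshold):
    """Naive check: call the clip speech if its average amplitude is high."""
    return rms(samples) > threshold

SAMPLE_RATE = 8000  # samples per second (assumed for this example)

# Stand-in for quiet speech: a 200 Hz tone with peak amplitude 0.3.
tone = [0.3 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

# Stand-in for a noisy pause: white noise with standard deviation 0.3.
rng = random.Random(0)  # fixed seed so the example is repeatable
noise = [rng.gauss(0.0, 0.3) for _ in range(SAMPLE_RATE)]

# The noisy "pause" is louder on average than the "speech", so a
# threshold of 0.25 labels the two clips exactly the wrong way round.
print(is_probably_speaking(tone, 0.25))
print(is_probably_speaking(noise, 0.25))
```

No single threshold separates the two clips correctly here, because the one containing no speech is the louder one.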

In the next post, I’ll describe the solution I eventually found.
