[FFmpeg-user] Adding narration to a video
John G. Heim
jheim at math.wisc.edu
Mon Dec 2 19:33:42 EET 2024
Hi, as unlikely as it might seem, I'm a blind person trying to create a
promotional video. I'm in a group of disabled rock climbers and I have a
video taken with a GoPro attached to my climbing helmet the first time I
rappelled down a cliff. I want to add narration explaining what is
going on and talking up the group itself. I recorded a bunch of mp3
files with the audio narration. I've been googling for weeks on how to
add those audio files into the video stream but I just cannot get it
exactly right no matter what I do. Here is what I have tried so far:
1. Programmatically cut the video into segments so that a segment cut
out of the middle would exactly match the length of my audio, added the
audio to the segment, and patched the segments back together.
Actually, this works pretty well except that there is a little jump at
each of the splices. Still, I figure there has to be a better way.
2. Programmatically strung together a command to use itsoffset to
insert at the appropriate points. I know, you're saying itsoffset
doesn't work for audio -- but it's not really my fault I thought this
would work, there are lots of posts out there saying it does. I pondered
trying to adapt my script to use adelay in a filter_complex but it got
to be too complicated.
3. Programmatically looped through each sound file, used a
filter_complex clause with adelay to add the audio at the appropriate
time. Each iteration of the loop required about 45 seconds to execute on
my 20-core i5 with 32 GB of RAM and an SSD. I'd be okay with that, but it
didn't work. The original audio was muffled and the clips weren't where
I wanted them.
4. Programmatically used sox to splice the audio files together, padding
with silence so the narration would line up with the appropriate places
in the video. If I play the original video and the spliced audio file in
separate processes at the same time, it sounds perfect: the
narration lines up exactly with the video. But when I merge them, they
don't line up at all. This one is particularly puzzling. Not only does
the narration not line up, some of the clips are repeated. Like in the
audio file, I say, "Here I lose my footing," just once, at 1:09. In the
video, it doesn't play until 1:54, but then it plays again at 2:33 and
again at 2:51. Weird.
Here is the command I used to try this:
ffmpeg -i original.mp4 -i narration.mp3 -c:v copy \
-filter_complex "[0:a][1:a] amix=inputs=2:duration=longest [audioin]" \
-map 0:v -map "[audioin]" \
-y garbage.mp4
I am running Debian testing with ffmpeg version 7.1.3. I am writing
this in bash scripts and using ffprobe to get the length of the audio
segments. The audio files are mp3 files named for the point where I want
them inserted. For example, I want to introduce myself 28 seconds into
the video so that file is named 0028000-intro.mp3. In the script, I can
extract the 28000 and use that in an adelay filter or divide by 1000 for
the -t attribute. I am pretty sure I am doing the math right.
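The arithmetic is indeed straightforward, with one bash trap worth flagging: a value like 0028000 has a leading zero, so plain $((...)) treats it as octal and chokes on the 8. Forcing base 10 with 10# avoids that. A small sketch using the naming scheme described above:

```shell
#!/usr/bin/env bash
set -euo pipefail
# Pull the millisecond offset back out of a clip name such as
# 0028000-intro.mp3 (the naming scheme described in the post).
f="0028000-intro.mp3"
ms="${f%%-*}"        # everything before the first "-": "0028000"
ms=$((10#$ms))       # force base 10 so the leading zeros aren't read as octal
sec=$((ms / 1000))   # whole seconds, e.g. for a -t/-ss style value
echo "adelay=${ms}:all=1"   # -> adelay=28000:all=1
echo "offset ${sec} seconds"  # -> offset 28 seconds
```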
Here is a link to the most successful effort I've had so far via method 1:
https://people.math.wisc.edu/~jheim/Climbing/video.mp4
That works pretty well, but I figure the sox approach has to be the right
one: make an audio file with the narration that lines up with the
video, then add that extra audio track to the original video. Adding an
mp3 file to a segment of the video works fine but it doesn't work when I
try to do the whole thing at once.
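An aside on "add that extra audio track": if that is meant literally, i.e. keep the GoPro audio and attach the narration as a second, selectable track rather than mixing them, no filtering is needed at all. A sketch with lavfi stand-ins so it runs as-is (the real version would use the actual files):

```shell
#!/usr/bin/env bash
set -euo pipefail
# Stand-ins for the real video and narration, generated with lavfi.
ffmpeg -v error -y -f lavfi -i "testsrc=duration=6:size=320x240:rate=25" \
  -f lavfi -i "sine=frequency=220:duration=6" -c:v mpeg4 -c:a aac original.mp4
ffmpeg -v error -y -f lavfi -i "sine=frequency=660:duration=6" -c:a aac narration.m4a
# Copy everything from the video and append the narration as a second
# audio stream; players can then switch between the two tracks.
ffmpeg -v error -y -i original.mp4 -i narration.m4a \
  -map 0 -map 1:a -c copy twotrack.mp4
```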
What the heck am I doing wrong?