[FFmpeg-user] mp4 concatenation (muxer) problems -- invalid output?

John Hawkinson jhawk at MIT.EDU
Mon Feb 20 15:31:41 EET 2017

I want to be clear -- I only gave my goal (stitching multiple videos
together) so you'd have context. I'm not asking for help with the
overall task, merely with the concatenation problem. I don't mean to
sound ungrateful, and Erik, I've tried to answer all your questions,
even though they seem to be taking me further away from the original
problem, not closer to a solution.

My question specifically is why (1) extracting a still; (2)
frame-holding it with "-loop 1 -i f1.png -t 3.933 -pix_fmt yuv420p -r
29.97"; and (3) attempting to concatenate it with the original mp4
with "-f concat -i concat7 -c copy" ===> results in an invalid mp4.
The output is far too long as reported by the ffmpeg concat process and
by ffprobe, and it does not play properly in VLC.

My assumption is something is wrong at step (2), but I don't know that to be
the case.
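
To make the three steps concrete, here is a minimal sketch that only
assembles the command lines and the concat list file; it does not run
ffmpeg. The names clip.mp4, hold.mp4 and joined.mp4 are placeholders,
and both the "-frames:v 1" extraction in step (1) and the order of the
entries in the list file are assumptions rather than a record of the
actual commands. Only the flags quoted above are taken from the real runs.

```python
import shlex

# Placeholder file names (assumptions); f1.png and concat7 are the names quoted above.
SOURCE, STILL, HOLD = "clip.mp4", "f1.png", "hold.mp4"
LIST_FILE, OUTPUT = "concat7", "joined.mp4"


def extract_still_cmd(source, still):
    # Step 1 (assumed form): decode one video frame and write it as a PNG.
    return ["ffmpeg", "-i", source, "-frames:v", "1", still]


def hold_still_cmd(still, seconds, hold):
    # Step 2: loop the still image for the requested duration (flags as quoted).
    return ["ffmpeg", "-loop", "1", "-i", still, "-t", str(seconds),
            "-pix_fmt", "yuv420p", "-r", "29.97", hold]


def concat_list(paths):
    # Concat demuxer list file: one "file '<path>'" directive per line.
    # Assumes simple names with no quotes that would need escaping.
    return "".join(f"file '{p}'\n" for p in paths)


def concat_cmd(list_file, output):
    # Step 3: stream-copy concatenation driven by the list file.
    return ["ffmpeg", "-f", "concat", "-i", list_file, "-c", "copy", output]


if __name__ == "__main__":
    for cmd in (extract_still_cmd(SOURCE, STILL),
                hold_still_cmd(STILL, 3.933, HOLD),
                concat_cmd(LIST_FILE, OUTPUT)):
        print(shlex.join(cmd))
    # Assumed order: the held still first, then the clip it was taken from.
    print(concat_list([HOLD, SOURCE]), end="")
```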

Erik Dobberkau <erik.dobberkau at gmail.com> wrote on Mon, 20 Feb 2017
at 07:56:41 +0100 in <CAOz0ov+aubY+k=bZm1PE1vn1fjTpF7p0wrm5B-2=yUGLysBSfg at mail.gmail.com>:

> Obviously you're trying to concat recordings from a DSLR.
> -> Why are there gaps in between? They shouldn't be there, if the camera is
> continuously on record.

This isn't really relevant to ffmpeg, and it's a constraint I have to
live with. But the answer is the camera in question (not a DSLR) is
limited to 30-minute recording chunks, because of MPEG patent licensing
restrictions in the firmware.

> When done with the video, you want to replace the audio (of the loooong
> video file) with a continuous mp3 audio track that is in sync all of the
> time?

Again, not relevant to ffmpeg, but yes. Possibly adjustment of the audio
timebase may be required in some cases for optimal sync, but my
experience has been that it's not necessary in my application.

> You can only concat files with matching parameters (and different moov
> atoms if they're not raw files, like transport streams) if you want to get
> a decent result,

Right, understood. The question is why the frame-held still does not
appear to be compatible with the original mp4.
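
One way to check compatibility would be to compare the video stream
parameters that ffprobe reports for the two files. The sketch below is
a rough illustration under assumptions: the choice of fields is a guess
at what matters for "-c copy" concatenation, the file names are
placeholders, and nothing here claims to identify the actual cause.

```python
import json
import subprocess

# Stream fields to compare; this selection is an assumption, not an official list.
FIELDS = ["codec_name", "profile", "width", "height",
          "pix_fmt", "r_frame_rate", "time_base"]


def probe_cmd(path):
    # Ask ffprobe for the chosen fields of the first video stream, as JSON.
    return ["ffprobe", "-v", "error", "-select_streams", "v:0",
            "-show_entries", "stream=" + ",".join(FIELDS),
            "-of", "json", path]


def probe(path):
    # Run ffprobe (must be on PATH) and return the first video stream's fields.
    result = subprocess.run(probe_cmd(path), capture_output=True,
                            text=True, check=True)
    return json.loads(result.stdout)["streams"][0]


def mismatches(a, b):
    # Map each differing field to its (first, second) pair of values.
    return {k: (a.get(k), b.get(k)) for k in FIELDS if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Placeholder names: the camera clip and the frame-held still.
    diff = mismatches(probe("clip.mp4"), probe("hold.mp4"))
    for field, (first, second) in diff.items():
        print(f"{field}: {first} vs {second}")
```

Any field this prints is a candidate explanation worth ruling out; an
empty result would not prove the files are compatible.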

(Although as an aside, the ffmpeg documentation throws around the term
"raw" a fair bit, and it's not really clear to a reader what exactly
it means. Which formats constitute "raw" and in what contexts? How are
you supposed to know or figure it out? It would be great if ffprobe could
report it definitively.)

> which means you have to do the following:
> - Get the exact timecode difference between each of the adjacent files ( ->
> exiftool is your friend, use ffprobe and exiftool in a nice script to do
> the math... )

Again, not relevant to the problem. As noted originally, I get this
data via another mechanism (PrPro -> EDL export).

> - Based on that difference, create matching splice files from the first
> frame of the "next" clip for each gap with matching video 

Yes, here's what I am asking about, because, again, the "splice files"
I create do not seem to work properly with the concat demuxer.

> and audio parameters to the source clips (-> use anullsrc to add
> blank audio to an image). 

I'm not really sure why the audio matters much, since it's getting
stripped out anyhow. An MP4 with no audio is compatible with an MP4 with
some aac audio, right? In any event, as I reported, having stripped
the audio from the video file, I continue to have concatenation
problems with two MP4s that both lack audio.
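
For reference, the two audio options under discussion can be sketched
as follows. This only builds the argument lists and does not run
ffmpeg. The file names, the stereo layout, the 48000 Hz rate and the
AAC encoder choice are all assumptions; the silent-audio variant
follows the anullsrc suggestion quoted above and is not something
reported as tested here.

```python
def strip_audio_cmd(source, output):
    # -an drops every audio stream; -c copy leaves the video bitstream untouched.
    return ["ffmpeg", "-i", source, "-an", "-c", "copy", output]


def hold_with_silence_cmd(still, seconds, output, layout="stereo", rate=48000):
    # Frame-held still plus a silent track generated by the anullsrc lavfi source.
    # layout and rate are assumed values; they should match the source clips.
    silence = f"anullsrc=channel_layout={layout}:sample_rate={rate}"
    return ["ffmpeg", "-loop", "1", "-i", still,
            "-f", "lavfi", "-i", silence,
            "-t", str(seconds), "-pix_fmt", "yuv420p", "-r", "29.97",
            "-c:a", "aac", output]


if __name__ == "__main__":
    print(" ".join(strip_audio_cmd("clip.mp4", "clip-noaudio.mp4")))
    print(" ".join(hold_with_silence_cmd("f1.png", 3.933, "hold-silent.mp4")))
```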

> Your mileage will vary here because DSLR cameras use evil magic to
> record video... 

I'm not quite sure what you're getting at here (also still not a DSLR).
Are you suggesting the MP4 from the camera is sufficiently "damaged"
because of realtime codec issues that it requires some form of
transcoding to be usable in any way? That would be unfortunate, but if it's
true, what's the minimal operation required?

> your ffmeg cmd line will (need to) be much longer than it is now.

Why? But more significantly, in what way?

But this is again out of scope for the question, which is why
concatenating a single frame-held still (your term: "splice file")
with a single video file produces a broken result.

> I don't think it's likely to get a single-line solution for your task.

I'm not seeking one. What I am seeking is a successful way to
concatenate a held still with an mp4 video that produces a valid
result, with minimal transcoding.


Oh, right, you started with:
> just to make sure I get this right, are you trying to build an
> advanced version of Vantage Camera Ingest? Have fun then... ;-)

I'm not familiar with it but I looked at the cutsheet just now.  I'd
say "not really" since generally my timebases aren't good enough for
fully automatic syncing, and so I end up with manual adjustments to
the time sync (often comparing the audio tracks between the video
clips and the master audio). But certainly there are parallels.

--jhawk at mit.edu
  John Hawkinson
