[FFmpeg-user] Efficiently doing thousands of edits?

John Hawkinson jhawk at alum.mit.edu
Wed Apr 10 01:36:38 EEST 2019


I have a multi-camera video (hard-coded, burned into quadrants) where one camera frequently glitches to black, which is very distracting. I'd like to fix the video so the black flashing doesn't occur. The obvious way is to hold the last good frame prior to each stretch of black.

I'm going to describe what I did and invite commentary on how I could have done it better, and also describe where I'm now stuck on what seems to be a performance problem.

You can see this problem at https://youtu.be/K2UUQAy3NxI?t=2643 from 44:03 to 44:13.

I took that ten-second chunk to experiment on, and it seemed like I could use the "blackdetect" filter to find these frames, and then use the trim, loop, and concat filters to hold the last good frame. This was frustrating to set up because the loop filter takes its parameters as frame counts but blackdetect reports times, and trim seems like it should accept both, but using frames didn't work for me initially.

    # Getting a 10-second chunk from the origin of this video:
    ffmpeg -ss 44:03 \
        -i  'http://usccameraspd.edgesuite.net/mm/flvmedia/3697/b/r/6/br6fmo56_ja85a1r2_h264_2328K.mp4' \
        -t 10 -c copy start.mkv

I spent...way too much time trying to figure out how to run the "blackdetect" filter properly and get the data out. It
seems like an ffmpeg bug that doing it this way (obviously crude, but should work) is a trap for the unwary:

    # Running blackdetect and grepping the ffmpeg output, naively
    ffmpeg -i start.mkv -vf blackdetect=d=0.016:pic_th=0.7 \
    -f null /dev/null 2>&1 |
    grep blackdetect

The trap is that if you run the output through "cat -vet", you'll see that each blackdetect log line is joined to a frame progress line:

    frame=  270 fps=0.0 q=-0.0 size=N/A time=00:00:09.14 bitrate=N/A speed=18.3x    ^M[blackdetect @ 0x7f806cc0b000] black_start:9.076 black_end:9.409 black_duration:0.333$

This is resolvable with

    # Running blackdetect and grepping the ffmpeg output, corrected
    ffmpeg -i start.mkv -vf blackdetect=d=0.016:pic_th=0.7 \
    -f null /dev/null 2>&1 |
    tr \\015 \\012 |
    grep blackdetect

but that just feels sketchy. Is that a bug?
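
For illustration, here is a minimal sketch of why the lines end up joined, using a simulated log line rather than real ffmpeg output. The function names sample_log and extract_blackdetect are made up for this example. The progress line ends in a carriage return rather than a newline, so grep sees it and the following blackdetect message as one line until the carriage returns are converted.

```shell
# Simulated ffmpeg stderr: a progress line ending in CR (not LF),
# immediately followed by a blackdetect message. Values are illustrative.
sample_log() {
    printf 'frame=  270 fps=0.0 speed=18.3x    \r[blackdetect @ 0x0] black_start:9.076 black_end:9.409 black_duration:0.333\n'
}

# Turn carriage returns into newlines, then keep only the lines that
# begin with the blackdetect tag.
extract_blackdetect() {
    tr '\r' '\n' | grep '^\[blackdetect'
}

sample_log | extract_blackdetect
```

ffmpeg also has a -nostats option that suppresses the progress line entirely, which may avoid the need for tr here; I have not verified that against this exact build.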

Ultimately I ended up doing it like this:

    # Running blackdetect with ffprobe:
    ffprobe -f lavfi -i 'movie=start.mkv,blackdetect=d=0.016:pic_th=0.7' \
    -show_entries frame_tags=lavfi.black_start,lavfi.black_end \
    -of 'flat=s=\ ' -v quiet > detect.txt

Which produces pretty nice output that comes with a frame count (yay, less math!):

    frames frame 3 tags lavfi_black_start="0.1"
    frames frame 12 tags lavfi_black_end="0.4"
    frames frame 17 tags lavfi_black_start="0.567"

although it never produces the final lavfi_black_end. Nor does it give the black_duration parameter (but of course we can calculate it).
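
As a sketch of that calculation, the following pairs each black_start with the next black_end and prints the start time, end time, duration in seconds, and length in frames. It runs on the three sample lines shown above; the function names are hypothetical. A start with no matching end is reported as "open" rather than guessed at.

```shell
# The three sample lines from the ffprobe flat output.
sample_detect() {
    cat <<'EOF'
frames frame 3 tags lavfi_black_start="0.1"
frames frame 12 tags lavfi_black_end="0.4"
frames frame 17 tags lavfi_black_start="0.567"
EOF
}

# Print "start end duration frames" for each complete interval,
# and "start open" for a trailing start that never got an end.
black_intervals() {
    sed 's/[="]/ /g' | awk '
    /black_start/ { st = $6; sf = $3; pending = 1 }
    /black_end/   { printf "%s %s %.3f %d\n", st, $6, $6 - st, $3 - sf; pending = 0 }
    END           { if (pending) printf "%s open\n", st }'
}

sample_detect | black_intervals
```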

Next, to construct a filtergraph to handle this.

I also found it tough to see exactly what was going on. I ended up burning frame numbers and times into the frames with:

    # Burning timestamps and frame numbers into frames for diagnostics
    ffmpeg -i start.mkv \
    -vf 'drawtext=textfile=text:fontsize=100:fontcolor=white' \

where "text" was a file containing: "F: %{n} %{pts:flt} %{pts:hms}". I didn't have the patience to figure out what level of quoting was necessary to make this work on the command line.
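
One way to create that file without fighting the shell is to keep the template in single quotes, so that nothing in it gets expanded. This is only a sketch; the function name drawtext_template is made up.

```shell
# Emit the drawtext template verbatim. Single quotes stop the shell from
# touching the braces, and printf '%s' prints its argument literally.
drawtext_template() {
    printf '%s' 'F: %{n} %{pts:flt} %{pts:hms}'
}

# Usage: write it to the file that drawtext's textfile= option points at.
# drawtext_template > text
drawtext_template
```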

Having that allowed me to single-frame step through the output to make sure that the edits were frame-accurate. The easiest tool for that proofing seemed to be "mpv", as:

    # Proofing a file frame-by-frame
    mpv --osd-fractions --pause \
    --script-opts 'osc-visibility=always,osc-timems=yes' \

I would have liked to use ffplay, as it would make it a lot easier to test filtergraph expressions, but I couldn't find an equivalent of --pause, or on-screen display of the times. Or single frame advance.

After some messing around, this filtergraph seemed to be sufficient to address the first three glitches in my sample, and give me a pattern for improvement:

    # Apply trim/loop/setpts -> concat filtergraph
    ffmpeg -v info \
           -i marked.mkv \
           -filter_complex \ "
    [a][b][c][Z]concat=4" \
    -y p4.mkv

I was a little puzzled why it was necessary to apply the setpts filter to each trimmed component, rather than to the result of the concat filter. Maybe I don't understand how the setpts filter works, but I'm puzzled why concat needs consistent PTS metadata on input?
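
As far as I can tell from the concat filter documentation, every segment is expected to start at timestamp 0, and trim leaves the original timestamps in place. The sketch below is a simplified model of that behavior and an assumption on my part, not ffmpeg's actual code: it shifts each segment by the accumulated end time of the segments before it, which shows how a segment that still starts at 0.4 lands too late.

```shell
# Simplified model (assumption): concat offsets each segment by the summed
# end times of the previous segments.
# Input: one line per segment, "first_pts end_pts" in seconds.
concat_model() {
    awk '{ printf "segment %d plays from %.3f to %.3f\n", NR, offset + $1, offset + $2
           offset += $2 }'
}

# Two trimmed segments with their original timestamps left untouched:
printf '0 0.4\n0.4 0.567\n' | concat_model
# The same segments after setpts=PTS-STARTPTS, each restarting at 0:
printf '0 0.4\n0 0.167\n' | concat_model
```

In this model the untouched second segment is placed at 0.8 instead of 0.4, leaving a gap, while the reset one follows on directly.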

Anyhow, the result was I scripted this, probably not in the best way:

    # Produce filtergraph from the blackdetect output (flat format)
    < detect.txt sed 's/[="]/ /g' | awk '
    BEGIN{ n=0;};
    /black_start/{ st[n]=$6; sf[n]=$3 };
    /black_end/  { et[n]=$6; ef[n]=$3; n++ };
    END {
      for (i=0; i<n; i++) {
        # keep the good footage between the previous glitch and this one
        if (i==0) { trimin=0 } else { trimin=et[i-1] }
        # trim ends where the black begins
        trimout=st[i]
        printf "[0:v]trim=%s:%s", trimin, trimout;
        # hold the last good frame for the length of the glitch
        looplen = ef[i]-sf[i];
        loopin  = (trimout-trimin)*30-1
        # ugh rounding.
        printf ",loop=%d:1:%d", looplen, (loopin+0.5);
        printf ",setpts=PTS-STARTPTS"
        printf "[n%d];", i
        printf "\n";
      }
      printf "[0:v]trim=%s,setpts=PTS-STARTPTS[Z];\n", et[n-1]
      for (i=0; i<n; i++) {
         printf "[n%d]", i;
      }
      printf "[Z]concat=%d", n+1
      # catchatall
    }' > filter
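
On the rounding: the script multiplies by 30, but the source is reported as 29.97 fps, which is conventionally 30000/1001. A sketch of the conversion at that rate follows, assuming the stream really is 30000/1001; the function name is made up. For short segments the two rates agree, but the gap grows with segment length.

```shell
# Convert a duration in seconds to a whole number of frames, rounding to
# nearest, at an assumed rate of 30000/1001 fps.
seconds_to_frames() {
    awk -v t="$1" 'BEGIN { printf "%d\n", t * 30000 / 1001 + 0.5 }'
}

seconds_to_frames 0.3   # 9, the same as 0.3 * 30
seconds_to_frames 60    # 1798, where a flat 30 fps would give 1800
```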

And that gave me a 6,690-input concat filter. Of course that's a problem:

    line 1: /usr/local/bin/ffmpeg: Argument list too long
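
That limit comes from the operating system rather than from ffmpeg. A rough way to check whether a graph file could be passed on the command line at all is sketched below; fits_in_argv is a hypothetical helper. ffmpeg of this vintage also has a -filter_complex_script option that reads the graph from a file, which should sidestep the limit, though I have not tried it on a graph this large.

```shell
# Succeed if the file's size is below the kernel's argument-size limit.
# Rough check only: the environment and the other arguments count too.
fits_in_argv() {
    size=$(( $(wc -c < "$1") ))
    limit=$(getconf ARG_MAX)
    [ "$size" -lt "$limit" ]
}

# Hypothetical usage; file names are placeholders:
# if fits_in_argv filter; then
#     ffmpeg -i "$IN" -filter_complex "$(cat filter)" -y out.mkv
# else
#     ffmpeg -i "$IN" -filter_complex_script filter -y out.mkv
# fi
```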

But fair enough. I split it into three chunks and ran the first chunk,
and, only a little surprisingly, performance is ABYSMAL:

    # Applying thousands of edits
    ffmpeg -v info \
           -i $IN \
           -filter_complex "$(cat filter)" \
    -y br6-t1.mkv 
    ffmpeg version 4.1.2 Copyright (c) 2000-2019 the FFmpeg developers
      built with Apple LLVM version 9.0.0 (clang-900.0.39.2)
      configuration: --prefix=/usr/local/Cellar/ffmpeg/4.1.2 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags='-I/Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home/include/darwin' --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librubberband --enable-libsnappy --enable-libtesseract --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-videotoolbox --disable-libjack --disable-indev=jack --enable-libaom --enable-libsoxr
      libavutil      56. 22.100 / 56. 22.100
      libavcodec     58. 35.100 / 58. 35.100
      libavformat    58. 20.100 / 58. 20.100
      libavdevice    58.  5.100 / 58.  5.100
      libavfilter     7. 40.101 /  7. 40.101
      libavresample   4.  0.  0 /  4.  0.  0
      libswscale      5.  3.100 /  5.  3.100
      libswresample   3.  3.100 /  3.  3.100
      libpostproc    55.  3.100 / 55.  3.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'br6fmo56_ja85a1r2_h264_2328K.mp4':
        major_brand     : isom
        minor_version   : 1
        compatible_brands: isomavc1mp42
        creation_time   : 2019-04-08T15:51:06.000000Z
      Duration: 02:29:45.63, start: 0.000000, bitrate: 2330 kb/s
        Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 2199 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
          creation_time   : 2019-04-08T15:29:31.000000Z
        Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 127 kb/s (default)
          creation_time   : 2019-04-08T15:19:42.000000Z
    Stream mapping:
      Stream #0:0 (h264) -> trim (graph 0)
      Stream #0:0 (h264) -> trim (graph 0)
      Stream #0:0 (h264) -> trim (graph 0)
      concat (graph 0) -> Stream #0:0 (libx264)
      Stream #0:1 -> #0:1 (aac (native) -> vorbis (libvorbis))
    Press [q] to stop, [?] for help
    [libx264 @ 0x7fb7bc744200] using SAR=1/1
    [libx264 @ 0x7fb7bc744200] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    [libx264 @ 0x7fb7bc744200] profile High, level 3.1
    [libx264 @ 0x7fb7bc744200] 264 - core 155 r2917 0a84d98 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
    Output #0, matroska, to 'br6-t1.mkv': 0kB time=-577014:32:22.77 bitrate=  -0.0kbits/s speed=N/A    
        major_brand     : isom
        minor_version   : 1
        compatible_brands: isomavc1mp42
        encoder         : Lavf58.20.100
        Stream #0:0: Video: h264 (libx264) (H264 / 0x34363248), yuv420p(progressive), 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 29.97 fps, 1k tbn, 29.97 tbc (default)
          encoder         : Lavc58.35.100 libx264
        Side data:
          cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
        Stream #0:1(und): Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 44100 Hz, stereo, fltp (default)
          creation_time   : 2019-04-08T15:19:42.000000Z
          encoder         : Lavc58.35.100 libvorbis
    [matroska @ 0x7fb7bc742a00] Starting new cluster due to timestampte= 897.2kbits/s speed=0.0663x    
    [matroska @ 0x7fb7bc742a00] Starting new cluster due to timestampte= 903.0kbits/s speed=0.0662x    
        Last message repeated 2 times
    [matroska @ 0x7fb7bc742a00] Starting new cluster due to timestampte= 449.9kbits/s speed=0.132x     
    [matroska @ 0x7fb7bc742a00] Starting new cluster due to timestampte= 450.2kbits/s speed=0.132x    
    frame=13337 fps=1.1 q=29.0 size=   50354kB time=00:14:47.32 bitrate= 464.9kbits/s speed=0.0758x    

(Yes, the complete uncut console output is missing. 3000 duplicate stream mapping lines aren't appropriate for the mailing list.) That's after running for a few hours: the speed is 0.0758x, and it is nowhere near finishing the 2.5-hour video.

In comparison, running on the 10 second test with 4 trims, it was:

    frame=  150 fps= 62 q=-1.0 Lsize=     498kB time=00:00:10.37 bitrate= 392.7kbits/s speed= 4.3x    

QUESTION: What's the right way to do this efficiently? Presumably there's a point where the concat filter is more efficient than the concat demuxer, but presumably it is far lower than 3,000 entries. Presumably the right answer involves writing segments out to files?
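
If segments were written to files, the concat demuxer would need a list file naming them in order. A minimal sketch of generating one follows; the seg0000.mkv naming scheme and the function name are hypothetical, and this says nothing about how the segments themselves would be produced.

```shell
# Emit a concat-demuxer list for N segment files named
# seg0000.mkv, seg0001.mkv, ... (hypothetical naming scheme).
make_concat_list() {
    i=0
    while [ "$i" -lt "$1" ]; do
        printf "file 'seg%04d.mkv'\n" "$i"
        i=$((i + 1))
    done
}

# Hypothetical usage:
# make_concat_list 3000 > list.txt
# ffmpeg -f concat -safe 0 -i list.txt -c copy joined.mkv
make_concat_list 2
```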

It also occurred to me that there are snazzier ways to fix this than just frame-holding the last frame.
Maybe split out the two video streams and let one continue unchanged while the other is fixed.
Perhaps recompose them in a better way.
Perhaps do some kind of blend or transition across the glitch rather than a frame hold.
But I figured I'd worry about those after I got the baseline fix working.

Thanks for any suggestions on how to do this better.

jhawk at alum.mit.edu
John Hawkinson
