[FFmpeg-devel] [RFC] FFmpeg libavcodec/crystalhd.c: Optimize for reduced latency

thomas schorpp thomas.schorpp at gmail.com
Wed Feb 6 17:12:40 CET 2013


On 04.02.2013 16:31, thomas schorpp wrote:
> On 03.02.2013 03:54, thomas schorpp wrote:
>> On 03.02.2013 02:59, Philip Langdale wrote:
>>> On Sat, 02 Feb 2013 16:18:22 +0100
>>> thomas schorpp <thomas.schorpp at gmail.com> wrote:
>>>>
>>>> I've got it sync'd to the driver and a stable pipeline control loop,
>>>> the TX/RX- ringbuffers balanced with faster and more precise sync
>>>> locking for now with patch #03:
>>>>
>>>> <SNIP>
>>>>
>>>> Why is a pixel format converter called "scaler" in FFmpeg?
>>>>
>>>> But I don't see the cause from the stream input in the stereo3d
>>>> filter source code hitting this switch() case: [stereo3d] stereo
>>>> format of input is not supported. I try to explicitly specify pixel
>>>> format and scaler params.
>>>>
>>>> Comments on this crystalhd.c development patch?
>>>
>>> The pixel format conversion is done by the scaler because it has to
>>> support that anyway, so even when doing a 1:1 conversion it gets used;
>>> that's just an ffmpeg implementation detail.
>>>
>>> So, yeah - it certainly seems functional - with the big caveat that if
>>> you try and seek in the video with mplayer, it will overflow the input
>>> buffer and grind to a halt. The old code doesn't do that. It's fine
>>> for a transcode run, but not good for interactive playback.
>>
>> Yes, first optimization target is transcoding, noted the mplayer seek break, too, and
>> does not even work as good as your design for transcoding,
>> every random minutes distorted picture and audio out of sync after 20min, fsck.
>>
>> I've trimmed some more and added an extra wait sleep to stop buffer overruns,
>
> Bullshit, revert. Makes it worse. Useless to throttle decoded output with  throttled polling
> of the input buffer to become free with I/O in the same thread.
> Raised decode_wait up to 1s in a long transcoding run.

Fixed by reverting the "&& decoder_status.ReadyListCount > 0" checks on the DtsProcInput() and receive_frame() calls.

MPlayer seeking works again, and the transcoded and filtered streams are clean and in A/V sync again.

I think the added trace log messages are useful. I changed the levels of some log messages to match the states
they report, and moved the BCM decoder picture details to AV_LOG_DEBUG, which I think matches their purpose.

The new "timer" constants (BCM spec'd 16 ms opt. in *.h) and "precision" seem to bring a ~10% performance
increase for transcoding, if the fps counter is right.
It also looks like my code keeps the pipeline somewhat shorter when transcoding, but no longer really balanced:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/mnt/data/test.1080p.3D.HSBS.x264.mp4':
   Metadata:
     major_brand     : isom
     minor_version   : 1
     compatible_brands: isomavc1
     creation_time   : 2012-10-02 20:24:46
   Duration: 02:03:45.77, start: 0.000000, bitrate: 2199 kb/s
     Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 2105 kb/s, 23.98 fps, 23.98 tbr, 96k tbn, 47.95 tbc
     Metadata:
       creation_time   : 2012-10-02 20:24:46
     Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, s16, 91 kb/s
     Metadata:
       creation_time   : 2012-10-02 20:25:22
       handler_name    : GPAC ISO Audio Handler
[Parsed_mp_0 @ 0x1c782e0] 'stereo3d' is a wrapped MPlayer filter (libmpcodecs). This filter may be removed
once it has been ported to a native libavfilter.
[h264_crystalhd @ 0x1c6a420] CrystalHD Init for h264_crystalhd
[h264_crystalhd @ 0x1c6a420] CrystalHD: starting up
Running DIL (3.22.0) Version
DtsDeviceOpen: Opening HW in mode 0
Clock set to 180
Enable single threaded mode
Setting Color Mode to 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: Init complete.
Output #0, mp4, to '/mnt/data/test.1080p.3D.agmd.sbs2l.mp4':
   Metadata:
     major_brand     : isom
     minor_version   : 1
     compatible_brands: isomavc1
     encoder         : Lavf54.29.104
     Stream #0:0(und): Video: mpeg4 ( [0][0][0] / 0x0020), yuv420p, 960x1080 [SAR 1:1 DAR 8:9], q=2-31, 200 kb/s, 24k tbn, 23.98 tbc
     Metadata:
       creation_time   : 2012-10-02 20:24:46
     Stream #0:1(eng): Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, 91 kb/s
     Metadata:
       creation_time   : 2012-10-02 20:25:22
       handler_name    : GPAC ISO Audio Handler
Stream mapping:
   Stream #0:0 -> #0:0 (h264_crystalhd -> mpeg4)
   Stream #0:1 -> #0:1 (copy)

Press [q] to stop, [?] for help
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16050 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16100 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16150 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16200 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16250 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16300 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16350 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16400 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16450 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16500 us  FreeListCount 16 ReadyListCount: 0
[h264_crystalhd @ 0x1c6a420] No frames ready. Current delay: 16550 us  FreeListCount 16 ReadyListCount: 0
Input stream #0:0 frame changed from size:1920x1080 fmt:yuv420p to size:1920x1080 fmt:yuyv422
[Parsed_mp_0 @ 0x21aee00] 'stereo3d' is a wrapped MPlayer filter (libmpcodecs). This filter may be removed
once it has been ported to a native libavfilter.
CrystalHD: Picture Number discontinuity. Current delay: 16550 us e= 335.6kbits/s dup=3 drop=0
[h264_crystalhd @ 0x1c6a420] CrystalHD: Picture Number discontinuity. Current delay: 16550 us
CrystalHD: decode_frame.0 size=    3555kB time=00:00:22.54 bitrate=1291.5kbits/s dup=31 drop=24

[h264_crystalhd @ 0x1c6a420] CrystalHD: decode_frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: parser picture type 3
[h264_crystalhd @ 0x1c6a420] input "pts": 2380400000
[h264_crystalhd @ 0x1c6a420] Frames Ready. Current delay: 16550 us  FreeListCount 10 ReadyListCount: 4
[h264_crystalhd @ 0x1c6a420] CrystalHD: RX loop. Current delay: 16550 us  FreeListCount 10 ReadyListCount: 4
[h264_crystalhd @ 0x1c6a420] 	Frames to Drop: 0
[h264_crystalhd @ 0x1c6a420] output "pts": 2379900000
[h264_crystalhd @ 0x1c6a420] output picture type 3
[h264_crystalhd @ 0x1c6a420] Interlaced state: 0 | trust_interlaced 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: Copying out frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: Pipeline length (has_b_frames): 13
[h264_crystalhd @ 0x1c6a420] CrystalHD: Decoded OK. Current delay: 16550 us  FreeListCount 13 ReadyListCount: 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: decode_frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: parser picture type 3
[h264_crystalhd @ 0x1c6a420] input "pts": 2380500000
[h264_crystalhd @ 0x1c6a420] Frames Ready. Current delay: 16550 us  FreeListCount 10 ReadyListCount: 4
[h264_crystalhd @ 0x1c6a420] CrystalHD: RX loop. Current delay: 16550 us  FreeListCount 10 ReadyListCount: 4
[h264_crystalhd @ 0x1c6a420] 	Frames to Drop: 0
[h264_crystalhd @ 0x1c6a420] output "pts": 2379700000
[h264_crystalhd @ 0x1c6a420] output picture type 3
[h264_crystalhd @ 0x1c6a420] Interlaced state: 0 | trust_interlaced 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: Copying out frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: Pipeline length (has_b_frames): 13
[h264_crystalhd @ 0x1c6a420] CrystalHD: Decoded OK. Current delay: 16550 us  FreeListCount 11 ReadyListCount: 3
[h264_crystalhd @ 0x1c6a420] CrystalHD: decode_frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: parser picture type 3
[h264_crystalhd @ 0x1c6a420] input "pts": 2380600000
[h264_crystalhd @ 0x1c6a420] Frames Ready. Current delay: 16550 us  FreeListCount 8 ReadyListCount: 6
[h264_crystalhd @ 0x1c6a420] CrystalHD: RX loop. Current delay: 16550 us  FreeListCount 8 ReadyListCount: 6
[h264_crystalhd @ 0x1c6a420] 	Frames to Drop: 0
[h264_crystalhd @ 0x1c6a420] output "pts": 2380000000
[h264_crystalhd @ 0x1c6a420] output picture type 3
[h264_crystalhd @ 0x1c6a420] Interlaced state: 0 | trust_interlaced 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: Copying out frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: Pipeline length (has_b_frames): 13
[h264_crystalhd @ 0x1c6a420] CrystalHD: Decoded OK. Current delay: 16550 us  FreeListCount 10 ReadyListCount: 4
[h264_crystalhd @ 0x1c6a420] CrystalHD: decode_frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: parser picture type 3
[h264_crystalhd @ 0x1c6a420] input "pts": 2380700000
[h264_crystalhd @ 0x1c6a420] Frames Ready. Current delay: 16550 us  FreeListCount 7 ReadyListCount: 7
[h264_crystalhd @ 0x1c6a420] CrystalHD: RX loop. Current delay: 16550 us  FreeListCount 7 ReadyListCount: 7
[h264_crystalhd @ 0x1c6a420] 	Frames to Drop: 0
[h264_crystalhd @ 0x1c6a420] output "pts": 2380100000
[h264_crystalhd @ 0x1c6a420] output picture type 3
[h264_crystalhd @ 0x1c6a420] Interlaced state: 0 | trust_interlaced 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: Copying out frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: Pipeline length (has_b_frames): 13
[h264_crystalhd @ 0x1c6a420] CrystalHD: Decoded OK. Current delay: 16550 us  FreeListCount 10 ReadyListCount: 4
[h264_crystalhd @ 0x1c6a420] CrystalHD: decode_frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: parser picture type 3
[h264_crystalhd @ 0x1c6a420] input "pts": 2380800000
[h264_crystalhd @ 0x1c6a420] Frames Ready. Current delay: 16550 us  FreeListCount 7 ReadyListCount: 7
[h264_crystalhd @ 0x1c6a420] CrystalHD: RX loop. Current delay: 16550 us  FreeListCount 7 ReadyListCount: 7
[h264_crystalhd @ 0x1c6a420] 	Frames to Drop: 0
[h264_crystalhd @ 0x1c6a420] output "pts": 2379600000
[h264_crystalhd @ 0x1c6a420] output picture type 3
[h264_crystalhd @ 0x1c6a420] Interlaced state: 0 | trust_interlaced 1
[h264_crystalhd @ 0x1c6a420] CrystalHD: Copying out frame
[h264_crystalhd @ 0x1c6a420] CrystalHD: Pipeline length (has_b_frames): 13
[h264_crystalhd @ 0x1c6a420] CrystalHD: Decoded OK. Current delay: 16550 us  FreeListCount 13 ReadyListCount: 1
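
To judge how balanced the pipeline is, the two list counters can be pulled out of the log lines and compared. In the excerpt above, FreeListCount + ReadyListCount is 16 while no frames are ready and 14 on every line after that. The following is only a minimal sketch of such a log check; parse_counts() is a hypothetical helper written for this illustration and is not part of crystalhd.c or of the attached patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from crystalhd.c): extract both list counters
 * from one log line in the format shown above.
 * Returns 1 if both counters were found, 0 otherwise.
 */
int parse_counts(const char *line, int *free_cnt, int *ready_cnt)
{
    const char *f = strstr(line, "FreeListCount");
    const char *r = strstr(line, "ReadyListCount:");

    if (!f || !r)
        return 0;
    if (sscanf(f, "FreeListCount %d", free_cnt) != 1)
        return 0;
    if (sscanf(r, "ReadyListCount: %d", ready_cnt) != 1)
        return 0;
    return 1;
}

/* Print the counters and their sum for one line; skip lines without counters. */
void report_line(const char *line)
{
    int free_cnt, ready_cnt;

    if (parse_counts(line, &free_cnt, &ready_cnt))
        printf("free=%d ready=%d sum=%d\n",
               free_cnt, ready_cnt, free_cnt + ready_cnt);
}
```

Feeding each log line to report_line() gives a quick view of how the ready list fills and drains between decode_frame calls; lines without counters, such as "CrystalHD: Copying out frame", are skipped.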

priv->decode_wait has held stable at 16550 µs for 30 min of transcoding now, with no more "No frames ready." or
"Picture Number discontinuity." states so far, though that may depend on how well the input stream was encoded.
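
The way the delay settles can be read off the log above: each "No frames ready." line shows it 50 µs higher than the last (16050, 16100, ... 16550), and it stays put once frames are ready. A minimal sketch of that behaviour follows. It is an illustration reconstructed from the log, not the code in the patch: the names WaitState, wait_init() and wait_update() are hypothetical, and the 16000 µs base and 50 µs step are assumptions taken from the 16 ms figure and the log increments.

```c
#include <stdint.h>

/* Assumed values, read off the log rather than taken from the patch. */
#define BASE_WAIT_US 16000  /* 16 ms starting delay */
#define WAIT_STEP_US 50     /* growth per empty poll */

/* Hypothetical stand-in for the decode_wait field of the private context. */
typedef struct {
    uint32_t decode_wait_us;
} WaitState;

void wait_init(WaitState *s)
{
    s->decode_wait_us = BASE_WAIT_US;
}

/*
 * Call after each status poll. The delay grows only when the poll found
 * no ready frames and is left unchanged otherwise.
 * Returns the delay to use before the next poll.
 */
uint32_t wait_update(WaitState *s, int ready_list_count)
{
    if (ready_list_count == 0)
        s->decode_wait_us += WAIT_STEP_US;
    return s->decode_wait_us;
}
```

Under these assumptions, eleven empty polls from the base value give 16550 µs, which matches the eleven "No frames ready." lines in the log.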

Please test and confirm. Can anyone test whether this crystalhd decoder works with Bino now?
(A patch to force the h264_crystalhd codec is in the Bino bug tracker; without it, Bino uses the FFmpeg H.264 software decoder.)
It did not work here with Philip's timing: only still pictures on seeking, plus Bino decoder timeout messages.
It always crashes in assertions for XCB(?) on GUI actions in my Debian stable/bpo environment.

I'm now investigating Broadcom's libcrystalhd3 for possible optimizations, but FFmpeg already reaches 30 fps
using
ffmpeg -c:v h264_crystalhd -i test.1080p.x264.mp4 -an -f rawvideo -y /dev/null
This should be the maximum hardware performance anyway.

y
tom

Patch v0.6 against Debian ffmpeg-dmo-1.0.1 is attached. Sorry, git HEAD breaks too many apps on Debian stable/testing(?),
and I cannot submit untested patches.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: crystalhd-latencyopt-experimental.06.schorpp.patch
Type: text/x-diff
Size: 13788 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20130206/5c9806ba/attachment.bin>
