[FFmpeg-devel] [PATCH] lavc/hevc_cabac: Optimise ff_hevc_hls_residual_coding (especially ARM)
jc at kynesim.co.uk
Tue Jan 19 13:46:13 CET 2016
I've just done a fair bit of work on hevc_cabac decode for the Raspberry
Pi2 and I think that the patch is generally applicable. Patch is
attached but you may prefer to take it from git:
On the Pi2, playing a 10Mbit 1080p H.265 clip (a bit of The Hobbit), it
reduces the time in ff_hevc_hls_residual_coding (until transform) from
~26Gcycles to ~18Gcycles and it almost halves the time spent in the
"core" bit of the function (from decoding the greater1 bits to the end
of decode). This was measured using the CPU cycle counter. Tests done
at Raspberry Pi suggest that on their ffmpeg branch it reduces overall
CPU loading by ~10% whilst playing H.265. I haven't profiled it on any
other platform - but I would expect useful improvements on most streams
on most platforms.
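
For a rough idea of how a region such as the "core" part of the function can be bracketed and its cost accumulated, here is a minimal sketch. It is not the instrumentation used for the figures above: it uses the standard C clock() as a portable stand-in for the hardware cycle counter, and the names SketchProfile, sketch_begin, sketch_end and sketch_reduction_percent are hypothetical.

```c
#include <time.h>

/* Hypothetical accumulator for one profiled region (not part of FFmpeg). */
typedef struct SketchProfile {
    clock_t  total;   /* accumulated CPU time, in clock() ticks */
    clock_t  start;   /* tick value at the most recent sketch_begin() */
    unsigned calls;   /* number of completed begin/end pairs */
} SketchProfile;

/* Mark the start of the region being measured. */
static inline void sketch_begin(SketchProfile *p)
{
    p->start = clock();   /* assumption: clock() stands in for a cycle counter */
}

/* Mark the end of the region and add the elapsed ticks to the total. */
static inline void sketch_end(SketchProfile *p)
{
    p->total += clock() - p->start;
    p->calls++;
}

/* Relative saving between two measurements, as a percentage of "before". */
static inline double sketch_reduction_percent(double before, double after)
{
    return 100.0 * (before - after) / before;
}
```

Calling sketch_begin() and sketch_end() around the code of interest on every invocation, then comparing totals between the old and new builds, gives the kind of comparison quoted above. With the quoted figures, sketch_reduction_percent(26.0, 18.0) comes to roughly 31%.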
I have not yet run fate over it as I haven't yet finished downloading
the samples (the internet connection here isn't wildly fast), but I have
run it against the H265.1 conformance streams on both x86 and ARM and it
causes no regressions.
Known unknowns / possible issues:
1) I haven't tested it on anything with 64-bit ints (I don't have an
appropriate m/c) - whilst I've coded in a manner that should hopefully
be OK there I can see that there might be issues.
2) Only tested on gcc 4.8 and later (5.1 & 5.3). I've used an anonymous
union to avoid changing other cabac code - I could believe this was a
no-no and I'll have to change that.
3) Uses clz, which doesn't seem to exist in the ffmpeg int libs, though
I'll happily accept suggestions as to what is considered better practice
for these points.
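
To make points 2 and 3 concrete, here is a minimal sketch of the two constructs in question. It is not taken from the patch: SketchCabac is a hypothetical stand-in rather than FFmpeg's actual CABAC context, and sketch_clz32 is a hypothetical helper. Anonymous unions are standard in C11 and a GNU extension in earlier modes, which is why compiler coverage matters; __builtin_clz is a gcc/clang builtin, so the sketch falls back to a plain loop elsewhere.

```c
#include <stdint.h>

/* Hypothetical context struct; NOT FFmpeg's CABACContext. */
typedef struct SketchCabac {
    union {                   /* anonymous union: C11, GNU extension before that */
        uint32_t low;         /* existing code keeps using ctx->low unchanged */
        uint32_t low_alias;   /* new code can name the same storage differently */
    };
    uint32_t range;
} SketchCabac;

/* Count leading zero bits of a 32-bit value; returns 32 when x is 0. */
static inline int sketch_clz32(uint32_t x)
{
    if (x == 0)
        return 32;            /* __builtin_clz(0) is undefined, so handle it here */
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_clz(x);  /* operates on unsigned int, assumed 32 bits wide */
#else
    int n = 0;
    while (!(x & 0x80000000u)) {  /* shift left until the top bit is set */
        x <<= 1;
        n++;
    }
    return n;
#endif
}
```

Because both union members have the same type, a value written through one name reads back unchanged through the other. For nonzero 32-bit x, the leading-zero count equals 31 minus floor(log2(x)), so an existing integer log2 helper could serve in place of a dedicated clz.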
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 89120 bytes
Desc: not available