[FFmpeg-devel] hardware aided video decoding

Michael Niedermayer michaelni
Fri Jul 6 11:40:50 CEST 2007


On Fri, Jul 06, 2007 at 09:45:26AM +0200, Attila Kinali wrote:
> Moin,
> In view of the slowly progressing open graphics project[1]
> I wanted to ask how the people here would accelerate video
> decoding with help of the graphics card. Respectively,
> what functions would you expect the graphics card to have
> to help FFmpeg/lavc to decode video.
> The system should mostly work with MPEG-1/2/4, H.264 and VC-1,
> but should be generic enough to be able to support future
> codec systems.

implement the whole decoder on the card :)
for mpeg1/2/4, bitstream parsing (vlc decoding and related stuff) took
about 1/3 of the cpu time last time i checked, so the gains from keeping
that on the CPU and moving the rest to the card would be limited. also h.264
has significantly more complex bitstream parsing, so i would guess
the gains are even smaller there. but instead of guessing i would suggest
that you try a profiler to get some exact answers (don't forget to
disable all inlining ...)
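
to put a number on "limited": a minimal sketch of the usual Amdahl's law
estimate, assuming the parsing share stays on the CPU and everything else
is sped up by some factor on the card. the function name and both
parameters are made up for illustration, and the 1/3 is only the rough
figure from above, not a fresh measurement.

```c
#include <assert.h>

/* Amdahl's law estimate (hypothetical helper, not part of lavc):
 * cpu_fraction = share of decode time that stays on the CPU (e.g. parsing)
 * card_speedup = assumed speedup of everything moved to the card
 * returns the overall speedup of the whole decode. */
double overall_speedup(double cpu_fraction, double card_speedup)
{
    assert(cpu_fraction > 0.0 && cpu_fraction <= 1.0);
    assert(card_speedup > 0.0);
    return 1.0 / (cpu_fraction + (1.0 - cpu_fraction) / card_speedup);
}
```

with cpu_fraction = 1/3 the result approaches but never exceeds 3, no
matter how fast the card is.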

now if we look at just mpeg1/2/4 and the case that you don't want
to implement the whole decoder on the card ...
then the most obvious things to do are:

do the RLE + zigzag/alt scan decoding of coeffs and the IDCT on the card.
if you do just the IDCT on the card then you have to transfer 3+ times
the data from the cpu to the card, as IDCT coeffs are 16bit and there
are as many of them as pixels. if you do the RLE & zigzag stuff on the card
too then significantly less data has to be transmitted, as 95% or
so of the coeffs are 0 and the coeffs are stored in the bitstream as vlc coded
zero run + sign + level + last_bit
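
a minimal sketch of what the card would have to do with such symbols,
assuming the CPU has already done the vlc decoding and hands over
(run, level, last) triples. the RunLevel struct and expand_block() are
hypothetical names for illustration, not lavc API; only the zigzag table
is the standard 8x8 scan. an alternate scan would just be another table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standard 8x8 zigzag scan: scan position -> raster index. */
static const uint8_t zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Hypothetical layout of one symbol after vlc decoding. */
typedef struct {
    uint8_t run;    /* number of zero coeffs before this one */
    int16_t level;  /* signed nonzero coefficient value */
    uint8_t last;   /* 1 if this is the last coeff of the block */
} RunLevel;

/* Expand run/level symbols into a raster-order 8x8 coefficient block.
 * Returns the number of symbols consumed, or -1 if the symbols run past
 * the end of the block or no symbol has the last flag set. */
int expand_block(const RunLevel *sym, int nsym, int16_t block[64])
{
    int pos = 0;

    assert(sym && block);
    memset(block, 0, 64 * sizeof(block[0]));
    for (int i = 0; i < nsym; i++) {
        pos += sym[i].run;          /* skip the zero run */
        if (pos > 63)
            return -1;
        block[zigzag[pos]] = sym[i].level;
        pos++;
        if (sym[i].last)
            return i + 1;
    }
    return -1;
}
```

the point is that only the few nonzero symbols cross the bus, and the
64 16bit coeffs per block are produced on the card side.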

the next obvious step is to do the motion compensation on the card too
for mpeg1/2 and simple profile mpeg4 this should be easy
mpeg4 ASP adds gmc/qpel which is much more complex
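
to show why the mpeg1/2 case is easy, here is a minimal sketch of half-pel
MC for one block using plain bilinear averaging with upward rounding.
mc_halfpel() and its parameters are made up for illustration, it is not
how lavc structures this, and it leaves out edge handling, chroma and
the mpeg4 rounding control.

```c
#include <assert.h>
#include <stdint.h>

/* Half-pel motion compensation for one w x h block at (x, y).
 * dst and ref point to the top-left of the destination and reference
 * frames. mvx/mvy are in half-pel units. The caller must guarantee that
 * the referenced area, plus one extra row and column for interpolation,
 * lies inside the reference frame.
 * Assumes >> on negative ints is an arithmetic shift (true on the usual
 * compilers, but implementation-defined in C). */
void mc_halfpel(uint8_t *dst, int dst_stride,
                const uint8_t *ref, int ref_stride,
                int x, int y, int mvx, int mvy, int w, int h)
{
    int hx = mvx & 1;   /* 1 if horizontal half-pel position */
    int hy = mvy & 1;   /* 1 if vertical half-pel position */
    const uint8_t *src;

    assert(dst && ref && w > 0 && h > 0);
    src = ref + (y + (mvy >> 1)) * ref_stride + (x + (mvx >> 1));
    dst += y * dst_stride + x;

    for (int j = 0; j < h; j++) {
        for (int i = 0; i < w; i++) {
            const uint8_t *p = src + j * ref_stride + i;
            int a = p[0];
            int b = p[hx];
            int c = p[hy * ref_stride];
            int d = p[hy * ref_stride + hx];
            /* full-pel: a; one half-pel axis: (a+b+1)>>1;
             * both axes: (a+b+c+d+2)>>2. all three fall out of this. */
            dst[j * dst_stride + i] = (uint8_t)((a + b + c + d + 2) >> 2);
        }
    }
}
```

gmc and qpel need longer filters and per-pixel warping, which is where
the extra complexity comes from.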

note! if you do not do MC on the card and the result of IDCT +
user provided MC frame ends up in video memory, then the CPU doing MC
of the next frame has to read from video mem somehow

now h.264 does not contain anything shareable with mpeg1/2/4,
both idct and MC are different

also for h.264 doing just IDCT is likely not going to work, that is
having intra prediction done on the cpu, which needs to read from the
previous 4x4 IDCT result, is just going to be a nightmare
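
a minimal sketch of that dependency, using only the vertical 4x4
prediction mode and treating the IDCT output as a given residual array.
recon_4x4_vertical() is a made-up name for illustration, and real h.264
has more 4x4 modes plus availability rules that are ignored here.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Reconstruct one 4x4 block at (x, y) with vertical intra prediction:
 * each column is predicted from the already reconstructed pixel in the
 * row directly above the block, then the IDCT residual is added.
 * Requires y >= 1 so that the row above exists. */
void recon_4x4_vertical(uint8_t *frame, int stride, int x, int y,
                        const int16_t residual[16])
{
    const uint8_t *top;

    assert(frame && residual && y >= 1);
    top = frame + (y - 1) * stride + x;
    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 4; i++)
            frame[(y + j) * stride + x + i] =
                clip_u8(top[i] + residual[j * 4 + i]);
}
```

the block below can only be predicted once this block is fully
reconstructed (prediction + residual), so if the IDCT lives on the card
and prediction on the cpu, every 4x4 block means a round trip.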

and idct+mc+intra prediction would still require the cpu to read and write the
whole frame to apply the loop filter ...

for VC1 ask kostya ...

> Yes, i had a look at the few hardware h.264 decoders around,
> but those seem all to be built around a CPU or DSP core with
> a few additional special instructions needed for decoding.

yes, put a CPU and DSP on the card, that should do too :)
and don't forget adding special instructions for CABAC and MC

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope