[FFmpeg-devel] GPU Hardware Acceleration [was Re: openCL support]
pshirkey at boosthardware.com
Wed Mar 14 03:23:39 CET 2012
Maybe it would help if I rephrased the question.
"Are there any parts of FFMPEG that would benefit from Hardware
Acceleration that do not already have development in process?"
Either CUDA or OpenCL.
Put aside the issue of development time / maintenance effort for now. I'm
just trying to get an idea of what can be done to leverage the GPU to make
FFMPEG an even more powerful tool.
If people can throw a few ideas into the mix then we can later discuss
what would be required to make them viable in terms of time and ongoing
maintenance.
We already have these features:
GPU decoding support
Some progress has been made to port the libx264 lookahead thread to the GPU
From the previous thread we have these important points raised:
The challenge will be redesigning all the old algorithms that are
serial by design to work in parallel. Basic things like an efficient
generic sort are really challenging.
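To illustrate why sorting needs a redesign, here is a minimal CPU-side sketch of a bitonic sorting network, one well-known data-parallel sort. It is written in plain Python rather than CUDA or OpenCL, the function name is my own, and it assumes the input length is a power of two. Nothing here is taken from FFmpeg.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network."""
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(values)
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # distance between the two elements compared
            # Each index i touches only itself and its partner, so on a GPU
            # every i in this loop could be handled by its own thread.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

The compare-and-swap pattern is fixed in advance and does not depend on the data, which is what makes it map onto many threads, at the cost of doing more comparisons than a serial quicksort.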
It must be possible using a separate kernel module to do something like
preloading data to the card.
Accessing the gpu memory directly is very slow, because there's no
possible synchronisation with the cpu caches. (Meaning they are disabled
during such a memory access, which really hurts.)
The only usable path I can see is to offload some heavy part of the code
to the gpu. Also, the gpu still runs in parallel with the cpu. Nothing prevents
using the cpu for bitstream parsing and the gpu for block matching, for example.
Probably an advantage for HD processing.
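As a rough idea of what the block matching part looks like, here is a small CPU-side sketch of an exhaustive search using the sum of absolute differences (SAD). The function names, the frame layout (lists of rows) and the full-search strategy are my own assumptions for illustration, not how any FFmpeg or x264 code does it.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, size, search):
    """Return (cost, dx, dy) of the best match within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    # Every (dx, dy) candidate is independent of the others, so a GPU could
    # evaluate them all at once and then reduce to the minimum cost.
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width and
                      0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The candidates share no state until the final minimum is taken, so the search itself is the part that would be offloaded while the cpu keeps parsing.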
Support for ints is important
Easy ones, -huge- performance increase :
Rescaling with various algos, color space conversions, basic deblocking,
More tricky, probably faster by factor of 10 but with quite some
optimisation and dev time : (i)Motion compensation, (i)dct, wavelets ...
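To show why something like colour space conversion counts as an easy one, here is a minimal sketch of an integer-only RGB to YCbCr conversion. I am assuming full-range BT.601 coefficients as used by JFIF, scaled to 16-bit fixed point. The function names are made up for the example and this is not FFmpeg's swscale code.

```python
def clamp8(value):
    """Clamp an integer to the 0..255 range of an 8-bit sample."""
    return max(0, min(255, value))


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr using integer arithmetic only."""
    # Coefficients are the BT.601 weights multiplied by 65536 (assumption:
    # full-range JFIF variant); adding 32768 rounds before the shift.
    y = (19595 * r + 38470 * g + 7471 * b + 32768) >> 16
    cb = ((-11059 * r - 21709 * g + 32768 * b + 32768) >> 16) + 128
    cr = ((32768 * r - 27439 * g - 5329 * b + 32768) >> 16) + 128
    return clamp8(y), clamp8(cb), clamp8(cr)


def convert_frame(pixels):
    """Convert a list of (r, g, b) tuples; each pixel is independent."""
    return [rgb_to_ycbcr(r, g, b) for (r, g, b) in pixels]
```

No output pixel depends on any other, so the loop in convert_frame is exactly the kind of work that maps onto one gpu thread per pixel, and it only needs integer support.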
The bitstream parsing and arithmetic coding represent an important part of
the process and cannot be ported to gpu easily.
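A small sketch of why parsing resists this treatment: below is a decoder for unsigned Exp-Golomb codes, a variable-length code of the kind H.264 uses. I have picked it as a simpler stand-in for arithmetic coding, and the bit string input and function name are my own choices for illustration.

```python
def decode_exp_golomb(bits):
    """Decode a string of '0'/'1' characters as unsigned Exp-Golomb codes."""
    values = []
    pos = 0
    while pos < len(bits):
        # Count the leading zeros of the current code word.
        zeros = 0
        while pos + zeros < len(bits) and bits[pos + zeros] == "0":
            zeros += 1
        end = pos + 2 * zeros + 1
        if end > len(bits):
            raise ValueError("truncated code word")
        values.append(int(bits[pos + zeros:end], 2) - 1)
        # The start of the next code word is only known now, so the next
        # iteration cannot begin before this one has finished.
        pos = end
    return values
```

Because the position of each code word depends on the length of every code word before it, there is no obvious way to hand different symbols to different threads.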
It is much more efficient to transfer to gpu memory first, then process
from there, as the transfer operation doesn't block the gpu process queue.
The big advantage of cuda is that you control exactly when the transfer
takes place and it is optimised even for small amounts of data.
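The idea of transferring first and processing from there can be sketched on the cpu with a bounded queue: one thread stages the next frame while the main thread processes the previous one. The upload and process callables are hypothetical stand-ins for a real host-to-device copy and a real kernel launch, and this says nothing about how the cuda API itself is used.

```python
import queue
import threading


def run_pipeline(frames, upload, process, depth=2):
    """Overlap staging of frame N+1 with processing of frame N."""
    staged = queue.Queue(maxsize=depth)   # bounded, like a few device buffers
    done = object()                       # sentinel marking the end of input

    def uploader():
        try:
            for frame in frames:
                staged.put(upload(frame))  # stand-in for the memory transfer
        finally:
            staged.put(done)               # always release the consumer

    worker = threading.Thread(target=uploader)
    worker.start()
    results = []
    while True:
        item = staged.get()
        if item is done:
            break
        results.append(process(item))      # stand-in for the gpu kernel
    worker.join()
    return results
```

For example, run_pipeline(frames, copy_to_card, scale_on_card) would keep up to two frames staged ahead of the processing step, where copy_to_card and scale_on_card are whatever transfer and compute functions you supply.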
A possible starting place would be converting a rather simple (compared to
ffmpeg) library like libjpeg.
Boost Hardware Ltd
>> Patrick Shirkey <pshirkey <at> boosthardware.com> writes:
>>> With the recent announcement by AMD that they are actively targeting
>>> source solutions for the HSA (aka Fusion) platform I am wondering if
>>> anyone has any details on the current status of OpenCL support in
>> Patch welcome!
> I've done a little more research. It's kind of hard to get started but I
> have found a couple of threads that are useful and IRC has been very
> helpful too.
> IIUC there is already gpu decoding support. I'm looking into how that
> works. Seems documentation is sparse.
> There has also been some effort on porting x264's lookahead thread to
> the gpu. However I think that is going to stay x264 specific?
> From the thread above it appears that there are a lot of places to
> optimise but of course motivation is the key factor.
> I would like to ask members of this list if they could help me to compile
> a comprehensive non-binding roadmap for the priorities for GPU support in
> FFMPEG and associated libraries. Obviously it is not going to be a panacea
> but there are clearly places where it would be beneficial for people who
> had access to certain hardware and a good reason to utilise it.
> Patrick Shirkey
> Boost Hardware Ltd