[FFmpeg-devel] [PATCH] AVHWAccel infrastructure v2 (take 6) [ping^3]

Michael Niedermayer michaelni
Wed Nov 10 16:39:26 CET 2010


On Wed, Nov 10, 2010 at 04:18:45PM +0100, Gwenole Beauchesne wrote:
> Hi,
>
> On Wed, 10 Nov 2010, Michael Niedermayer wrote:
>
>> On Wed, Nov 10, 2010 at 11:01:11AM +0100, Gwenole Beauchesne wrote:
>>> Hi,
>>>
>>> On Tue, 9 Nov 2010, Michael Niedermayer wrote:
>>>
>>>> On Tue, Nov 02, 2010 at 11:05:35AM +0100, Gwenole Beauchesne wrote:
>>>>>
>>>>> On Sat, 30 Oct 2010, Michael Niedermayer wrote:
>>>>>
>>>>>>> On Mon, 20 Sep 2010, Gwenole Beauchesne wrote:
>>>>>>>
>>>>>>>>>> So, HW accelerator selection works as is:
>>>>>>>>>> - Set AVCodecContext.hwaccel_id to the desired accelerator
>>>>>>>>>> - Set AVCodecContext.hwaccel_flags to HWACCEL_CAP_GET_PIXELS if the user
>>>>>>>>>> wants AVFrame.data[0-2] populated with YV12 or NV12 pixels
>>>>>>>>>>
>>>>>>>>>> This means we can handle multiple accelerators automatically at once.
>>>>>>>>>> Anyway, this lived in an ideal world without practical use since the
>>>>>>>>>> user-application would still have to handle the HW accelerator
>>>>>>>>>> specifically.
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>> +    /**
>>>>>>>>>> +     * Called in codec initialization.
>>>>>>>>>> +     *
>>>>>>>>>> +     * This is used to initialize the AVCodecContext.hwaccel_context
>>>>>>>>>> +     * hardware accelerator context.
>>>>>>>>>> +     *
>>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>>> +     */
>>>>>>>>>> +    int (*init)(AVCodecContext *avctx);
>>>>>>>>>> +
>>>>>>>>>> +    /**
>>>>>>>>>> +     * Called in codec finalization.
>>>>>>>>>> +     *
>>>>>>>>>> +     * This is used to clean-up any data from the hardware accelerator
>>>>>>>>>> +     * context. Should this function be implemented, it shall reset
>>>>>>>>>> +     * AVCodecContext.hwaccel_context to NULL.
>>>>>>>>>> +     *
>>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>>> +     */
>>>>>>>>>> +    int (*close)(AVCodecContext *avctx);
>>>>>>>>>> +
>>>>>>>>>> +    /**
>>>>>>>>>> +     * Called at the beginning of each frame to allocate a HW surface.
>>>>>>>>>> +     *
>>>>>>>>>> +     * The returned surface will be stored in Picture.data[3].
>>>>>>>>>> +     *
>>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>>> +     * @param surface the pointer to the HW accelerator surface
>>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>>> +     */
>>>>>>>>>> +    int (*alloc_surface)(AVCodecContext *avctx, uint8_t **surface);
>>>>>>>>>> +
>>>>>>>>>> +    /**
>>>>>>>>>> +     * Called to free the HW surface that was allocated with
>>>>>>>>>> alloc_surface().
>>>>>>>>>> +     *
>>>>>>>>>> +     * @param avctx the codec context
>>>>>>>>>> +     * @param surface the HW accelerator surface
>>>>>>>>>> +     * @return zero if successful, a negative value otherwise
>>>>>>>>>> +     */
>>>>>>>>>> +    void (*free_surface)(AVCodecContext *avctx, uint8_t *surface);
>>>>>>>>>>  } AVHWAccel;
>>>>>>>>>
>>>>>>>>> iam not in favor of these. What are these good for?
>>>>>>>>> Why are they needed?
>>>>>>>>
>>>>>>>> This is used to allocate the HW surfaces (VdpSurface, VASurface, etc.)
>>>>>>>> from within FFmpeg. i.e. the AVFrame.data[3] part, hence the uint8_t *
>>>>>>>> instead of the void *.
>>>>>>>
>>>>>>> Those functions are also used to help user implement
>>>>>>> AVCodecContext.get_buffer() and release_buffer() if he wants to do so
>>>>>>> manually.
>>>>>>>
>>>>>>> e.g. by default, AVCodecContext.get_buffer() will point to
>>>>>>> avcodec_default_get_buffer(). If a user overrides that function, he will
>>>>>>> lose automatic data[3] handling. So, two possible cases:
>>>>>>> 1) He really knows how to do so, so he allocates the HW surface himself
>>>>>>> 2) He can delegate that to FFmpeg that will know how to do that just fine
>>>>>>> too. e.g. calling the right vaCreateSurfaces(), VdpVideoSurfaceCreate(),
>>>>>>> etc.
>>>>>>>
>>>>>>>>>> +/** Identifies the hardware accelerator */
>>>>>>>>>> +enum HWAccelID {
>>>>>>>>>> +    HWACCEL_ID_NONE,
>>>>>>>>>> +    HWACCEL_ID_VAAPI,           ///< Video Acceleration (VA) API
>>>>>>>>>> +    HWACCEL_ID_NB
>>>>>>>>>> +};
>>>>>>>>>
>>>>>>>>> this needs a any or auto for autodetection (which should be 0)
>>>>>>>>
>>>>>>>> I finally forgot about autodetection because it's too complex to handle
>>>>>>>> at the FFmpeg level and delegated that to the player level. In
>>>>>>>> particular, HWAccel context may depend on upper level info, like an X
>>>>>>>> display...
>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +/** Identifies the hardware accelerator entry-point */
>>>>>>>>>> +enum HWAccelEntrypoint {
>>>>>>>>>> +    HWACCEL_ENTRYPOINT_VLD = 1, ///< Variable-length decoding
>>>>>>>>>> +    HWACCEL_ENTRYPOINT_IDCT,    ///< Inverse discrete cosine transform
>>>>>>>>>> +    HWACCEL_ENTRYPOINT_MOCO,    ///< Motion compensation
>>>>>>>>>> +    HWACCEL_ENTRYPOINT_NB
>>>>>>>>>> +};
>>>>>>>>>
>>>>>>>>> as said previously this is ambigous
>>>>>>>>>
>>>>>>>>> HWACCEL_ENTRYPOINT_VLD could mean the hw does just VLD and leaves IDCT/MC
>>>>>>>>> to software
>>>>>>>>
>>>>>>>> Actually, I didn't find the exact meaning. But from a pipeline point of
>>>>>>>> view, this would mean XXX and above. i.e. VLD and above, motion
>>>>>>>> compensation and above. Flagging those would lead to more player work
>>>>>>>> (well, just OR'ing more flags).
>>>>>>>>
>>>>>>>> I tried to express that in the comment.
>>>>>>
>>>>>>>  Makefile    |    2 -
>>>>>>>  avcodec.h   |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>>>>>>  hwaccel.h   |   55 +++++++++++++++++++++++++++++++++++++
>>>>>>>  internal.h  |   14 +++++++++
>>>>>>>  mpegvideo.c |    4 ++
>>>>>>>  options.c   |    2 +
>>>>>>>  utils.c     |   52 +++++++++++++++++++++++++++++++++++
>>>>>>>  7 files changed, 213 insertions(+), 3 deletions(-)
>>>>>>> e890e0f1046043c9a28d3a64d9380146b0316139  ffmpeg.hwaccel.v2.8.patch
>>>>>>> commit 8277742b00c2dbe41fd3af7f25c1c903f4477b43
>>>>>>> Author: Gwenole Beauchesne <gbeauchesne at splitted-desktop.com>
>>>>>>> Date:   Fri Oct 29 06:16:55 2010 +0200
>>>>>>>
>>>>>>>     Update AVHWAccel infrastructure to v2.
>>>>>>>
>>>>>>> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
>>>>>>> index 385ae02..1b64768 100644
>>>>>>> --- a/libavcodec/Makefile
>>>>>>> +++ b/libavcodec/Makefile
>>>>>>> @@ -3,7 +3,7 @@ include $(SUBDIR)../config.mak
>>>>>>>  NAME = avcodec
>>>>>>>  FFLIBS = avutil avcore
>>>>>>>
>>>>>>> -HEADERS = avcodec.h avfft.h dxva2.h opt.h vaapi.h vdpau.h xvmc.h
>>>>>>> +HEADERS = avcodec.h avfft.h dxva2.h hwaccel.h opt.h vaapi.h vdpau.h xvmc.h
>>>>>>>
>>>>>>>  OBJS = allcodecs.o                                                      \
>>>>>>>         audioconvert.o                                                   \
>>>>>>> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>>>>>>> index 705259e..137f485 100644
>>>>>>> --- a/libavcodec/avcodec.h
>>>>>>> +++ b/libavcodec/avcodec.h
>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>  #include <errno.h>
>>>>>>>  #include "libavutil/avutil.h"
>>>>>>>  #include "libavutil/cpu.h"
>>>>>>> +#include "libavcodec/hwaccel.h"
>>>>>>>
>>>>>>>  #define LIBAVCODEC_VERSION_MAJOR 52
>>>>>>>  #define LIBAVCODEC_VERSION_MINOR 93
>>>>>>> @@ -2607,7 +2608,7 @@ typedef struct AVCodecContext {
>>>>>>>       * provided by the user. In that case, this holds display-dependent
>>>>>>>       * data FFmpeg cannot instantiate itself. Please refer to the
>>>>>>>       * FFmpeg HW accelerator documentation to know how to fill this
>>>>>>> -     * is. e.g. for VA API, this is a struct vaapi_context.
>>>>>>> +     * is. e.g. for VA API, this is a VADisplay.
>>>>>>>       * - encoding: unused
>>>>>>>       * - decoding: Set by user
>>>>>>>       */
>>>>>>
>>>>>>> @@ -2753,6 +2754,27 @@ typedef struct AVCodecContext {
>>>>>>>       * - decoding: unused
>>>>>>>       */
>>>>>>>      int slices;
>>>>>>> +
>>>>>>> +    /**
>>>>>>> +     * User-requested hardware accelerator. See HWACCEL_ID_xxx.
>>>>>>> +     * - encoding: unused
>>>>>>> +     * - decoding: Set by user
>>>>>>> +     */
>>>>>>> +    enum HWAccelID hwaccel_id;
>>>>>>
>>>>>> could you remind me why this is needed compared to the apparently more flexible
>>>>>> system with pixel formats?
>>>>>
>>>>> Before hwaccel v1, we had: PIX_FMT_<hwaccel>_<codec>. One <hwaccel> could
>>>>> easily have ~5 codecs. If you have 2 <haccel>, you get 10 pixel formats.
>>>>> Besides, you can have <codec> supported at different entrypoint. e.g.
>>>>> VLD, MoComp, IDCT. Thus, we could have combinatorial explosion.
>>>>>
>>>>> With hwaccel v1, we factored this with PIX_FMT_<hwaccel>_<entrypoint>,
>>>>> the <codec> information became irrelevant. <entrypoint> was reduced to
>>>>> typically 1 to 3. What does a pixel format with a <codec> name in it
>>>>> actually mean anyway?
>>>>>
>>>>> This is better, but this was still silly. A pixel format defines... well,
>>>>> a pixel format. If the user wants to extract the decoded pixels from the
>>>>> GPU, he requires so with AV_HWACCEL_CAP_GET_PIXELS. The pixel format of
>>>>> those needs to be expressed the usual way, e.g. avctx->pix_fmt ==
>>>>> PIX_FMT_YUV420P. How would this have been expressed if avctx->pix_fmt was
>>>>> PIX_FMT_<hwaccel>_<entrypoint>?
>>>>
>>>> "A pixel format defines... well, a pixel format."
>>>> thats a useless circular definition
>>>>
>>>> a pixel format defines the structuring of data in AVFrame.data[]
>>>> that likely differs between 2 hw accelerator APIs, so should allow selecting
>>>> hw accelerators
>>>
>>> And how do you define and select an HW accelerator that outputs NV12 or
>>> I420 frames? Or other HW accelerators that the user requested decoded
>>> pixels back to .data[0-2]?
>>
>> easy, pix_fmt = PIX_FMT_NV12
>> its fully transparent then the user app does not even need to know a hwaccel
>> is used
>
> Actually, this is not that easy. The hwaccel can require the  
> user-initialized, in a system-dependent way, hwaccel_context.

why?
this is duplicated code in each application using lavc, isnt it?


> The user
> may also want to explicitly require a specific hwaccel to use. How do you 
> express those?

like MMX/SSE/Altivec


>
> You have several cases to consider:
>
> (Kind of hwaccel)
> K1: the hwaccel decodes to opaque HW surfaces. i.e. the user can't access 
> to the decoded pixels directly but through some explicit readback  
> function.
> K2: the hwaccel decodes to SW provided memory. i.e. this memory can be  
> exposed directly to Picture.data[0-2].
>
> (Where the user wants the decoded pixels)
> L1: closest to the GPU
> L2: accessible through the CPU only (AV_HWACCEL_CAP_GET_PIXELS)
>
> (What hwaccel the user specifically wants)
> H1 .. Hn.
>
> In v1, we currently have the following code flow:
> lavc: avctx->pix_fmt = get_format()
> user: <selects hwaccel and initializes hwaccel_context>
> lavc: avctx->hwaccel = ff_find_hwaccel(avctx->codec->id, avctx->pix_fmt);
>
> This scenario doesn't work for K2 hwaccels because an hwaccel_context  
> needs to be provided and/or user wants to choose H1..Hn specifically  
> through get_format().
>
>>> Would you introduce another level with the actual pixel format. e.g.
>>> PIX_FMT_YUV420P_<hwaccel>_<entrypoint>?
>>>
>>>>> Besides, from an implementation point of view (in libavcodec), the pre-v1
>>>>> and v1 way of using pixel formats becomes tedious to handle.
>>>>>
>>>>> We currently have hooks called through:
>>>>>
>>>>> if (avctx->hwaccel)
>>>>>   avctx->hwaccel->decode_slice(...);
>>>>>
>>>>> this needs to become:
>>>>> if (avctx->hwaccel && <is entrypoint VLD>)
>>>>>   avctx->hwaccel->decode_slice(...);
>>>>>
>>>>> so that other entry-points could be managed easily.
>>>>>
>>>>> With the v1 way, <is entrypoint == VLD> becomes a big evil switch:
>>>>> return avctx->pix_fmt == PIX_FMT_hwaccel1_VLD ||?avctx->pix_fmt ==
>>>>> PIX_FMT_hwaccel2_VLD || avcxtx->pix_fmt == PIX_FMT_hwaccel3_VLD, etc.
>>>>>
>>>>> Horrible, isn't it?
>>>>
>>>> pix_fmt_descriptor[pix_fmt].flags & FLAG_VLD
>>>> doesnt look horrible to me
>>>
>>> if (avctx->hwaccel && avctx->hwaccel->entrypoint == HWACCEL_ENTRYPOINT_VLD)
>>> looks better, less indirections, and keeps things on a single line. :)
>>
>> if you put entrypoint in hwaccel you need a seperate
>> AVHWAccel for each entrypoint. and then you could just do
>> if(avctx->hwaccel && avctx->hwaccel->decode_slice)
>
> I'd prefer lavc to ensure the implementor did not do anything insane. If  
> we make all hooks optional without proper check or notification of error  
> (segfault or assert()), the implementor could miss some functions to add.
>
> e.g. VLD requires decode_slice(), IDCT & MoComp require decode_mb(). I  
> would like to ensure that in VLD mode, only decode_slice() is 
> implemented, and in IDCT & MoComp mode, only decode_mb() is implemented.
>

> Hmmm, we could check this in the ff_init_hwaccel() though.

agree and yes


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101110/064a3724/attachment.pgp>



More information about the ffmpeg-devel mailing list