[FFmpeg-devel] [PATCH] avcodec/libmp3lame: properly handle unaligned frame data

Sun Apr 30 19:55:02 EEST 2017

On Thu, Apr 27, 2017 at 3:51 PM, Nicolas George <george at nsup.org> wrote:
> L'octidi 8 floréal, an CCXXV, Michael Niedermayer a écrit :
>> I agree
>> in fact i added such a flag in 2011 (4d34b6c1a1254850e39a36f08f4d2730092a54db)
>> within the API of that time to avfilter
>
> It was not a bad idea, but it should not be limited to filters. A few
> comments.
>
> * First, the framequeue framework does not produce unaligned code.
> According to the C standard, the data it handles stay aligned. The
> alignment problems come from non-standard requirements by special
> processor features used by some filters and codecs, but not all.
>
> * That means a lot of the most useful codecs and filters will suffer
> from it, but not all. For many tasks, the alignment is just fine, and
> the extra copy would be wasteful.
>
> * The alignment requirements increase. Before AVX, it was up to 16, now
> it can be 32, and I have no doubt future processor will at some point
> require 64 or 128. But realigning buffers used with SSE to 32 would be
> wasteful too. Thus, we do not require a flag but a full integer.
>
> * The code that does the actual work of realigning a buffer should
> available as a stand-alone API, to be used by applications that work at
> low-level. I suppose something like that would be in order:
>
>         int av_frame_realign(AVFrame *frame, unsigned align);
>
> Or maybe:
>
>         int av_frame_realign(AVFrame *frame, unsigned align,
>                              AVBufferAllocator *alloc);
>
> where AVBufferAllocator is there to allocate possibly hardware or mmaped
> buffers.
>
> * It is another argument for my leitmotiv that filters and codecs are
> actually the same and should be merged API-wise.
>
> * It would be better to have the API just work for everything rather
> than documenting the alignment needs.
>
> As for the actual implementation, I see a lot of different approaches:
>
> - have the framework realing the frame before submitting it to the
>   filters and codecs: costly in malloc() and memcpy() but simple;
>
> - have each filter or codec call av_frame_realign() as needed; it may
>   seem less elegant than the previous proposal, but it may prove a
>   better choice in the light of what follows;
>
> - have each filter or codec copy the unaligned data into a buffer
>   allocated once and for all or on the stack, possibly by small chunks:
>   less costly in malloc() and refcounting overhead, and possibly better
>   cache-locality, but more complex code;
>
> - run the plain C version of the code on unaligned data rather than the
>   vectorized version, or the less-vectorized version (SSE vs AVX) on
>   insufficiently aligned data.
>
> Since all this boils down to a matter of performance and is related to
> the core task of FFmpeg, I think the choice between the various options
> should be done on a case-by-case basis using real benchmarks.

Are you working on these? Because currently I'm not.

Thank's