[FFmpeg-devel] [PATCH] libavutil/hwcontext_qsv: Fix bug that the QSV encoded frames'width and height are 32-aligned

Thu Jan 5 03:09:07 EET 2017

On 04/01/17 09:49, Huang, Zhengxu wrote:
> 在 2017/1/3 20:40, Mark Thompson 写道:
>> On 03/01/17 06:35, Huang, Zhengxu wrote:
>>>  From 8b1bcc0634f6ce36acfbd2bfdd26690a6323d09c Mon Sep 17 00:00:00 2001
>>> From: Zhengxu <zhengxu.maxwell at gmail.com>
>>> Date: Fri, 16 Dec 2016 11:10:34 +0800
>>> Subject: [PATCH] libavutil/hwcontext_qsv: Fix bug that the QSV encoded frames'
>>>   width and height are 32-aligned.
>>>
>>> Description:
>>> If an input is of 1280x720, the encoded stream created by command below is of 1280x736:
>>> ffmpeg -hwaccel qsv -c:v h264_qsv -i test.h264 -c:v h264_qsv out.h264
>>>
>>> Reason:
>>> When creating a AVQSVFramesContext, width and height shouldn't be aligned, or the mfxSurfaces'
>>>   cropW and cropH will be wrong.
>>>
>>> Fix:
>>> User should configure AVQSVFramesContext with origin width and height and AVFramesContext will
>>>   align the width and height when being initiallized.
>>>
>>> Signed-off-by: ChaoX A Liu <chaox.a.liu at gmail.com>
>>> Signed-off-by: Huang, Zhengxu <zhengxu.maxwell at gmail.com>
>>> Signed-off-by: Andrew, Zhang <huazh407 at gmail.com>
>>> ---
>>>   ffmpeg_qsv.c              | 8 ++++----
>>>   libavutil/hwcontext_qsv.c | 8 ++++++--
>>>   2 files changed, 10 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/ffmpeg_qsv.c b/ffmpeg_qsv.c
>>> index 68ff5bd..aab7375 100644
>>> --- a/ffmpeg_qsv.c
>>> +++ b/ffmpeg_qsv.c
>>> @@ -76,8 +76,8 @@ int qsv_init(AVCodecContext *s)
>>>       frames_ctx   = (AVHWFramesContext*)ist->hw_frames_ctx->data;
>>>       frames_hwctx = frames_ctx->hwctx;
>>>   -    frames_ctx->width             = FFALIGN(s->coded_width,  32);
>>> -    frames_ctx->height            = FFALIGN(s->coded_height, 32);
>>> +    frames_ctx->width             = s->coded_width;
>>> +    frames_ctx->height            = s->coded_height;
>>>       frames_ctx->format            = AV_PIX_FMT_QSV;
>>>       frames_ctx->sw_format         = s->sw_pix_fmt;
>>>       frames_ctx->initial_pool_size = 64;
>>> @@ -152,8 +152,8 @@ int qsv_transcode_init(OutputStream *ost)
>>>       encode_frames = (AVHWFramesContext*)encode_frames_ref->data;
>>>       qsv_frames = encode_frames->hwctx;
>>>   -    encode_frames->width     = FFALIGN(ist->resample_width,  32);
>>> -    encode_frames->height    = FFALIGN(ist->resample_height, 32);
>>> +    encode_frames->width     = ist->resample_width;
>>> +    encode_frames->height    = ist->resample_height;
>>>       encode_frames->format    = AV_PIX_FMT_QSV;
>>>       encode_frames->sw_format = AV_PIX_FMT_NV12;
>>>       encode_frames->initial_pool_size = 1;
>>> diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontext_qsv.c
>>> index 03244a6..2dc9aca 100644
>>> --- a/libavutil/hwcontext_qsv.c
>>> +++ b/libavutil/hwcontext_qsv.c
>>> @@ -308,9 +308,13 @@ static int qsv_init_pool(AVHWFramesContext *ctx, uint32_t fourcc)
>>>               surf->Info.ChromaFormat   = MFX_CHROMAFORMAT_YUV444;
>>>             surf->Info.FourCC         = fourcc;
>>> -        surf->Info.Width          = ctx->width;
>>> +        /*
>>> +         * WxH being aligned with 32x32 is needed by MSDK.
>>> +         * CropW and CropH are the real size of the frame.
>>> +         */
>>> +        surf->Info.Width          = FFALIGN(ctx->width, 32);
>>>           surf->Info.CropW          = ctx->width;
>>> -        surf->Info.Height         = ctx->height;
>>> +        surf->Info.Height         = FFALIGN(ctx->height, 32);
>>>           surf->Info.CropH          = ctx->height;
>>>           surf->Info.FrameRateExtN  = 25;
>>>           surf->Info.FrameRateExtD  = 1;
>>> -- 
>>> 1.8.3.1
>>>
>> This seems wrong to me - the hwcontext code is only dealing in surfaces, and should not be interested in the actual dimensions of the frame on each surface (that is a per-frame property anyway, since it need not be the same for all frames in a context).
>>
>> Is it possible to instead fix this case by adjusting the cropping parameters after copying the FrameInfo at libavcodec/qsvenc.c:378?  Maybe this (not tested at all):
>>
>> diff --git a/libavcodec/qsvenc.c b/libavcodec/qsvenc.c
>> index ac443c1a26..32e2a4ed13 100644
>> --- a/libavcodec/qsvenc.c
>> +++ b/libavcodec/qsvenc.c
>> @@ -376,6 +376,8 @@ static int init_video_param(AVCodecContext *avctx, QSVEncContext *q)
>>           AVHWFramesContext *frames_ctx = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
>>           AVQSVFramesContext *frames_hwctx = frames_ctx->hwctx;
>>           q->param.mfx.FrameInfo = frames_hwctx->surfaces[0].Info;
>> +        q->param.mfx.FrameInfo.CropW = avctx->width;
>> +        q->param.mfx.FrameInfo.CropH = avctx->height;
>>       } else {
>>           q->param.mfx.FrameInfo.FourCC         = MFX_FOURCC_NV12;
>>           q->param.mfx.FrameInfo.Width          = FFALIGN(avctx->width, q->width_align);
>>
> 
> Hi,
> 
> 
> If fixing this issue by adjusting the cropping parameters after copying the FrameInfo, maybe it will still have some problem.
> 
> This fix will only modify the encoder's surface information but other modules' surface information(CropW and CropH of the FrameInfo) also needs to be modified. So many more codes need to modify.
> 
> In my opinion fixing this case in the hwcontext_qsv.c maybe much better . And the surface information do need align when doing allocator.
> 
> What's your opinion, or some other better suggestion ,thanks.

The hwcontext information describes the hardware surfaces, not the frames contained within them.  The frame on the surface need not occupy all of it, and we use AVFrame.width|height to store how big it actually is, which can then be used by all components interacting with the frame, whether qsv-specific or generic .  Similarly, AVCodecContext.width|height and AVFilterLink.w|h carry that size information for sequences of frames, and can change while using the same underlying AVHWFramesContext if the frames still fit on those surfaces.  I regard it as slightly unfortunate that the frame size ends up in the surface information as cropping values because of the content of the mfxFrameSurface1 structure (compare to the DXVA2 or VAAPI hwcontext implementations, which store the frame size information only in the frame).

Looking at libav, this specific issue is fixed by <https://git.libav.org/?p=libav.git;a=commit;h=d9ec3c60143babe1bb77c268e1d5547d15acd69b>, which I think is the correct way to do it.  Would a merge of that be sufficient for you?  (It doesn't look like it has any dependencies, but I try to keep the merges in order and 10-bit support is queued before it (which I don't currently have any test setup for).)

Thanks,

- Mark