[FFmpeg-trac] #10514(undetermined:new): swscale alignment requirements (and issues) in YUV420P to RGB conversions

Thu Aug 10 17:46:07 EEST 2023

#10514: swscale alignment requirements (and issues) in YUV420P to RGB conversions
-------------------------------------+-------------------------------------
             Reporter:  vkonovets    |                    Owner:  (none)
                 Type:  defect       |                   Status:  new
             Priority:  normal       |                Component:
                                     |  undetermined
              Version:  4.3.6        |               Resolution:
             Keywords:               |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
Description changed by vkonovets:

Old description:

> I'm working on an API which wraps around `swscale` to do conversions on
> memory not strictly allocated for `AVFrame` (i.e my own allocations).
> For some image dimensions I noticed visual artifacts and memory
> corruption.  Although there are no alignment requirements for rows stated
> in the documentation, reading from Stack Overflow and various tickets
> here I understood that there is a minimum alignment requirement for
> allocations and padding padding at the end of the row (although this does
> not solve the issue).
>
> https://trac.ffmpeg.org/ticket/9331 points out what I believe is the
> issue, however that ticket was dismissed because the memory allocation
> was not aligned (I ensure it is in my sample).  To quote:
>
> //the assembly code accelerates the process of convert yuv to rgb by
> using sse2 register... when the width of video frame is not a multiply of
> 16, extra wrong yuv pixel is read from the memory and writing extra rgb
> value to the memory allocated for the output image and cause the invalid
> writing//.
>
> From that ticket and `get_video_buffer()` source code
> (https://ffmpeg.org/doxygen/4.3/frame_8c_source.html#l00109), which
> allocates memory for `AVFrame`, it follows that not just the row that
> needs to be padded, but also the row //dimension// needs to be padded.
>
> I attach to this ticket sample code which takes width as an argument
> which you can use to observe the effects:
> {{{#!cpp
> #include <stdlib.h>
> extern "C"
> {
> #include <libswscale/swscale.h>
> }
>
> bool absDiffTol(unsigned char a, unsigned char b, int tol)
> {
>     return abs((int)a - (int)b) <= tol;
> }
>
> int main(int argc, char** argv)
> {
>     int width = atoi(argv[1]);
>     int height = 8;
>     int alignment = 64;
>
>     // YUV420P image
>     int srcLinesize[8] = {width - width % alignment + alignment,
>                           width / 2 - (width / 2) % alignment +
> alignment,
>                           width / 2 - (width / 2) % alignment +
> alignment,
>                           0, 0, 0, 0, 0};
>
>     // RGB image
>     int dstLinesize[8] = {width * 3 - (width * 3) % alignment +
> alignment,
>                           0, 0, 0, 0, 0, 0, 0};
>

>     unsigned char* srcData[8];
>     srcData[0] = (unsigned char*)aligned_alloc(alignment, srcLinesize[0]
> * height);
>     srcData[1] = (unsigned char*)aligned_alloc(alignment, srcLinesize[1]
> * height);
>     srcData[2] = (unsigned char*)aligned_alloc(alignment, srcLinesize[2]
> * height);
>
>     unsigned char* dstData[8];
>     dstData[0] = (unsigned char*)aligned_alloc(alignment, dstLinesize[0]
> * height);
>
>     // Fill YUV420 with green
>     for (int y(0); y < height; y++)
>     {
>         for(int x(0); x < width; x++)
>         {
>             srcData[0][y * srcLinesize[0] + x] = 145;
>             srcData[1][(y / 2) * srcLinesize[1] + x / 2] = 54;
>             srcData[2][(y / 2) * srcLinesize[2] + x / 2] = 34;
>         }
>     }
>
>     SwsContext* ctx = sws_getContext(width, height, AV_PIX_FMT_YUV420P,
>                                      width, height, AV_PIX_FMT_RGB24,
>                                      0, nullptr, nullptr, nullptr);
>     if (!ctx)
>     {
>         printf("Could not allocate context\n");
>         return -1;
>     }
>
>     if(sws_scale(ctx, srcData, srcLinesize, 0, height, dstData,
> dstLinesize) != height)
>     {
>         printf("sws_scale returned with error.\n");
>         return -1;
>     }
>
>     // Check RGB is in fact green
>     for (int y(0); y < height; y++)
>     {
>         for (int x(0); x < width; x++)
>         {
>             unsigned char r = dstData[0][y * dstLinesize[0] + x*3];
>             unsigned char g = dstData[0][y * dstLinesize[0] + x*3 + 1];
>             unsigned char b = dstData[0][y * dstLinesize[0] + x*3 + 2];
>             if (!absDiffTol(r, 0, 1) || !absDiffTol(g, 255, 1) ||
> !absDiffTol(b, 0, 1))
>             {
>                 printf("Pixel at (%d, %d) has unexpected value [%d, %d,
> %d]\n", x, y, r, g, b);
>             }
>         }
>     }
>
>     return 0;
> }
>
> }}}
>
> The sample program was linked against ffmpeg version:
> {{{
> ffmpeg version 4.3.6-0+deb11u1 Copyright (c) 2000-2023 the FFmpeg
> developers
>   built with gcc 10 (Debian 10.2.1-6)
> }}}
>
> You can run
> {{{
> for ((i=16;i<=512;i+=2)); do valgrind --tool=memcheck ./sample $i |& tee
> out$i.log; done
> }}}
> and `grep "Invalid " ./*` in the generated logfiles to observe which
> widths produce invalid reads and writes.
>
> I would also appreciate the following clarifications:
> - `get_video_buffer()` pads the height during allocation.  Why is this
> necessary?  Is this to do with `swscale` or another component of `libav`?
> - Are there padding requirements for grayscale, grayscale+alpha, RGBA
> formats?
> - Is there some source I could refer to for alignment requirements on
> memory (forum posts, tickets, documentation)?

New description:

 I'm working on an API which wraps around `swscale` to do conversions on
 memory not strictly allocated for `AVFrame` (i.e my own allocations).  For
 some image dimensions I noticed visual artifacts and memory corruption.
 Although there are no alignment requirements for rows stated in the
 documentation, reading from Stack Overflow and various tickets here I
 understood that there is a minimum alignment requirement for allocations
 and padding at the end of the row (although this does not solve the
 issue).

 https://trac.ffmpeg.org/ticket/9331 points out what I believe is the
 issue, however that ticket was dismissed because the memory allocation was
 not aligned (I ensure it is in my sample).  To quote:

 //the assembly code accelerates the process of convert yuv to rgb by using
 sse2 register... when the width of video frame is not a multiply of 16,
 extra wrong yuv pixel is read from the memory and writing extra rgb value
 to the memory allocated for the output image and cause the invalid
 writing//.

 From that ticket and `get_video_buffer()` source code
 (https://ffmpeg.org/doxygen/4.3/frame_8c_source.html#l00109), which
 allocates memory for `AVFrame`, it follows that not just the row that
 needs to be padded, but also the row //dimension// needs to be padded.

 I attach to this ticket sample code which takes width as an argument which
 you can use to observe the effects:
 {{{#!cpp
 #include <stdlib.h>
 extern "C"
 {
 #include <libswscale/swscale.h>
 }

 bool absDiffTol(unsigned char a, unsigned char b, int tol)
 {
     return abs((int)a - (int)b) <= tol;
 }

 int main(int argc, char** argv)
 {
     int width = atoi(argv[1]);
     int height = 8;
     int alignment = 64;

     // YUV420P image
     int srcLinesize[8] = {width - width % alignment + alignment,
                           width / 2 - (width / 2) % alignment + alignment,
                           width / 2 - (width / 2) % alignment + alignment,
                           0, 0, 0, 0, 0};

     // RGB image
     int dstLinesize[8] = {width * 3 - (width * 3) % alignment + alignment,
                           0, 0, 0, 0, 0, 0, 0};

     unsigned char* srcData[8];
     srcData[0] = (unsigned char*)aligned_alloc(alignment, srcLinesize[0] *
 height);
     srcData[1] = (unsigned char*)aligned_alloc(alignment, srcLinesize[1] *
 height);
     srcData[2] = (unsigned char*)aligned_alloc(alignment, srcLinesize[2] *
 height);

     unsigned char* dstData[8];
     dstData[0] = (unsigned char*)aligned_alloc(alignment, dstLinesize[0] *
 height);

     // Fill YUV420 with green
     for (int y(0); y < height; y++)
     {
         for(int x(0); x < width; x++)
         {
             srcData[0][y * srcLinesize[0] + x] = 145;
             srcData[1][(y / 2) * srcLinesize[1] + x / 2] = 54;
             srcData[2][(y / 2) * srcLinesize[2] + x / 2] = 34;
         }
     }

     SwsContext* ctx = sws_getContext(width, height, AV_PIX_FMT_YUV420P,
                                      width, height, AV_PIX_FMT_RGB24,
                                      0, nullptr, nullptr, nullptr);
     if (!ctx)
     {
         printf("Could not allocate context\n");
         return -1;
     }

     if(sws_scale(ctx, srcData, srcLinesize, 0, height, dstData,
 dstLinesize) != height)
     {
         printf("sws_scale returned with error.\n");
         return -1;
     }

     // Check RGB is in fact green
     for (int y(0); y < height; y++)
     {
         for (int x(0); x < width; x++)
         {
             unsigned char r = dstData[0][y * dstLinesize[0] + x*3];
             unsigned char g = dstData[0][y * dstLinesize[0] + x*3 + 1];
             unsigned char b = dstData[0][y * dstLinesize[0] + x*3 + 2];
             if (!absDiffTol(r, 0, 1) || !absDiffTol(g, 255, 1) ||
 !absDiffTol(b, 0, 1))
             {
                 printf("Pixel at (%d, %d) has unexpected value [%d, %d,
 %d]\n", x, y, r, g, b);
             }
         }
     }

     return 0;
 }

 }}}

 The sample program was linked against ffmpeg version:
 {{{
 ffmpeg version 4.3.6-0+deb11u1 Copyright (c) 2000-2023 the FFmpeg
 developers
   built with gcc 10 (Debian 10.2.1-6)
 }}}

 You can run
 {{{
 for ((i=16;i<=512;i+=2)); do valgrind --tool=memcheck ./sample $i |& tee
 out$i.log; done
 }}}
 and `grep "Invalid " ./*` in the generated logfiles to observe which
 widths produce invalid reads and writes.

 I would also appreciate the following clarifications:
 - `get_video_buffer()` pads the height during allocation.  Why is this
 necessary?  Is this to do with `swscale` or another component of `libav`?
 - Are there padding requirements for grayscale, grayscale+alpha, RGBA
 formats?
 - Is there some source I could refer to for alignment requirements on
 memory (forum posts, tickets, documentation) for any generic input/output
 pixel format?

--
-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/10514#comment:1>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker