[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

Oliver Collyer ovcollyer at mac.com
Fri Sep 2 12:16:24 EEST 2016


> On 2 Sep 2016, at 12:12, Timo Rothenpieler <timo at rothenpieler.org> wrote:
> 
>> Just sticking my head above the parapet, but shouldn’t things like...
>> 
>>> +            for (x = 0; x < c->srcW / 2; x++) {
>>> +                dstUV[x*2  ] = src[1][x] << 6;
>>> +                dstUV[x*2+1] = src[2][x] << 6;
>>> +            }
>> 
>> …be more efficiently written as...
>> 
>> uint16_t* tdstUV = dstUV;
>> uint16_t* tsrc1 = src[1];
>> uint16_t* tsrc2 = src[2];
>> for (x = c->srcW / 2; x > 0; x--) {
>>    *tdstUV++ = *tsrc1++ << 6;
>>    *tdstUV++ = *tsrc2++ << 6;
>> }
>> 
>> …or is that really old-school and a modern compiler does all that when optimising?
>> 
>> Or is readability considered more important than marginal gains in performance?
>> 
>> Oliver (time travelling from the 1980s)
> 
> You would still have to add the remaining stride.
> The linesize is usually larger than the width, so each line is properly
> aligned.
> 
> So with your code, you'd still need something like
> 
> dstUV += dstStride[1] / 2 - 2 * x;
> src[1] += srcStride[1] / 2 - x;
> src[2] += srcStride[2] / 2 - x;
> 
> after it.

No, the lines after the loop stay unchanged. Only the temporary pointers advance along x; src[1], src[2] and dstUV still point at the start of the row, so the full stride is added as before:

src[1] += srcStride[1] / 2;
src[2] += srcStride[2] / 2;
dstUV += dstStride[1] / 2;
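
To make the comparison concrete, here is a minimal standalone sketch of both loop forms with the per-row stride handling included. It is not the code from the patch: the function names and parameters are hypothetical, and strides are given in uint16_t elements rather than bytes (which is why the "/ 2" from the lines above does not appear).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indexed form, following the loop in the patch. Interleaves the U and V
 * planes and shifts each 10-bit sample into the high bits of a 16-bit word.
 * All strides are in uint16_t elements (an assumption of this sketch). */
void interleave_uv_indexed(uint16_t *dst, ptrdiff_t dst_stride,
                           const uint16_t *u, ptrdiff_t u_stride,
                           const uint16_t *v, ptrdiff_t v_stride,
                           int chroma_w, int chroma_h)
{
    assert(dst_stride >= 2 * chroma_w);
    for (int y = 0; y < chroma_h; y++) {
        for (int x = 0; x < chroma_w; x++) {
            dst[x * 2]     = (uint16_t)(u[x] << 6);
            dst[x * 2 + 1] = (uint16_t)(v[x] << 6);
        }
        /* Row pointers were never modified, so add the full stride. */
        dst += dst_stride;
        u   += u_stride;
        v   += v_stride;
    }
}

/* Pointer-walking form. Only the temporaries move along the row, so the
 * stride handling after the inner loop is identical to the indexed form. */
void interleave_uv_pointer(uint16_t *dst, ptrdiff_t dst_stride,
                           const uint16_t *u, ptrdiff_t u_stride,
                           const uint16_t *v, ptrdiff_t v_stride,
                           int chroma_w, int chroma_h)
{
    assert(dst_stride >= 2 * chroma_w);
    for (int y = 0; y < chroma_h; y++) {
        uint16_t *td = dst;
        const uint16_t *tu = u;
        const uint16_t *tv = v;
        for (int x = chroma_w; x > 0; x--) {
            *td++ = (uint16_t)(*tu++ << 6);
            *td++ = (uint16_t)(*tv++ << 6);
        }
        dst += dst_stride;
        u   += u_stride;
        v   += v_stride;
    }
}
```

Both functions write the same samples and leave the row padding alone. Whether the second form is actually faster depends on the compiler and flags; comparing the generated assembly or benchmarking would settle it.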
