[FFmpeg-devel] [PATCH] Use av_clip_uint8 in swscale.
Mon Aug 17 20:41:56 CEST 2009
Frank Barchard <fbarchard at google.com> writes:
> 2009/8/17 Måns Rullgård <mans at mansr.com>
>> Frank Barchard <fbarchard at google.com> writes:
>> > The table method works well on all platforms... better than if statements
>> > anyway.
>> Depends on the range of inputs. If you want to allow the full 32-bit
>> range, well... Even a smaller range could put significant pressure on
>> the cache.
> In practice you know the range of values.
And you might know it's unbounded.
> If you combined 3 bytes, it's 768 values.
> I know if statements are increasingly efficient, and memory less efficient,
> but the original code had 4 to 6 instructions and potentially 2 branches
> taken per clipped value.
> av_clip_uint8() can be optimized to a single instruction on most CPUs.
Yes, on those with dedicated clip instructions. Others will need
several instructions to support the full 32-bit range. Even if the
range is known to be smaller, a table lookup can be slower than a few
compares and conditional instructions, and it poisons the cache.
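
To make the "few compares" case concrete, here is a minimal sketch of a
table-free clip that accepts the full 32-bit range. The name clip_uint8 is
hypothetical and this is not copied from the libavutil source; it only
illustrates the general shape such a function could take.

```c
#include <stdint.h>

/* Hypothetical helper (not the libavutil implementation): clip any
 * 32-bit signed value to 0..255 without touching memory. */
static inline uint8_t clip_uint8(int32_t a)
{
    /* Reinterpreted as unsigned, negative inputs become very large,
     * so a single compare detects both "below 0" and "above 255". */
    if ((uint32_t)a > 255u)
        return a < 0 ? 0 : 255;
    return (uint8_t)a;  /* common case: already in range */
}
```

On a CPU without a dedicated clip or saturate instruction, a compiler might
turn this into a compare plus a branch or conditional moves; the exact
instruction count depends on the target and the compiler.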
>> > On x86, there is cmov, but in the above code it would take cmp,
>> > cmov, cmp, cmov to do each value, whereas the table method takes
>> > one mov instruction.
>> You're forgetting the address calculation.
> movzx eax,cliptbl[eax*4]
Now you're back at the 4GB table. And where did the value of
"cliptbl" come from? It would have to be loaded from somewhere.
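
For comparison, a rough sketch of the bounded-range table under discussion.
The accepted range of -256..511 (768 one-byte entries) and all names here are
assumptions chosen for illustration, not taken from the patch. Note that the
lookup still needs an offset added to the index and the table's base address
available, and that it is only valid while the input stays inside the range
the table was built for.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed input range for this sketch; values outside it are not handled. */
#define CLIP_MIN  (-256)
#define CLIP_MAX  511
#define CLIP_SIZE (CLIP_MAX - CLIP_MIN + 1)   /* 768 entries */

static uint8_t clip_table[CLIP_SIZE];

/* Fill the table once; must be called before clip_lookup(). */
static void clip_table_init(void)
{
    for (int v = CLIP_MIN; v <= CLIP_MAX; v++)
        clip_table[v - CLIP_MIN] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Table-based clip: one load, plus the address calculation
 * (table base + v - CLIP_MIN) that precedes it. */
static inline uint8_t clip_lookup(int v)
{
    assert(v >= CLIP_MIN && v <= CLIP_MAX);   /* range must be known */
    return clip_table[v - CLIP_MIN];
}
```

Widening the accepted range grows the table one byte per value, which is how
a full 32-bit input ends up needing 4GB.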
mans at mansr.com