[FFmpeg-devel] [PATCH 1/3] avutil/imgutils: Optimize writing 4 bytes in memset_bytes()

Marton Balint cus at passwd.hu
Wed Jan 16 21:00:22 EET 2019



On Tue, 15 Jan 2019, Michael Niedermayer wrote:

> On Sun, Dec 30, 2018 at 07:15:49PM +0100, Marton Balint wrote:
>>
>>
>> On Fri, 28 Dec 2018, Michael Niedermayer wrote:
>>
>>> On Wed, Dec 26, 2018 at 10:16:47PM +0100, Marton Balint wrote:
>>>>
>>>>
>>>> On Wed, 26 Dec 2018, Paul B Mahol wrote:
>>>>
>>>>> On 12/26/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
>>>>>> On Wed, Dec 26, 2018 at 04:32:17PM +0100, Paul B Mahol wrote:
>>>>>>> On 12/25/18, Michael Niedermayer <michael at niedermayer.cc> wrote:
>>>>>>>> Fixes: Timeout
>>>>>>>> Fixes: 11502/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920
>>>>>>>> Before: Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 11294 ms
>>>>>>>> After : Executed clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WCMV_fuzzer-5664893810769920 in 4249 ms
>>>>>>>>
>>>>>>>> Found-by: continuous fuzzing process
>>>>>>>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>>>>>>>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
>>>>>>>> ---
>>>>>>>> libavutil/imgutils.c | 6 ++++++
>>>>>>>> 1 file changed, 6 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/libavutil/imgutils.c b/libavutil/imgutils.c
>>>>>>>> index 4938a7ef67..cc38f1e878 100644
>>>>>>>> --- a/libavutil/imgutils.c
>>>>>>>> +++ b/libavutil/imgutils.c
>>>>>>>> @@ -529,6 +529,12 @@ static void memset_bytes(uint8_t *dst, size_t dst_size, uint8_t *clear,
>>>>>>>>         }
>>>>>>>>     } else if (clear_size == 4) {
>>>>>>>>         uint32_t val = AV_RN32(clear);
>>>>>>>> +        uint64_t val8 = val * 0x100000001ULL;
>>>>>>>> +        for (; dst_size >= 32; dst_size -= 32) {
>>>>>>>> +            AV_WN64(dst   , val8); AV_WN64(dst+ 8, val8);
>>>>>>>> +            AV_WN64(dst+16, val8); AV_WN64(dst+24, val8);
>>>>>>>> +            dst += 32;
>>>>>>>> +        }
>>>>>>>>         for (; dst_size >= 4; dst_size -= 4) {
>>>>>>>>             AV_WN32(dst, val);
>>>>>>>>             dst += 4;
>>>>>>>> --
>>>>>>>> 2.20.1
>>>>>>>>
>>>>>>>
>>>>>>> NAK, implement special memset function instead.
>>>>>>
>>>>>> I can move the added loop into a separate function, if that's what you
>>>>>> suggest?
>>>>>
>>>>> No, don't do that.
>>>>>
>>>>>> All the code is already in a "special" memset though, this is
>>>>>> memset_bytes()
>>>>>>
>>>>>
>>>>> I guess the function is less useful if it's static. So any duplication
>>>>> should be avoided in the codebase.
>>>>
>>>> Doesn't av_memcpy_backptr do almost exactly what is needed here? That can
>>>> also be optimized further if needed.
>>>
>>> av_memcpy_backptr() copies data with overlap; it's more like a recursive
>>> memmove().
>>
>> So? As far as I see the memset_bytes function in imgutils.c can be replaced
>> with this:
>>
>>     if (clear_size > dst_size)
>>         clear_size = dst_size;
>>     memcpy(dst, clear, clear_size);
>>     av_memcpy_backptr(dst + clear_size, clear_size, dst_size - clear_size);
>>
>> I am not against an av_memset_bytes API addition, but I believe it should
>> share code with av_memcpy_backptr to avoid duplication.
>
> I've implemented this; it does not seem to be really faster in the testcase

I guess it is not faster because you have not applied your
original optimization to fill32 in libavutil/mem.c. Could you compare the
speed after optimizing fill32 the same way your original patch optimized
memset_bytes in imgutils?
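
To make the suggestion concrete, here is a rough, untested-in-tree sketch of
what I mean. It is not the actual code from libavutil/mem.c: the names
fill32_sketch() and memset_bytes_sketch() are hypothetical, plain memcpy()
stands in for AV_RN32/AV_WN32/AV_WN64, and it assumes clear_size == 4 and a
dst_size that fits in an int.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a back-pointer fill with a 4-byte period:
 * repeat the 4 bytes located just before dst across the next len bytes. */
static void fill32_sketch(uint8_t *dst, int len)
{
    uint32_t v;
    uint64_t v2;

    assert(len >= 0);
    memcpy(&v, dst - 4, 4);

    /* Same trick as the imgutils patch: duplicate the 32-bit pattern into
     * both halves of a 64-bit word. Both halves are equal, so the result
     * is independent of byte order. */
    v2 = v * 0x100000001ULL;

    /* Wide loop: four 64-bit stores per iteration. */
    while (len >= 32) {
        memcpy(dst,      &v2, 8);
        memcpy(dst +  8, &v2, 8);
        memcpy(dst + 16, &v2, 8);
        memcpy(dst + 24, &v2, 8);
        dst += 32;
        len -= 32;
    }

    /* Remaining whole 32-bit words. */
    while (len >= 4) {
        memcpy(dst, &v, 4);
        dst += 4;
        len -= 4;
    }

    /* Tail of 1..3 bytes: continue the pattern byte by byte. */
    while (len--) {
        *dst = dst[-4];
        dst++;
    }
}

/* Hypothetical memset_bytes() replacement for clear_size == 4: seed the
 * first 4 bytes, then let the back-pointer fill repeat them. */
static void memset_bytes_sketch(uint8_t *dst, size_t dst_size,
                                const uint8_t *clear)
{
    size_t seed = dst_size < 4 ? dst_size : 4;

    memcpy(dst, clear, seed);
    if (dst_size > 4)
        fill32_sketch(dst + 4, (int)(dst_size - 4));
}
```

With something like this, the wide-store loop lives in one place and both
the back-pointer copy and the image clearing path would benefit from it.
Whether it actually matches the speed of the original imgutils patch would
still need to be measured on the testcase.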

Thanks,
Marton

