[FFmpeg-devel] [PATCH 1/5] avutil: add pixelutils API
James Almer
jamrial at gmail.com
Sat Aug 2 23:25:01 CEST 2014
On 02/08/14 6:13 PM, Clément Bœsch wrote:
> On Sat, Aug 02, 2014 at 04:29:39PM -0300, James Almer wrote:
>> On 02/08/14 3:20 PM, Clément Bœsch wrote:
>>> + psrlq m0, m6, 32
>>> + paddw m6, m0
>>> + psrlq m0, m6, 16
>>> + paddw m6, m0
>>> + movd eax, m6
>>> + movzx eax, ax
>>
>> You could use the HADDW macro here.
>>
>
> error: undefined symbol `pw_1' (first use)
>
> sounds somehow constraining. I'll keep my version until you benchmark to
> prove to me that HADDW is faster on an old MMX cpu ;)
I have no idea whether it's faster, and I have no way to test it either.
It's four instructions instead of six, but pmaddwd with a memory operand is
probably not fast enough on old CPUs.
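
To make the comparison concrete, here is a rough scalar C model of the two
reductions. It is only a sketch: the helper names (paddw, sword, hsum_shift,
hsum_madd, hsum_check) are made up for illustration, and hsum_madd only assumes
a pmaddwd-by-ones followed by a dword add, which may not match the real HADDW
macro instruction for instruction. It says nothing about speed, only that both
forms give the same low 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Lane-wise 16-bit add of two 64-bit values, modelling paddw on an MMX reg. */
static uint64_t paddw(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t s = (uint16_t)((a >> (16 * i)) + (b >> (16 * i)));
        r |= (uint64_t)s << (16 * i);
    }
    return r;
}

/* Read one 16-bit lane as a signed value, as pmaddwd would see it. */
static int32_t sword(uint64_t v, int lane)
{
    uint16_t w = (uint16_t)(v >> (16 * lane));
    return w >= 0x8000 ? (int32_t)w - 0x10000 : (int32_t)w;
}

/* Shift-and-add version: psrlq 32, paddw, psrlq 16, paddw, movd, movzx. */
static unsigned hsum_shift(uint64_t m6)
{
    m6 = paddw(m6, m6 >> 32);
    m6 = paddw(m6, m6 >> 16);
    return (unsigned)(m6 & 0xFFFF);
}

/* Assumed pmaddwd form: multiply each word by 1, add pairs into dwords,
 * add the two dwords, then keep the low 16 bits for comparison. */
static unsigned hsum_madd(uint64_t m6)
{
    int32_t lo = sword(m6, 0) + sword(m6, 1);
    int32_t hi = sword(m6, 2) + sword(m6, 3);
    return (unsigned)(lo + hi) & 0xFFFF;
}

/* Compare both reductions on pseudo-random register contents. */
static void hsum_check(void)
{
    uint64_t v = 0x0123456789abcdefULL;
    for (int i = 0; i < 1000; i++) {
        v = v * 6364136223846793005ULL + 1442695040888963407ULL; /* LCG step */
        assert(hsum_shift(v) == hsum_madd(v));
    }
}
```

Calling hsum_check() from a small main() is enough to exercise it.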
>
>>> +;-------------------------------------------------------------------------------
>>> +; int ff_pixelutils_sad_8x8_mmxext(const uint8_t *src1, ptrdiff_t stride1,
>>> +; const uint8_t *src2, ptrdiff_t stride2);
>>> +;-------------------------------------------------------------------------------
>>> +INIT_MMX mmxext
>>> +cglobal pixelutils_sad_8x8, 4,4,0, src1, stride1, src2, stride2
>>> + pxor m2, m2
>>> +%rep 4
>>> + mova m0, [src1q]
>>> + mova m1, [src1q + stride1q]
>>> + psadbw m0, [src2q]
>>> + psadbw m1, [src2q + stride2q]
>>> + paddw m2, m0
>>> + paddw m2, m1
>>> + lea src1q, [src1q + 2*stride1q]
>>> + lea src2q, [src2q + 2*stride2q]
>>> +%endrep
>>> + movd eax, m2
>>> + RET
>>
>> Adding sad16x16 mmxext should be a matter of using add instead of lea, changing
>> the %rep amount, and using 8 instead of stride[12]q for the mova and psadbw.
>>
>
> Yeah right, added. Thanks.
>
>>> --- /dev/null
>>> +++ b/libavutil/x86/pixelutils.h
>>> @@ -0,0 +1,26 @@
>>> +/*
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +#ifndef AVUTIL_X86_PIXELUTILS_H
>>> +#define AVUTIL_X86_PIXELUTILS_H
>>> +
>>> +#include "libavutil/pixelutils.h"
>>> +
>>> +void ff_pixelutils_init_x86(AVPixelUtils *s);
>>
>> This prototype should be in libavutil/pixelutils.h
>> No need to make a whole new header just for it.
>>
>
> No, libavutil/pixelutils.h is public, I don't want to have private
> prototypes in it.
Right, I forgot it was public. I had the lavc dsp stuff in mind when I said that.
>
>> Maybe you could add a quick test for these functions? Look at lavc/motion-test.c and
>> lavu/float-dsp.c
>
> Added.
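
For reference, a test of this kind usually boils down to comparing the
optimized function against a plain C version. The following is a minimal
sketch under assumptions: sad_ref, sad_ref_8x8, sad_fn and check_sad are
hypothetical names, not taken from the patch or from the existing lavc/lavu
tests, and the function-pointer signature simply mirrors the prototype in the
asm comment above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar sum of absolute differences over a size x size block. */
static int sad_ref(const uint8_t *src1, ptrdiff_t stride1,
                   const uint8_t *src2, ptrdiff_t stride2, int size)
{
    int sum = 0;
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

/* 8x8 wrapper with the same shape as the asm prototype. */
static int sad_ref_8x8(const uint8_t *src1, ptrdiff_t stride1,
                       const uint8_t *src2, ptrdiff_t stride2)
{
    return sad_ref(src1, stride1, src2, stride2, 8);
}

/* Hypothetical function-pointer type for the function under test. */
typedef int (*sad_fn)(const uint8_t *, ptrdiff_t, const uint8_t *, ptrdiff_t);

/* Fill two buffers with deterministic pseudo-random bytes and compare fn
 * against the scalar reference. Returns 1 on match, 0 on mismatch.
 * size must be at most 16. Real SIMD code using aligned loads would also
 * need suitably aligned buffers, which this sketch does not enforce. */
static int check_sad(sad_fn fn, int size)
{
    enum { STRIDE = 32, ROWS = 16 };
    uint8_t buf1[STRIDE * ROWS], buf2[STRIDE * ROWS];
    uint32_t seed = 1;

    for (size_t i = 0; i < sizeof(buf1); i++) {
        seed = seed * 1664525u + 1013904223u;   /* LCG step */
        buf1[i] = (uint8_t)(seed >> 24);
        seed = seed * 1664525u + 1013904223u;
        buf2[i] = (uint8_t)(seed >> 24);
    }
    return fn(buf1, STRIDE, buf2, STRIDE) ==
           sad_ref(buf1, STRIDE, buf2, STRIDE, size);
}
```

In a real test the first argument to check_sad would be the optimized
function rather than sad_ref_8x8, and a 16x16 variant would pass size 16.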
>
> I'll resubmit a patchset in a moment.