[FFmpeg-cvslog] r14897 - trunk/libavcodec/acelp_filters.c
Fri Aug 22 22:31:16 CEST 2008
Michael Niedermayer <michaelni at gmx.at> writes:
> On Fri, Aug 22, 2008 at 02:13:23PM +0100, Måns Rullgård wrote:
>> Michael Niedermayer wrote:
>> > On Fri, Aug 22, 2008 at 12:42:04AM +0100, Måns Rullgård wrote:
>> >> michael <subversion at mplayerhq.hu> writes:
>> >> > Author: michael
>> >> > Date: Fri Aug 22 01:25:41 2008
>> >> > New Revision: 14897
>> >> >
>> >> > Log:
>> >> > Remove mathops.h dependency.
>> >> The purpose of mathops.h is to allow CPU-specific instructions to be
>> >> used for common operations. Not using it when possible is similar to
>> >> not using dsputil. Is there a specific reason the optimised macros
>> >> are unsuitable in this file?
>> > Several points
>> > the first line can be simplified to
>> > tmp = (hpf_f* 3959LL)>>11;
>> > but the shift is fixed per file when mathops.h is used
>> We could easily add a shift argument to the macro.
> Yes, I definitely agree with this ...
> the way MULL currently is, with the shift at the top of the file, is not
> particularly readable ...
OK, I'll put that on my list of things to do when I have time. It's a
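
A shift argument along the lines discussed above might look like the following sketch. The name MULL_SHIFT and the helper hpf_term are hypothetical, chosen for illustration; this is not the actual mathops.h definition, and it assumes that right-shifting a negative signed value is an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical macro (not the real mathops.h MULL): a 64-bit multiply
 * followed by a shift given per call, rather than a shift fixed once
 * at the top of the file. */
#define MULL_SHIFT(a, b, s) ((int32_t)(((int64_t)(a) * (int64_t)(b)) >> (s)))

/* The simplified first line from the thread: (hpf_f * 3959LL) >> 11.
 * hpf_term is an illustrative name only. */
static inline int32_t hpf_term(int32_t hpf_f)
{
    return MULL_SHIFT(hpf_f, 3959, 11);
}
```

With the shift visible at the call site, the same product can be written with either scaling, for example MULL_SHIFT(x, 15836, 13) or MULL_SHIFT(x, 3959, 11), since 15836/8192 equals 3959/2048.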
>> > Vladimir's comments indicated that the hpf variables need 26 bits
>> > in which case the code can be simplified to
>> > tmp  = ( 31*hpf_f + ((hpf_f*-9)>>7))>>4;
>> > tmp += (-15*hpf_f + ((hpf_f*13)>>9))>>4;
>> > which avoids the 64bit ops ...
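
The 32-bit rewrite quoted above can be checked against the 64-bit form with a short sketch. All function names here are hypothetical. The 64-bit counterpart of the second line (-7667 with a shift of 13) is derived from the quoted coefficients, since -15/16 + 13/8192 = -7667/8192, and is not quoted from the thread. The sketch assumes 26-bit signed inputs and an arithmetic right shift for negative values.

```c
#include <stdint.h>

/* Reference for the first line: floor(x * 3959 / 2048) via a 64-bit multiply. */
static int32_t term0_mul64(int32_t x)
{
    return (int32_t)(((int64_t)x * 3959) >> 11);
}

/* 32-bit-only form from the thread: 31/16 - 9/2048 == 3959/2048. */
static int32_t term0_mul32(int32_t x)
{
    return (31 * x + ((x * -9) >> 7)) >> 4;
}

/* Reference for the second line (derived): floor(x * -7667 / 8192). */
static int32_t term1_mul64(int32_t x)
{
    return (int32_t)(((int64_t)x * -7667) >> 13);
}

/* 32-bit-only form from the thread: -15/16 + 13/8192 == -7667/8192. */
static int32_t term1_mul32(int32_t x)
{
    return (-15 * x + ((x * 13) >> 9)) >> 4;
}

/* Returns 1 if both forms agree for every step-th x in the signed
 * 26-bit range [-2^25, 2^25), 0 otherwise. */
static int variants_agree(int32_t step)
{
    for (int32_t x = -(1 << 25); x < (1 << 25); x += step) {
        if (term0_mul64(x) != term0_mul32(x)) return 0;
        if (term1_mul64(x) != term1_mul32(x)) return 0;
    }
    return 1;
}
```

Because flooring twice with nested power-of-two divisors gives the same result as flooring once, the two forms should agree exactly, not just approximately, as long as the intermediate products fit in 32 bits. That holds for 26-bit inputs, where 31*x stays below 2^30 in magnitude.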
>> Is it faster though? It still has multiplications.
> faster on "what" ?
Anything. The answer, as you say, can of course vary.
> On normal desktop systems I suspect the whole thing would be fastest when
> done in floats.
> On non-desktop hardware the 64-bit multiply and shift will likely be
> expensive. Also, *31, *15 and *9 are just 1 shift and 1 add/sub each, and
> at least on x86 gcc is pretty good at splitting such multiplies into
> shift/add sequences.
True. That function looks SIMDable anyway, so if anything spends a
lot of time there, there are better ways to optimise it.
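
The remark above that *31, *15 and *9 each reduce to one shift and one add or subtract can be sketched as follows. The function names are hypothetical, and a compiler will normally make this transformation on its own; the code only spells out what the generated sequence computes.

```c
#include <stdint.h>

/* Each helper goes through uint32_t so that shifting a negative input
 * is well defined; converting the result back to int32_t is assumed to
 * wrap in the usual two's-complement way. */

/* x * 31 == (x << 5) - x */
static int32_t mul31(int32_t x)
{
    uint32_t u = (uint32_t)x;
    return (int32_t)((u << 5) - u);
}

/* x * 15 == (x << 4) - x */
static int32_t mul15(int32_t x)
{
    uint32_t u = (uint32_t)x;
    return (int32_t)((u << 4) - u);
}

/* x * 9 == (x << 3) + x */
static int32_t mul9(int32_t x)
{
    uint32_t u = (uint32_t)x;
    return (int32_t)((u << 3) + u);
}
```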
> Anyway, I don't mind at all reverting my commit and putting the MULL back.
No, don't. I was just curious about the reason for this change.
mans at mansr.com