[FFmpeg-devel] [PATCH] ALS: Solve Issue 1657

Michael Niedermayer michaelni
Wed Jan 6 20:46:04 CET 2010


On Wed, Jan 06, 2010 at 01:21:10AM +0100, Thilo Borgmann wrote:
> Am 05.01.10 23:38, schrieb Michael Niedermayer:
> > On Tue, Jan 05, 2010 at 02:58:56PM +0100, Thilo Borgmann wrote:
> >> Am 05.01.10 12:11, schrieb Michael Niedermayer:
> >>> On Tue, Jan 05, 2010 at 03:59:41AM +0100, Thilo Borgmann wrote:
> >>>> Am 05.01.10 02:43, schrieb Michael Niedermayer:
> >>>>> On Tue, Jan 05, 2010 at 02:23:09AM +0100, Thilo Borgmann wrote:
> >>>>>> Am 05.01.10 01:43, schrieb Michael Niedermayer:
> >>>>>>> On Tue, Jan 05, 2010 at 12:34:53AM +0100, Thilo Borgmann wrote:
> >>>>>>>> Am 05.01.10 00:30, schrieb Thilo Borgmann:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> issue 1657 seems to be caused by negative indices used in [].
> >>>>>>>>> See: http://roundup.ffmpeg.org/roundup/ffmpeg/issue1657
> >>>>>>>>>
> >>>>>>>>> Using *() resolves this issue.
> >>>>>>>>>
> >>>>>>>>> Tested with gcc 4.0 on MacOS 10.6. There were other versions/compilers
> >>>>>>>>> mentioned in roundup, maybe these could be tested by someone (you)?
> >>>>>>>>>
> >>>>>>>>> I'm sorry, my svn still seems to be broken and produces unusable patches
> >>>>>>>>> (%ld...). Nevertheless I can apply them if the workaround is ok.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Some artifacts left in als_data.h. Ignore the old patch, updated patch
> >>>>>>>> attached.
> >>>>>>>>
> >>>>>>>> -Thilo
> >>>>>>>
> >>>>>>>>  alsdec.c |    4 ++--
> >>>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>>>>> 987821d84540420efa6f2e67be17094074e638f8  als_issue1657.rev1.patch
> >>>>>>>> Index: libavcodec/alsdec.c
> >>>>>>>> ===================================================================
> >>>>>>>> --- libavcodec/alsdec.c	(Revision 21025)
> >>>>>>>> +++ libavcodec/alsdec.c	(Arbeitskopie)
> >>>>>>>> @@ -%ld,%ld +%ld,%ld @@
> >>>>>>>>              y = 1 << 19;
> >>>>>>>>  
> >>>>>>>>              for (sb = 0; sb < smp; sb++)
> >>>>>>>> -                y += MUL64(lpc_cof[sb],raw_samples[smp - (sb + 1)]);
> >>>>>>>> +                y += MUL64(lpc_cof[sb], *(raw_samples + smp - (sb + 1)));
> >>>>>>>
> >>>>>>> patch ok
> >>>>>>
> >>>>>> Applied.
> >>>>>>
> >>>>>>>
> >>>>>>> independant of this, it could be that if lpc_cof was reversed
> >>>>>>>
> >>>>>>> for (sb = 0; sb < smp; sb++)
> >>>>>>>     y += MUL64(lpc_cof[sb], raw_samples[sb]);
> >>>>>>
> >>>>>> In the second case the index has to be negative which is not possible
> >>>>>> with this approach.
> >>>>>
> >>>>> i dont see why
> >>>>> also it should be 
> >>>>> raw_samples[sb]
> >>>>> and
> >>>>> raw_samples++
> >>>>
> >>>> Hmm. If you could please elaborate a bit more, I will pick this up
> >>>> tomorrow and will see if I can make it better.
> >>>
> >>> one way:
> >>>
> >>> -    for (; smp < bd->block_length; smp++) {
> >>> -        y = 1 << 19;
> >>> -
> >>> -        for (sb = 0; sb < opt_order; sb++)
> >>> -            y += MUL64(bd->lpc_cof[sb],raw_samples[smp - (sb + 1)]);
> >>> -
> >>> -        raw_samples[smp] -= y >> 20;
> >>> +    for (raw_samples+=smp; raw_samples < raw_samples_end; raw_samples++) {
> >>> +        y = 1 << 19;
> >>> +
> >>> +        for (sb= -opt_order; sb<0; sb++)
> >>> +            y += MUL64(lpc_cof[sb],raw_samples[sb]);
> >>> +
> >>> +        raw_samples[0] -= y >> 20;
> >>>     }
> >>
> >> Ok I see, slightly different patch attached that has a ~20% speed gain
> >> in the second loop alone. Also issue 1657 is avoided by not using smp in
> >> substraction (and yes, tested with 64-bit & gcc 4.0).
> >>
> > 
> >> To have lpc_cof[sb] as well as raw_samples[sb] without math in the
> >> indices, lpc_cof has to be reverted and its pointer shifted to one next
> >> to the end (I think...) - this would make it too obfuscated IMO.
> > 
> > could you try? it would be interresting to see how much faster it is
> 
> Seems not to be faster. Also, reversed order of lpc_cof would shift this
> math from these loops into others and reversion has also to be done
> somewhere.
> 
> 17446 dezicycles in reversed array, 2042 runs, 6 skips
> 18731 dezicycles in reversed array, 2038 runs, 10 skips
> 17446 dezicycles in reversed array, 2042 runs, 6 skips
> 
> 17651 dezicycles in simple math, 2040 runs, 8 skips
> 18497 dezicycles in simple math, 2039 runs, 9 skips
> 17696 dezicycles in simple math, 2038 runs, 10 skips

where can i find an ALS file for trying myself?


> 
> 
> 
> Code snippet:
> 
> >    // reverse linear prediction coefficients for efficiency
> >    for (k = 0; k < opt_order; k++)
> >        lpc_cof_reversed[k] = lpc_cof[opt_order - (k + 1)];
> >
> >START_TIMER;
> >    // reconstruct raw samples
> >    raw_samples = bd->raw_samples + smp;
> >    lpc_cof     = lpc_cof_reversed + opt_order;
> >
> >    for (; smp < bd->block_length; smp++) {
> >        y = 1 << 19;
> >
> >// simple:
> >//        for (sb = 0; sb < opt_order; sb++)
> >//            y += MUL64(bd->lpc_cof[sb], raw_samples[-(sb + 1)]);
> >// reversed:
> >        for (sb = -opt_order; sb < 0; sb++)
> >            y += MUL64(lpc_cof[sb], raw_samples[sb]);
> >
> >        *raw_samples++ -= y >> 20;
> >    }

checking the end via raw_samples < end should be faster
(1 variable less, 1 ++ less)
also if this is on x86_64 try to make sb long instead of int this
too might help

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

> ... defining _GNU_SOURCE...
For the love of all that is holy, and some that is not, don't do that.
-- Luca & Mans
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100106/2b68e2f6/attachment.pgp>



More information about the ffmpeg-devel mailing list