[FFmpeg-devel] [PATCH] Demuxer for Leitch/Harris' VR native stream format (LXF)
Tomas Härdin
tomas.hardin
Tue Sep 14 07:42:35 CEST 2010
On Mon, 2010-09-13 at 23:20 +0200, Michael Niedermayer wrote:
> On Mon, Sep 13, 2010 at 09:51:22PM +0100, Måns Rullgård wrote:
> > Michael Niedermayer <michaelni at gmx.at> writes:
> >
> > > On Mon, Sep 13, 2010 at 08:47:37PM +0100, Måns Rullgård wrote:
> > >> Michael Niedermayer <michaelni at gmx.at> writes:
> > >>
> > >> > On Mon, Sep 13, 2010 at 04:56:40PM +0100, Måns Rullgård wrote:
> > >> >> Daniel Verkamp <daniel at drv.nu> writes:
> > >> >>
> > >> >> > 2010/9/13 Måns Rullgård <mans at mansr.com>:
> > >> >> >> Tomas Härdin <tomas.hardin at codemill.se> writes:
> > >> >> >>
> > >> >> >>>> > > > +//returns number of bits set in value
> > >> >> >>>> > > > +static int num_set_bits(uint32_t value) {
> > >> >> >>>> > > > + int ret;
> > >> >> >>>> > > > +
> > >> >> >>>> > > > + for(ret = 0; value; ret += (value & 1), value >>= 1);
> > >> >> >>>> > > > +
> > >> >> >>>> > > > + return ret;
> > >> >> >>>> > > > +}
> > >> >> >>>> > >
> > >> >> >>>> > > if we don't have a population count function yet, then one should be added
> > >> >> >>>> > > to some header in libavutil
> > >> >> >>>> >
> > >> >> >>>> > I couldn't find one. That probably belongs in its own thread though.
> > >> >> >>>> >
> > >> >> >>>> > Which files would such a function belong in - intmath.h/c, common.h or
> > >> >> >>>> > somewhere else? Also, which name would be best: ff_count_bits(),
> > >> >> >>>> > av_count_bits() or something else?
> > >> >> >>>>
> > >> >> >>>> av_popcount()
> > >> >> >>>> would be similar to gcc's __builtin_popcount()
> > >> >> >>>
> > >> >> >>> OK. I attached popcount.patch which adds such a function to common.h.
> > >> >> >>> Also bumped minor of lavu. The implementation uses a 16-byte LUT and
> > >> >> >>> therefore counts four bits at a time. I suspect there are better
> > >> >> >>> solutions though. I did verify that it returns exactly the same number
> > >> >> >>> the other implementation does for all 2^32 possible input values.
> > >> >> >>
> > >> >> >> I can't think of a better generic solution off the top of my head.
> > >> >> >
> > >> >> > There is at least one algorithm to do this without loops or lookup
> > >> >> > tables using SWAR tricks, but I haven't benchmarked it:
> > >> >> > http://aggregate.org/MAGIC/#Population Count (Ones Count)
> > >> >>
> > >> >> That method will be several times slower on any modern hardware.
> > >> >
> > >> > hardly
> > >> > the patch needs 32 operations (I assume it's unrolled), 8 of which are
> > >> > table lookups, which might be less than fast on some hw
> > >> > the aggregate.org code needs 15 operations; the suggested modification for
> > >> > Athlons (aka modern CPUs with a fast multiplier) would reduce that to 12
> > >>
> > >> Did you count the operations needed to construct the constants? I
> > >> think not.
> > >
> > > I used simple mathematical counting of operations, as you surely see,
> > > and as you know, constants don't count there. A count of CPU cycles
> > > on your specific embedded CPU, on which constant construction is
> > > expensive, I could not do, I am sorry.
> >
> > Naive operation counting is useless. Even division counts as only 1
> > there, while everybody knows it takes many times longer than addition
> > on real hardware. Even on modern hardware, multiplication also takes
> > significantly longer than addition.
>
> there's neither multiplication nor division in the 32 and 15 operations
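For reference, the loop-free, table-free approach debated in the quoted text can be sketched roughly as below. This is a minimal sketch of the well-known SWAR bit-counting technique, not code from any patch in this thread; the names popcount_swar and popcount_swar_mul are hypothetical, and no claim is made here about which variant is faster on a given CPU.

```c
#include <stdint.h>

/* Hypothetical name: SWAR population count using only shifts, masks, adds. */
static inline int popcount_swar(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;                      /* 2-bit partial sums  */
    x  = (x & 0x33333333) + ((x >> 2) & 0x33333333); /* 4-bit partial sums  */
    x  = (x + (x >> 4)) & 0x0F0F0F0F;                /* 8-bit partial sums  */
    x += x >> 8;                                     /* fold bytes together */
    x += x >> 16;
    return x & 0x3F;                                 /* result is 0..32     */
}

/* Hypothetical name: same idea, but the final byte fold uses a multiply. */
static inline int popcount_swar_mul(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;
    x  = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x  = (x + (x >> 4)) & 0x0F0F0F0F;
    return (x * 0x01010101) >> 24;                   /* sum of the 4 bytes  */
}
```

Whether the multiply variant pays off depends on the target's multiplier latency and on how cheaply it can materialise the 32-bit constants, which is the point of contention above.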
Updated the patch per Måns' suggestions (unrolled, static table). I kept
it as simple as possible for now, assuming the compiler will ignore
shifts by zero. I didn't perform any detailed speedup analysis, but it's
around 50% faster.
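
Since the attachment is scrubbed in the archive, here is a minimal sketch of what an unrolled, static-table popcount of this kind can look like. It is an illustration only and not the contents of popcount2.patch: the names popcount_nibble_tab and popcount_lut4 are hypothetical, and num_set_bits is the loop from the quoted LXF patch, kept as a reference to compare against.

```c
#include <stdint.h>

/* Reference: bit-at-a-time loop from the quoted LXF demuxer patch. */
static int num_set_bits(uint32_t value)
{
    int ret;

    for (ret = 0; value; ret += (value & 1), value >>= 1);

    return ret;
}

/* Hypothetical name: number of set bits in each 4-bit value (16 bytes). */
static const uint8_t popcount_nibble_tab[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Hypothetical name: unrolled lookup, one table access per nibble.
 * The first term shifts by zero, which a compiler is expected to drop. */
static inline int popcount_lut4(uint32_t x)
{
    return popcount_nibble_tab[(x >>  0) & 0xF] +
           popcount_nibble_tab[(x >>  4) & 0xF] +
           popcount_nibble_tab[(x >>  8) & 0xF] +
           popcount_nibble_tab[(x >> 12) & 0xF] +
           popcount_nibble_tab[(x >> 16) & 0xF] +
           popcount_nibble_tab[(x >> 20) & 0xF] +
           popcount_nibble_tab[(x >> 24) & 0xF] +
           popcount_nibble_tab[(x >> 28) & 0xF];
}
```

Comparing popcount_lut4() against num_set_bits() over a range of inputs is a cheap way to check a table like this for typos.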
/Tomas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: popcount2.patch
Type: text/x-patch
Size: 1671 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100914/0967a5c9/attachment.bin>