[FFmpeg-devel] [PATCH] run level decode function for wma & wmapro

Michael Niedermayer michaelni
Fri Jun 19 20:17:20 CEST 2009


On Thu, Jun 18, 2009 at 05:32:06PM +0200, Sascha Sommer wrote:
> Hi,
> 
> On Mittwoch, 17. Juni 2009, Michael Niedermayer wrote:
> > On Fri, Jun 12, 2009 at 09:36:25AM +0200, Sascha Sommer wrote:
> > > Hi,
> > >
> > > On Donnerstag, 11. Juni 2009, Michael Niedermayer wrote:
> > > > On Thu, Jun 11, 2009 at 10:17:19PM +0200, Sascha Sommer wrote:
> > > > > Hi,
> > > > >
> > > > > attached patch adds a run level decode function that can be used for
> > > > > both wma12 and wmapro.
> > > >
> > > > could you split it into a patch factorizing the code out and one
> > > > changing it ?
> > > > so that code moving around is separate from functional changes ...
> > > >
> > > > [...]
> > >
> > > Attached patch changes the code to use floats so that it can be shared
> > > with wmapro (that reuses the coefficient decoding buffer as output
> > > buffer)
> >
> > what effect does this have on speed on
> > 1. normal desktop cpus
> > 2. fpu less cpus (limited to ones on which wma is useable of course)
> >
> > [...]
> 
> No idea about fpu less cpus. On my Pentium M 1.6 GHz, with gcc 4.2.1 and the
> tc316_3.wmv sample, I get the following:
> 
> int
> 29150 dezicycles in rl_decode, 1 runs, 0 skips
> 21955 dezicycles in rl_decode, 2 runs, 0 skips
> 177420 dezicycles in rescale, 1 runs, 0 skips
> 164600 dezicycles in rescale, 2 runs, 0 skips
> 24440 dezicycles in rl_decode, 4 runs, 0 skips
> 104967 dezicycles in rescale, 4 runs, 0 skips
> 25856 dezicycles in rl_decode, 8 runs, 0 skips
> 73722 dezicycles in rescale, 8 runs, 0 skips
> 52556 dezicycles in rl_decode, 16 runs, 0 skips
> 115481 dezicycles in rescale, 16 runs, 0 skips
> 58356 dezicycles in rl_decode, 32 runs, 0 skips
> 123329 dezicycles in rescale, 32 runs, 0 skips
> 57448 dezicycles in rl_decode, 64 runs, 0 skips
> 111245 dezicycles in rescale, 64 runs, 0 skips
> 50067 dezicycles in rl_decode, 128 runs, 0 skips
> 97719 dezicycles in rescale, 128 runs, 0 skips
> 45674 dezicycles in rl_decode, 256 runs, 0 skips
> 95783 dezicycles in rescale, 256 runs, 0 skips
> 
> float
> 20210 dezicycles in rl_decode, 1 runs, 0 skips
> 14300 dezicycles in rl_decode, 2 runs, 0 skips
> 145320 dezicycles in rescale, 1 runs, 0 skips
> 137050 dezicycles in rescale, 2 runs, 0 skips
> 20035 dezicycles in rl_decode, 4 runs, 0 skips
> 87932 dezicycles in rescale, 4 runs, 0 skips
> 22455 dezicycles in rl_decode, 8 runs, 0 skips
> 62375 dezicycles in rescale, 8 runs, 0 skips
> 44031 dezicycles in rl_decode, 16 runs, 0 skips
> 94960 dezicycles in rescale, 16 runs, 0 skips
> 49922 dezicycles in rl_decode, 32 runs, 0 skips
> 102565 dezicycles in rescale, 32 runs, 0 skips
> 49965 dezicycles in rl_decode, 64 runs, 0 skips
> 93084 dezicycles in rescale, 64 runs, 0 skips
> 44103 dezicycles in rl_decode, 128 runs, 0 skips
> 82231 dezicycles in rescale, 128 runs, 0 skips
> 40307 dezicycles in rl_decode, 256 runs, 0 skips
> 81798 dezicycles in rescale, 256 runs, 0 skips
> 
> int16
> 20620 dezicycles in rl_decode, 1 runs, 0 skips
> 15905 dezicycles in rl_decode, 2 runs, 0 skips
> 147660 dezicycles in rescale, 1 runs, 0 skips
> 140955 dezicycles in rescale, 2 runs, 0 skips
> 25125 dezicycles in rl_decode, 4 runs, 0 skips
> 91125 dezicycles in rescale, 4 runs, 0 skips
> 24475 dezicycles in rl_decode, 8 runs, 0 skips
> 64755 dezicycles in rescale, 8 runs, 0 skips
> 43356 dezicycles in rl_decode, 16 runs, 0 skips
> 91768 dezicycles in rescale, 16 runs, 0 skips
> 47933 dezicycles in rl_decode, 32 runs, 0 skips
> 96890 dezicycles in rescale, 32 runs, 0 skips
> 47413 dezicycles in rl_decode, 64 runs, 0 skips
> 87231 dezicycles in rescale, 64 runs, 0 skips
> 41867 dezicycles in rl_decode, 128 runs, 0 skips
> 76848 dezicycles in rescale, 128 runs, 0 skips
> 38301 dezicycles in rl_decode, 256 runs, 0 skips
> 75213 dezicycles in rescale, 256 runs, 0 skips
> 
> The current code is the fastest, but 16 bits are not enough for wmapro.

could you change things to a COEF_TYPE or something that can be
changed at compile time?
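
A rough sketch of what such a compile-time switch could look like. This
is not the patch under review: the COEF_USE_INT16 define, the rl_expand()
name and its signature are made up for illustration, and the loop is a
simplified run/level expansion without any bitstream reading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical switch: build with -DCOEF_USE_INT16 to get 16-bit
 * coefficients, otherwise floats are used. */
#ifdef COEF_USE_INT16
typedef int16_t COEF_TYPE;
#else
typedef float COEF_TYPE;
#endif

/* Simplified run/level expansion (illustrative only): for each pair,
 * skip `run` zero coefficients, then store `level` at that position.
 * Returns the position after the last written coefficient, or -1 if a
 * pair would write past the end of the buffer. */
int rl_expand(const int *runs, const int *levels, int n_pairs,
              COEF_TYPE *coefs, int n_coefs)
{
    int pos = 0;

    assert(runs && levels && coefs);

    for (int i = 0; i < n_pairs; i++) {
        pos += runs[i];
        if (pos >= n_coefs)
            return -1;
        coefs[pos++] = (COEF_TYPE)levels[i];
    }
    return pos;
}
```

With something like this the same source could be built once per
coefficient type and benchmarked, while wmapro would pick the float
variant.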

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Frequently ignored answer #1: FFmpeg bugs should be sent to our bugtracker. User
questions about the command line tools should be sent to the ffmpeg-user ML.
And questions about how to use libav* should be sent to the libav-user ML.