[FFmpeg-devel] Fix MMX dct_quantize for non zigzag_direct scans

Fri May 16 18:32:41 CEST 2008

On Fri, May 16, 2008 at 12:55:03AM +0100, Ramiro Polla wrote:
> Hi,
> 
> Michael Niedermayer wrote:
> > On Wed, May 14, 2008 at 11:25:43PM +0100, Ramiro Polla wrote:
[...]
> >> Benchmarking with START/STOP_TIMER isn't very good since the runs can vary 
> >> on the time they take depending on last_non_zero. Also the patch changes 
> >> not only the MMX code but removes the hack in mpegvideo_enc.c.
> > 
> > *_TIMER around the macroblock encode function seems an option
> > also the automatic threshold calculation could be replaced by a constant
> > to avoid excessive skips.
> 
> I added *_TIMERs in first and last line of encode_mb_internal. These are 
> the results:
> 
> around 1048576 runs
> ref          new
> time   skips time   skips
> 42612  3033  44422  2889
> 42551  2981  43440  2712
> 42553  2846  43711  2991
> 41825  2895  44401  3036
> 41812  2925  43691  2878
> 41792  2858  43137  2873
> 42458  2973  43482  2945
> 42388  3078  43553  2766
> 42366  2888  43450  2987
> 42721  3003  43163  2996
> 42210  2964  43956  2954
> 42299        43674        avg
> 341.8448     433.0284     stdev

the skip numbers are too high

try:
-if(tcount<2 || tend - tstart < FFMAX(32*tsum/tcount, 200000)){\
+if(tcount<2 || tend - tstart < FFMAX(32*tsum/tcount, 5000000)){\
or higher


> 
> >> The changes for direct zigzag are basically:
> >> - read inverse from context instead of from static array. (shouldn't slow 
> >> anything down)
> >> - remove an if(s->alternate_scan) from mpegvideo_enc.c
> >> - add an if(s->intra_scantable.scantable == ff_zigzag_direct) to MMX code.
> > 
> > ill probably ok this if the benchmarks are inconclusive ...
> > 
> > 
> > 
> >> I think this is a good solution that shows no measurable speed loss. I can 
> >> test more if someone knows a way to make incredibly accurate benchmarking 
> >> (<0.1% stdev).
> >>
> >> We could also have two functions of each, one hardcoded to direct zigzag 
> >> and another more generic, that can be set in MPV_common_init_mmx() (under 
> >> #ifdef CONFIG_SMALL). I attached an example of what could be done. Another 
> >> way would be if my first patch is accepted, have an av_always_inline 
> >> dct_quantize_xxx_template(..., direct_zigzag), and create 
> >> dct_quantize_xxx_direct and dct_quantize_xxx_normal.
> > 
> > Iam not in favor of the extra complexity.
> > At least not without clear benchmarks & object file sizes showing that this
> > is worth it ...
> > Also as already said zigzag_direct is used 99% of the time so the other
> > cases simply do not look very relevant to me ...
> 
> Is this patch is not accepted, is it ok to duplicate the hack to check 
> for last_nonzero_coeff in the mimic encoder? 

yes

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080516/baf2d932/attachment.pgp>