[FFmpeg-devel] Fix MMX dct_quantize for non zigzag_direct scans
Michael Niedermayer
michaelni
Fri May 16 18:32:41 CEST 2008
On Fri, May 16, 2008 at 12:55:03AM +0100, Ramiro Polla wrote:
> Hi,
>
> Michael Niedermayer wrote:
> > On Wed, May 14, 2008 at 11:25:43PM +0100, Ramiro Polla wrote:
[...]
> >> Benchmarking with START/STOP_TIMER isn't very good since the runs can vary
> >> on the time they take depending on last_non_zero. Also the patch changes
> >> not only the MMX code but removes the hack in mpegvideo_enc.c.
> >
> > *_TIMER around the macroblock encode function seems an option
> > also the automatic threshold calculation could be replaced by a constant
> > to avoid excessive skips.
>
> I added *_TIMERs in first and last line of encode_mb_internal. These are
> the results:
>
> around 1048576 runs
> ref new
> time skips time skips
> 42612 3033 44422 2889
> 42551 2981 43440 2712
> 42553 2846 43711 2991
> 41825 2895 44401 3036
> 41812 2925 43691 2878
> 41792 2858 43137 2873
> 42458 2973 43482 2945
> 42388 3078 43553 2766
> 42366 2888 43450 2987
> 42721 3003 43163 2996
> 42210 2964 43956 2954
> 42299 43674 avg
> 341.8448 433.0284 stdev
the skip numbers are too high
try:
-if(tcount<2 || tend - tstart < FFMAX(32*tsum/tcount, 200000)){\
+if(tcount<2 || tend - tstart < FFMAX(32*tsum/tcount, 5000000)){\
or higher
>
> >> The changes for direct zigzag are basically:
> >> - read inverse from context instead of from static array. (shouldn't slow
> >> anything down)
> >> - remove an if(s->alternate_scan) from mpegvideo_enc.c
> >> - add an if(s->intra_scantable.scantable == ff_zigzag_direct) to MMX code.
> >
> > ill probably ok this if the benchmarks are inconclusive ...
> >
> >
> >
> >> I think this is a good solution that shows no measurable speed loss. I can
> >> test more if someone knows a way to make incredibly accurate benchmarking
> >> (<0.1% stdev).
> >>
> >> We could also have two functions of each, one hardcoded to direct zigzag
> >> and another more generic, that can be set in MPV_common_init_mmx() (under
> >> #ifdef CONFIG_SMALL). I attached an example of what could be done. Another
> >> way would be if my first patch is accepted, have an av_always_inline
> >> dct_quantize_xxx_template(..., direct_zigzag), and create
> >> dct_quantize_xxx_direct and dct_quantize_xxx_normal.
> >
> > Iam not in favor of the extra complexity.
> > At least not without clear benchmarks & object file sizes showing that this
> > is worth it ...
> > Also as already said zigzag_direct is used 99% of the time so the other
> > cases simply do not look very relevant to me ...
>
> Is this patch is not accepted, is it ok to duplicate the hack to check
> for last_nonzero_coeff in the mimic encoder?
yes
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The worst form of inequality is to try to make unequal things equal.
-- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080516/baf2d932/attachment.pgp>
More information about the ffmpeg-devel
mailing list