[FFmpeg-devel] MPEG-2 Acceleration Refactor

Greg Hulands ghulands
Sun Jun 17 03:17:21 CEST 2007


Hi Michael,

On 16/06/2007, at 3:32 PM, Michael Niedermayer wrote:

> Hi
>
> On Sat, Jun 16, 2007 at 11:45:43AM -0700, Greg Hulands wrote:
> [...]
>>>> bench: utime=14.061s
>>>
>>> a single benchmark run is useless, 5 is minimum
>>> putting START/STOP_TIMER around the call to the changed function
>>> would also be a good idea
>>>
>>> also dont use --disable-mmx
>>
>> I did run the tests 5 times as a friend suggested and they were all
>> in the same ballpark, both in total time and the difference between
>> them, so I just put in the results for the last benchmark.
>
> so the patch slows the code down by 1% ?
> if so its rejected

0.73%, but my guess is that means you will still reject it :)

Ok, I have put in the START/STOP_TIMER stuff and the results are  
below. I ran 5 tests each with and without the patch. The patch is  
actually _faster_ so I'm not sure why the timings for the original  
benchmarking showed it being slower. Maybe you would have more of an  
idea than me.

Without Patch

Tiger1080:~/ffmpeg ghulands$ ./ffmpeg -benchmark -threads 1 -i ~/Desktop/720p-short.m2v -f rawvideo -y /dev/null -v 5
FFmpeg version SVN-r9329, Copyright (c) 2000-2007 Fabrice Bellard, et  
al.
   configuration: --disable-ffserver --disable-mmx --enable-pthreads
   libavutil version: 49.4.0
   libavcodec version: 51.40.4
   libavformat version: 51.12.1
   built on Jun 15 2007 19:46:35, gcc: 4.0.1 (Apple Computer, Inc.  
build 5367)

Seems stream 0 codec frame rate differs from container frame rate:  
59.94 (60000/1001) -> 25.00 (25/1)
Input #0, mpegvideo, from '/Users/ghulands/Desktop/720p-short.m2v':
   Duration: 00:00:08.0, start: 0.000000, bitrate: 38867 kb/s
   Stream #0.0: Video: mpeg2video, yuv420p, 1280x720, 38810 kb/s,  
25.00 fps(r)
Output #0, rawvideo, to '/dev/null':
   Stream #0.0, 1/90000: Video: rawvideo, yuv420p, 1280x720, 1/25,  
q=2-31, 200 kb/s, 25.00 fps(c)
Stream mapping:
   Stream #0.0 -> #0.0
Press [q] to stop encoding
40541 dezicycles in mpeg_decode_mb(), 4096 runs, 0 skips
4156 dezicycles in mpeg2_decode_block_intra(), 32753 runs, 15 skips
13440 dezicycles in mpeg2_decode_block_non_intra(), 1 runs, 0 skips
16620 dezicycles in mpeg2_decode_block_non_intra(), 2 runs, 0 skips
11520 dezicycles in mpeg2_decode_block_non_intra(), 4 runs, 0 skips
8205 dezicycles in mpeg2_decode_block_non_intra(), 8 runs, 0 skips
6802 dezicycles in mpeg2_decode_block_non_intra(), 16 runs, 0 skips
5501 dezicycles in mpeg2_decode_block_non_intra(), 32 runs, 0 skips
4741 dezicycles in mpeg2_decode_block_non_intra(), 64 runs, 0 skips
4163 dezicycles in mpeg2_decode_block_non_intra(), 128 runs, 0 skips
4046 dezicycles in mpeg2_decode_block_non_intra(), 256 runs, 0 skips
3769 dezicycles in mpeg2_decode_block_non_intra(), 512 runs, 0 skips
3669 dezicycles in mpeg2_decode_block_non_intra(), 1023 runs, 1 skips
3299 dezicycles in mpeg2_decode_block_non_intra(), 2046 runs, 2 skips
3016 dezicycles in mpeg2_decode_block_non_intra(), 4092 runs, 4 skips
36054 dezicycles in mpeg_decode_mb(), 8173 runs, 19 skips
2980 dezicycles in mpeg2_decode_block_non_intra(), 8186 runs, 6 skips
2774 dezicycles in mpeg2_decode_block_non_intra(), 16373 runs, 11 skips
26346 dezicycles in mpeg_decode_mb(), 16358 runs, 26 skips
2311 dezicycles in mpeg2_decode_block_non_intra(), 32752 runs, 16 skips
19528 dezicycles in mpeg_decode_mb(), 32724 runs, 44 skips
2013 dezicycles in mpeg2_decode_block_non_intra(), 65518 runs, 18 skips
3813 dezicycles in mpeg2_decode_block_intra(), 65515 runs, 21 skips
1864 dezicycles in mpeg2_decode_block_non_intra(), 131046 runs, 26 skips
17136 dezicycles in mpeg_decode_mb(), 65469 runs, 67 skips
frame=   34 fps=  0 q=0.0 size=   45900kB time=1.4 bitrate=276480.0kbits/s dup=0
1794 dezicycles in mpeg2_decode_block_non_intra(), 262094 runs, 50 skips
15010 dezicycles in mpeg_decode_mb(), 130965 runs, 107 skips
3316 dezicycles in mpeg2_decode_block_intra(), 131037 runs, 35 skips
frame=   71 fps= 68 q=0.0 size=   95850kB time=2.8 bitrate=276480.0kbits/s dup=0
1721 dezicycles in mpeg2_decode_block_non_intra(), 524214 runs, 74 skips
14539 dezicycles in mpeg_decode_mb(), 261997 runs, 147 skips
frame=  107 fps= 69 q=0.0 size=  144450kB time=4.3 bitrate=276480.0kbits/s dup=0
3324 dezicycles in mpeg2_decode_block_intra(), 262085 runs, 59 skips
frame=  144 fps= 70 q=0.0 size=  194400kB time=5.8 bitrate=276480.0kbits/s dup=0
1709 dezicycles in mpeg2_decode_block_non_intra(), 1048444 runs, 132 skips
14091 dezicycles in mpeg_decode_mb(), 524060 runs, 228 skips
frame=  180 fps= 70 q=0.0 size=  243000kB time=7.2 bitrate=276480.0kbits/s dup=0
frame=  216 fps= 71 q=0.0 size=  291600kB time=8.6 bitrate=276480.0kbits/s dup=0
3247 dezicycles in mpeg2_decode_block_intra(), 524142 runs, 146 skips
frame=  252 fps= 71 q=0.0 size=  340200kB time=10.1 bitrate=276480.0kbits/s dup=
frame=  288 fps= 71 q=0.0 size=  388800kB time=11.5 bitrate=276480.0kbits/s dup=
1693 dezicycles in mpeg2_decode_block_non_intra(), 2096921 runs, 231 skips
frame=  325 fps= 71 q=0.0 size=  438750kB time=13.0 bitrate=276480.0kbits/s dup=
13997 dezicycles in mpeg_decode_mb(), 1048113 runs, 463 skips
frame=  361 fps= 71 q=0.0 size=  487350kB time=14.4 bitrate=276480.0kbits/s dup=
frame=  398 fps= 71 q=0.0 size=  537300kB time=15.9 bitrate=276480.0kbits/s dup=0
frame=  432 fps= 71 q=0.0 size=  583200kB time=17.3 bitrate=276480.0kbits/s du
3253 dezicycles in mpeg2_decode_block_intra(), 1048296 runs, 280 skips
1686 dezicycles in mpeg2_decode_block_non_intra(), 4193795 runs, 509 skips
13996 dezicycles in mpeg_decode_mb(), 2096084 runs, 1068 skips
3243 dezicycles in mpeg2_decode_block_intra(), 2096532 runs, 620 skips
1705 dezicycles in mpeg2_decode_block_non_intra(), 8387615 runs, 993 skips
frame= 1244 fps= 71 q=0.0 Lsize= 1679400kB time=49.8  
bitrate=276480.0kbits/s dup=0 drop=0
video:1679400kB audio:0kB global headers:0kB muxing overhead 0.000000%
bench: utime=15.352s
bench: utime=15.262s
bench: utime=15.251s
bench: utime=15.261s
bench: utime=15.258s
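
On every timer line above, runs plus skips adds up to a power of two, so each line looks like a running average printed at doubling intervals, and the last line per function should be the most representative one. A minimal sketch for pulling out that last reading, assuming the "N dezicycles in func(), R runs, S skips" layout seen in this log (the helper name last_readings is made up for illustration, and the sample is five lines copied from the output above):

```python
import re

# Matches lines such as "17136 dezicycles in mpeg_decode_mb(), 65469 runs, 67 skips".
TIMER_RE = re.compile(r"(\d+) dezicycles in (\w+)\(\), (\d+) runs, (\d+) skips")


def last_readings(log_text):
    """Return {function: (dezicycles, runs, skips)}, keeping the latest line per function."""
    readings = {}
    for m in TIMER_RE.finditer(log_text):
        dezi, func, runs, skips = m.group(1, 2, 3, 4)
        readings[func] = (int(dezi), int(runs), int(skips))  # later lines overwrite earlier ones
    return readings


# Five timer lines copied from the unpatched run.
sample = """\
19528 dezicycles in mpeg_decode_mb(), 32724 runs, 44 skips
2013 dezicycles in mpeg2_decode_block_non_intra(), 65518 runs, 18 skips
3813 dezicycles in mpeg2_decode_block_intra(), 65515 runs, 21 skips
1864 dezicycles in mpeg2_decode_block_non_intra(), 131046 runs, 26 skips
17136 dezicycles in mpeg_decode_mb(), 65469 runs, 67 skips
"""

final = last_readings(sample)
for func, (dezi, runs, skips) in sorted(final.items()):
    total = runs + skips
    # (total & (total - 1)) == 0 is true when total is a power of two.
    print(f"{func}: {dezi} dezicycles, {runs} runs, {skips} skips, "
          f"power-of-two total: {(total & (total - 1)) == 0}")
```

Feeding the whole pasted log through the same function should leave only the final line for each of the three timed functions.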

With Patch

Tiger1080:~/htp/trunk/ffmpeg ghulands$ ./ffmpeg -benchmark -threads 1  
-i ~/Desktop/720p-short.m2v -f rawvideo -y /dev/null -v 5
FFmpeg version SVN-r9339, Copyright (c) 2000-2007 Fabrice Bellard, et  
al.
   configuration: --disable-ffserver --disable-mmx --enable-pthreads
   libavutil version: 49.4.0
   libavcodec version: 51.40.4
   libavformat version: 51.12.1
   built on Jun 16 2007 17:46:46, gcc: 4.0.1 (Apple Computer, Inc.  
build 5367)

Seems stream 0 codec frame rate differs from container frame rate:  
59.94 (60000/1001) -> 25.00 (25/1)
Input #0, mpegvideo, from '/Users/ghulands/Desktop/720p-short.m2v':
   Duration: 00:00:08.0, start: 0.000000, bitrate: 38867 kb/s
   Stream #0.0: Video: mpeg2video, yuv420p, 1280x720, 38810 kb/s,  
25.00 fps(r)
Output #0, rawvideo, to '/dev/null':
   Stream #0.0, 1/90000: Video: rawvideo, yuv420p, 1280x720, 1/25,  
q=2-31, 200 kb/s, 25.00 fps(c)
Stream mapping:
   Stream #0.0 -> #0.0
Press [q] to stop encoding
31781 dezicycles in mpeg_decode_mb(), 4095 runs, 1 skips
3727 dezicycles in mpeg2_decode_block_intra(), 32762 runs, 6 skips
10440 dezicycles in mpeg2_decode_block_non_intra(), 1 runs, 0 skips
11280 dezicycles in mpeg2_decode_block_non_intra(), 2 runs, 0 skips
7260 dezicycles in mpeg2_decode_block_non_intra(), 4 runs, 0 skips
5145 dezicycles in mpeg2_decode_block_non_intra(), 8 runs, 0 skips
3885 dezicycles in mpeg2_decode_block_non_intra(), 16 runs, 0 skips
3026 dezicycles in mpeg2_decode_block_non_intra(), 32 runs, 0 skips
2471 dezicycles in mpeg2_decode_block_non_intra(), 64 runs, 0 skips
2116 dezicycles in mpeg2_decode_block_non_intra(), 128 runs, 0 skips
2038 dezicycles in mpeg2_decode_block_non_intra(), 256 runs, 0 skips
1969 dezicycles in mpeg2_decode_block_non_intra(), 512 runs, 0 skips
2012 dezicycles in mpeg2_decode_block_non_intra(), 1023 runs, 1 skips
1978 dezicycles in mpeg2_decode_block_non_intra(), 2047 runs, 1 skips
1995 dezicycles in mpeg2_decode_block_non_intra(), 4095 runs, 1 skips
30173 dezicycles in mpeg_decode_mb(), 8173 runs, 19 skips
2082 dezicycles in mpeg2_decode_block_non_intra(), 8190 runs, 2 skips
2030 dezicycles in mpeg2_decode_block_non_intra(), 16379 runs, 5 skips
22137 dezicycles in mpeg_decode_mb(), 16360 runs, 24 skips
1904 dezicycles in mpeg2_decode_block_non_intra(), 32758 runs, 10 skips
17206 dezicycles in mpeg_decode_mb(), 32739 runs, 29 skips
1795 dezicycles in mpeg2_decode_block_non_intra(), 65525 runs, 11 skips
3616 dezicycles in mpeg2_decode_block_intra(), 65519 runs, 17 skips
1739 dezicycles in mpeg2_decode_block_non_intra(), 131058 runs, 14 skips
15762 dezicycles in mpeg_decode_mb(), 65498 runs, 38 skips
1719 dezicycles in mpeg2_decode_block_non_intra(), 262123 runs, 21 skips
14130 dezicycles in mpeg_decode_mb(), 131027 runs, 45 skips
3230 dezicycles in mpeg2_decode_block_intra(), 131042 runs, 30 skips
1671 dezicycles in mpeg2_decode_block_non_intra(), 524254 runs, 34 skips
13927 dezicycles in mpeg_decode_mb(), 262070 runs, 74 skips
3292 dezicycles in mpeg2_decode_block_intra(), 262099 runs, 45 skips
1671 dezicycles in mpeg2_decode_block_non_intra(), 1048519 runs, 57 skips
13585 dezicycles in mpeg_decode_mb(), 524175 runs, 113 skips
3242 dezicycles in mpeg2_decode_block_intra(), 524203 runs, 85 skips
1660 dezicycles in mpeg2_decode_block_non_intra(), 2097053 runs, 99 skips
13528 dezicycles in mpeg_decode_mb(), 1048393 runs, 183 skips
3257 dezicycles in mpeg2_decode_block_intra(), 1048424 runs, 152 skips
1654 dezicycles in mpeg2_decode_block_non_intra(), 4194089 runs, 215 skips
13517 dezicycles in mpeg_decode_mb(), 2096762 runs, 390 skips
3248 dezicycles in mpeg2_decode_block_intra(), 2096863 runs, 289 skips
1674 dezicycles in mpeg2_decode_block_non_intra(), 8388223 runs, 385 skips
frame= 1244 fps= 73 q=0.0 Lsize= 1679400kB time=49.8  
bitrate=276480.0kbits/s dup=0 drop=0
video:1679400kB audio:0kB global headers:0kB muxing overhead 0.000000%
bench: utime=15.080s
bench: utime=15.120s
bench: utime=15.137s
bench: utime=15.122s
bench: utime=15.135s
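
To put a number on "faster", here is a small sketch that averages the five utime values from each set and also compares the last timer reading per function. All figures are copied from the two logs above; the helper name percent_change is only for illustration.

```python
from statistics import mean

# "bench: utime" values in seconds, copied from the two sets of runs above.
without_patch = [15.352, 15.262, 15.251, 15.261, 15.258]
with_patch = [15.080, 15.120, 15.137, 15.122, 15.135]


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means new is smaller)."""
    return (new - old) / old * 100.0


utime_change = percent_change(mean(without_patch), mean(with_patch))
# True when even the slowest patched run beats the fastest unpatched run.
no_overlap = max(with_patch) < min(without_patch)

# Last dezicycle reading per function: (without patch, with patch).
final_timers = {
    "mpeg_decode_mb": (13996, 13517),
    "mpeg2_decode_block_intra": (3243, 3248),
    "mpeg2_decode_block_non_intra": (1705, 1674),
}
timer_change = {f: percent_change(old, new) for f, (old, new) in final_timers.items()}

print(f"mean utime: {mean(without_patch):.4f}s -> {mean(with_patch):.4f}s "
      f"({utime_change:+.2f}%)")
print(f"every patched run faster than every unpatched run: {no_overlap}")
for func, change in timer_change.items():
    print(f"{func}: {change:+.2f}%")
```

By these numbers the patched build uses roughly 1% less user time, and its slowest run is still faster than the fastest unpatched run. Both builds were configured with --disable-mmx and come from different SVN revisions (r9329 and r9339), so the comparison is only a rough indication.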


Cheers,
Greg




