[Ffmpeg-devel] patch: altivec optimizations for h264 decoder

Mauricio Alvarez alvarez
Mon Feb 6 11:24:14 CET 2006


Hi all

As part of my academic research on architectures for video decoding, I
am optimizing the H.264 decoder for the PPC architecture using AltiVec,
and I would like to submit these optimizations back to the FFmpeg
project.

I have implemented the following functions:
- luma motion compensation for 8x8 and 4x4 pixel blocks
- chroma motion compensation for 4x4 pixel blocks
- inverse transforms: 8x8 and 4x4

i) For the 4x4 inverse transform I have implemented two versions. The
first one, ff_h264_idct_add_altivec, implements the transform with the
same algorithm as the C version. The second one,
ff_h264_idct_add_altivec_mat, implements the optimized matrix-multiply
algorithm described in the paper by Chen et al. [1]. In AltiVec, the
second (matrix) version has a speed-up of 2.95 over the C version, while
the first version has 1.55.
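
For readers unfamiliar with the transform, here is a minimal scalar sketch
of the H.264 4x4 inverse transform in its usual butterfly form (rows first,
then columns, then rounding and adding the residual to the prediction). It
is not the code from the patch and not the matrix variant from [1]; the
names idct4x4_add_sketch and clip_u8 are hypothetical, and the block is
assumed to be stored row-major with 4 coefficients per row.

```c
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/*
 * Hypothetical scalar reference, not the patch's code.
 * block: 16 coefficients, row-major, modified in place.
 * dst:   4x4 prediction pixels with the given stride; the residual is added.
 * Assumes >> on negative ints is an arithmetic shift (true on common compilers).
 */
void idct4x4_add_sketch(uint8_t *dst, int16_t *block, int stride)
{
    int i;

    block[0] += 32; /* rounding term for the final >> 6 */

    /* 1-D butterfly on each row */
    for (i = 0; i < 4; i++) {
        int16_t *r = block + 4 * i;
        const int z0 = r[0] + r[2];
        const int z1 = r[0] - r[2];
        const int z2 = (r[1] >> 1) - r[3];
        const int z3 = r[1] + (r[3] >> 1);

        r[0] = (int16_t)(z0 + z3);
        r[1] = (int16_t)(z1 + z2);
        r[2] = (int16_t)(z1 - z2);
        r[3] = (int16_t)(z0 - z3);
    }

    /* 1-D butterfly on each column, then scale, add and clip */
    for (i = 0; i < 4; i++) {
        const int z0 = block[i] + block[i + 8];
        const int z1 = block[i] - block[i + 8];
        const int z2 = (block[i + 4] >> 1) - block[i + 12];
        const int z3 = block[i + 4] + (block[i + 12] >> 1);

        dst[i + 0 * stride] = clip_u8(dst[i + 0 * stride] + ((z0 + z3) >> 6));
        dst[i + 1 * stride] = clip_u8(dst[i + 1 * stride] + ((z1 + z2) >> 6));
        dst[i + 2 * stride] = clip_u8(dst[i + 2 * stride] + ((z1 - z2) >> 6));
        dst[i + 3 * stride] = clip_u8(dst[i + 3 * stride] + ((z0 - z3) >> 6));
    }
}
```

With only the DC coefficient set to 64, every pixel of the 4x4 block is
raised by exactly 1, which makes a quick sanity check for a vectorized
version.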

ii) The AltiVec implementation of the 8x8 luma motion compensation has a
speed-up of 2.12 over the C version, and the 4x4 one has 1.30.
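
As background, the core of H.264 luma motion compensation is the 6-tap
half-sample filter with taps (1, -5, 20, 20, -5, 1). The following is a
minimal scalar sketch of the horizontal case only, assuming the caller
provides the extra border pixels; luma_hpel_h_sketch and clip_pixel are
hypothetical names and this is not the code from the patch.

```c
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range. */
static uint8_t clip_pixel(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/*
 * Hypothetical scalar reference, not the patch's code.
 * dst[x] receives the horizontal half-sample between src[x] and src[x + 1].
 * The caller must guarantee that src[-2] .. src[width + 2] are readable.
 * Assumes >> on negative ints is an arithmetic shift (true on common compilers).
 */
void luma_hpel_h_sketch(uint8_t *dst, const uint8_t *src, int width)
{
    int x;

    for (x = 0; x < width; x++) {
        const int sum = src[x - 2] - 5 * src[x - 1] + 20 * src[x]
                      + 20 * src[x + 1] - 5 * src[x + 2] + src[x + 3];

        /* the taps sum to 32, so round with +16 and divide by 32 */
        dst[x] = clip_pixel((sum + 16) >> 5);
    }
}
```

The six taps touch neighbouring pixels that a SIMD version has to load and
align, which is one plausible reason why small blocks gain less than large
ones.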

iii) The 4x4 chroma motion compensation has a speed-up of 1.85, again
compared with the C version.
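
For illustration, H.264 chroma motion compensation is a bilinear
interpolation with eighth-sample weights. The following is a minimal scalar
sketch for a block 4 pixels wide; chroma_mc4_sketch is a hypothetical name,
x and y are assumed to be the fractional offsets in the range 0..7, and this
is not the code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical scalar reference, not the patch's code.
 * Interpolates a block 4 pixels wide and h rows high.
 * x, y: fractional offsets in 1/8 sample units, each in 0..7.
 * Reads a 5 x (h + 1) pixel area starting at src.
 */
void chroma_mc4_sketch(uint8_t *dst, const uint8_t *src, int stride,
                       int h, int x, int y)
{
    /* bilinear weights of the four neighbours; they always sum to 64 */
    const int wa = (8 - x) * (8 - y);
    const int wb = x * (8 - y);
    const int wc = (8 - x) * y;
    const int wd = x * y;
    int i, j;

    for (j = 0; j < h; j++) {
        for (i = 0; i < 4; i++) {
            const int sum = wa * src[i]          + wb * src[i + 1]
                          + wc * src[i + stride] + wd * src[i + stride + 1];

            /* result stays within 0..255, so no clipping is needed */
            dst[i] = (uint8_t)((sum + 32) >> 6);
        }
        dst += stride;
        src += stride;
    }
}
```

With x = 0 and y = 0 the function reduces to a plain copy of the source
block.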

iv) I have run the regression tests and the new optimizations pass them.
I have also decoded some videos [2] coded with the JM and x264 encoders
at HD resolution, and all of them decode correctly.
The speed-ups for the sequences used are given in the following table:

Coding options:
- resolution: 1920x1088p25,
- profile: main, level: 5.0
- qp for I,P slices: 22
- qp for B slices: 24
- coded sequence: I-P-B-B-P-B-B
- direct mode: temporal
- Weighted prediction


sequence      ffmpeg-cvs   ffmpeg-patch   speed-up
              time [s]     time [s]       [%]
pedestrian    11.89        10.15          17.14
riverbed      19.11        17.73           7.78
blue sky      11.33        10.13          11.85
rush hour     12.34        11.24           9.79
AVG                                       11.64
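
The percentages in the table are consistent with computing the speed-up as
the ratio of the two decoding times minus one (not as the fraction of time
saved). A small sketch of that calculation, with speedup_percent as a
hypothetical helper name:

```c
/*
 * Hypothetical helper: percentage speed-up of a patched run over a
 * reference run, defined as (t_ref / t_patch - 1) * 100.
 */
double speedup_percent(double t_ref, double t_patch)
{
    return (t_ref / t_patch - 1.0) * 100.0;
}
```

For example, the pedestrian sequence gives (11.89 / 10.15 - 1) * 100, which
rounds to the 17.14 % reported above.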

I hope the patch is acceptable to the FFmpeg developers. Any comments or
suggestions to improve it are welcome.

Mauricio Alvarez
Department of Computer Architecture
Universitat Politècnica de Catalunya
Barcelona-Spain.

[1] Yen-Kuang Chen, Eric Q. Li, Xiaosong Zhou, and Steven Ge.
Implementation of H.264 Encoder and Decoder on Personal Computers.
Journal of Visual Communication and Image Representation, July 2005.

[2] Mpeg test sequences at HD resolution
http://www.ldv.ei.tum.de/liquid.php?page=70

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch1.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060206/f997a9cd/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch2.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060206/f997a9cd/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch3.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060206/f997a9cd/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch4.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060206/f997a9cd/attachment-0003.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch5.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060206/f997a9cd/attachment-0004.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch6.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20060206/f997a9cd/attachment-0005.txt>


