[FFmpeg-cvslog] avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H
zjh8890
git at videolan.org
Sun Dec 13 01:47:18 CET 2015
ffmpeg | branch: release/2.8 | zjh8890 <243186085 at qq.com> | Sun Nov 22 00:07:35 2015 +0800| [cd83f899c94f691b045697d12efa21f83eb2329f] | committer: Michael Niedermayer
avcodec/aarch64/neon.S: Update neon.s for transpose_4x4H
The transpose_4x4H macro is wrong: the operand order of r2 and r3 in the second trn1/trn2 pair is swapped.
This bug cost me a lot of time to track down while I was writing AArch64 assembly that uses the macro.
(cherry picked from commit c18176bd551b4616757080376707637e30547fd0)
Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=cd83f899c94f691b045697d12efa21f83eb2329f
---
libavcodec/aarch64/neon.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavcodec/aarch64/neon.S b/libavcodec/aarch64/neon.S
index 619aec6..a227cbd 100644
--- a/libavcodec/aarch64/neon.S
+++ b/libavcodec/aarch64/neon.S
@@ -107,8 +107,8 @@
.macro transpose_4x4H r0, r1, r2, r3, r4, r5, r6, r7
trn1 \r4\().4H, \r0\().4H, \r1\().4H
trn2 \r5\().4H, \r0\().4H, \r1\().4H
- trn1 \r7\().4H, \r3\().4H, \r2\().4H
- trn2 \r6\().4H, \r3\().4H, \r2\().4H
+ trn1 \r7\().4H, \r2\().4H, \r3\().4H
+ trn2 \r6\().4H, \r2\().4H, \r3\().4H
trn1 \r0\().2S, \r4\().2S, \r7\().2S
trn2 \r3\().2S, \r4\().2S, \r7\().2S
trn1 \r1\().2S, \r5\().2S, \r6\().2S
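
To see why the operand order matters, here is a minimal scalar sketch of the instructions shown in the hunk. It is not FFmpeg code: the helpers trn1, trn2, as_2s, as_4h and first_two_rows are hypothetical names made up for illustration, and the model covers only the two trn1 .2S results written to r0 and r1. It assumes the architectural TRN1/TRN2 behaviour, where TRN1 interleaves the even-indexed lanes of its two sources and TRN2 the odd-indexed lanes.

```python
# Scalar model of AArch64 TRN1/TRN2 (illustrative only, not FFmpeg code).

def trn1(n, m):
    """TRN1: interleave the even-indexed lanes of n and m."""
    out = []
    for i in range(0, len(n), 2):
        out += [n[i], m[i]]
    return out

def trn2(n, m):
    """TRN2: interleave the odd-indexed lanes of n and m."""
    out = []
    for i in range(1, len(n), 2):
        out += [n[i], m[i]]
    return out

def as_2s(v):
    """View four 16-bit lanes (.4H) as two 32-bit lanes (.2S)."""
    return [(v[0], v[1]), (v[2], v[3])]

def as_4h(v):
    """View two 32-bit lanes (.2S) as four 16-bit lanes (.4H)."""
    return [h for pair in v for h in pair]

def first_two_rows(r0, r1, r2, r3, fixed=True):
    """Model the instructions in the hunk that produce r0 and r1.

    fixed=True uses the operand order after the patch (r2, r3);
    fixed=False uses the order before the patch (r3, r2).
    """
    r4 = trn1(r0, r1)
    r5 = trn2(r0, r1)
    if fixed:
        r7 = trn1(r2, r3)
        r6 = trn2(r2, r3)
    else:
        r7 = trn1(r3, r2)
        r6 = trn2(r3, r2)
    out0 = as_4h(trn1(as_2s(r4), as_2s(r7)))  # trn1 r0.2S, r4.2S, r7.2S
    out1 = as_4h(trn1(as_2s(r5), as_2s(r6)))  # trn1 r1.2S, r5.2S, r6.2S
    return out0, out1

# Example 4x4 input: row k holds the values 10*k + column index.
rows = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]
new_r0, new_r1 = first_two_rows(*rows, fixed=True)
old_r0, old_r1 = first_two_rows(*rows, fixed=False)
```

With the patched order, new_r0 and new_r1 hold columns 0 and 1 of the input. With the old order, the last two lanes of each result come out swapped, which matches the r2/r3 ordering problem described in the commit message.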