[FFmpeg-devel] [PATCH] lavfi/hqdn3d/x86: use full register for lowpass.

Wed Sep 19 17:08:10 CEST 2012

LOWPASS starts using the full register for prevsample and cursample,
but currently only returns a valid 32-bits register; later calls will
reuse the value as full register, leading to possibly incorrect
high-order bits.

Fix trac ticket #1752.

Signed-off-by: Nicolas George <nicolas.george at normalesup.org>
---
 libavfilter/x86/hqdn3d.asm |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


I am rather out of my league here. Using gdb, I observed this for the second
line of the offending sample:

"LOWPASS   t1, pixelant, spatial" becomes:

sub    %r10,%rbx              rbx = 0xfffffffffffffff9 = -7
sar    $0x4,%rbx              rbx = 0xffffffffffffffff = -1
movswl (%r9,%rbx,2),%ebx      rbx = 0xfffffff8 = (uint32_t)-8
add    %r10d,%ebx             rbx = 0xffffffff

t1 is r10, pixelant is rbx.

Then, a few lines below "LOWPASS   t0, t1, temporal" becomes:

sub    %rbx,%r11              r11 = 0xffffffff00000001
sar    $0x4,%r11              r11 = 0xfffffffff0000000
movswl (%rax,%r11,2),%r11d    SEGV
add    %ebx,%r11d

The first subtraction on rbx seems to result in an incorrect sign extension,
leading to a segfault when it is used as an index in the "temporal" array.

diff --git a/libavfilter/x86/hqdn3d.asm b/libavfilter/x86/hqdn3d.asm
index 88b9b0d..324642d 100644
--- a/libavfilter/x86/hqdn3d.asm
+++ b/libavfilter/x86/hqdn3d.asm
@@ -27,8 +27,8 @@ SECTION .text
 %if lut_bits != 8
     sar    %1q, 8-lut_bits
 %endif
-    movsx  %1d, word [%3q+%1q*2]
-    add    %1d, %2d
+    movsx  %1q, word [%3q+%1q*2]
+    add    %1q, %2q
 %endmacro
 
 %macro LOAD 3 ; dstreg, x, bitdepth
-- 
1.7.10.4