[FFmpeg-devel] [patch v4, v5] libpostproc: mmx code uses stack below %esp

Yuriy Kaminskiy yumkam
Tue Feb 2 14:47:44 CET 2010

On 30.01.2010 15:48, Michael Niedermayer wrote:
> On Fri, Jan 29, 2010 at 10:24:11PM +0300, Yuriy Kaminskiy wrote:
>> ==32414==
>> ==32414== Invalid write of size 8
>> ==32414==    at 0x874CF44: postProcess_MMX2 (in /path/to/mplayer)
>> ==32414==  Address 0xbeffa9d0 is just below the stack ptr.  To suppress, use:
>> --workaround-gcc296-bugs=yes
>> I, of course, don't use gcc-2.96 ;-)
>> I've looked into libpostproc/postprocess_template.c, and, indeed, it uses memory
>> below %esp:
>> === cut ===
>> static inline void RENAME(doVertDefFilter)(uint8_t src[], int stride, PPContext *c)
>> {
>> [...]
>>     __asm__ volatile(
>>         "pxor %%mm7, %%mm7                      \n\t"
>>         "lea -40(%%"REG_SP"), %%"REG_c"         \n\t" // make space for 4 8-byte
>> vars
>>         "and "ALIGN_MASK", %%"REG_c"            \n\t" // align
>> static inline void RENAME(dering)(uint8_t src[], int stride, PPContext *c)
>> static av_always_inline void RENAME(do_a_deblock)(uint8_t *src, int step, int
>> stride, PPContext *c){
>> === cut ===
>> Not sure if this *must* be fixed, but it feels unsafe, so...
>> Patch attached; doVertDefFilter and do_a_deblock changes should not affect
>> speed, not sure about dering one.
> please put a START/STOP TIMER around things to make sure

tried, failed to acquire conclusive results; original/v3/v4 variants sometimes
[tiny bit] slower, sometimes [even less] faster (for all functions; at least, my
assumption that doVertDef and do_a changes "safer", and dering "more
problematic" seems wrong; btw, doVertDef should be checked with MMX2 and 3DNOW
disabled, dering - with MMX2 or 3DNOW enabled [do_a_deblock - either way]).

>> +	DECLARE_ALIGNED(8, uint64_t, tmp)[4]; // make space for 4 8-byte vars
> tabs are forbidden in our svn

och; i know; sorry

>> -        "movq (%%"REG_c"), %%mm2                \n\t" // 2L0 - 5L1 + 5L2 - 2L3
>> -        "movq 8(%%"REG_c"), %%mm3               \n\t" // 2H0 - 5H1 + 5H2 - 2H3
>> +        "movq %3, %%mm2                         \n\t" // 2L0 - 5L1 + 5L2 - 2L3
>> +        "movq 8+%3, %%mm3                       \n\t" // 2H0 - 5H1 + 5H2 - 2H3
> i have a bad feeling about this, what makes you belive this will work
> better than the last time such syntax was tried
> also my gcc replaces "o" with 4+%0 happily by 4+(%edx)

works for me (tm), and it should work with gas, because it is supposed to be
used in this way; well, and for cc not to use 1234(%esp) here would be very
strange (not that we can rely on that). anyway, alternative version [v4], that
does not use 8+%3/"o" attached;
or v5, that does not attempt to save one register in dering (seems, that was not
needed even for -fPIC -fomit-frame-pointer, and -fPIC -fno-omit-frame-pointer
was broken before [plus postproc asm is not PIC anyway]; that's on 4.1.2, what
with other gcc versions - don't know, maybe v4 safer).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: postproc-invalid-stack-4.patch
Type: text/x-diff
Size: 10876 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100202/66f0a03c/attachment.patch>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: postproc-invalid-stack-4-5.diff
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100202/66f0a03c/attachment.asc>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: postproc-invalid-stack-5.patch
Type: text/x-diff
Size: 10476 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100202/66f0a03c/attachment-0001.patch>

More information about the ffmpeg-devel mailing list