[FFmpeg-devel] [PATCH] Fix 4XM decoding on big-endian and unaligned reads

Reimar Döffinger Reimar.Doeffinger
Thu Nov 11 22:14:38 CET 2010


On Thu, Nov 11, 2010 at 10:05:36PM +0100, Vitor Sessak wrote:
> On 11/11/2010 09:45 PM, Reimar Döffinger wrote:
> >On Thu, Nov 11, 2010 at 09:31:51PM +0100, Vitor Sessak wrote:
> >>Index: libavcodec/4xm.c
> >>===================================================================
> >>--- libavcodec/4xm.c	(revision 25719)
> >>+++ libavcodec/4xm.c	(working copy)
> >>@@ -260,6 +260,21 @@
> >>      }
> >>  }
> >>
> >>+#if HAVE_BIGENDIAN
> >>+#define LE_CENTRIC_MUL(dst, src, scale, dc) \
> >>+    { \
> >>+        unsigned tmpval = ((src)[1] << 16) + (src)[0];  \
> >>+        tmpval = tmpval * (scale) + (dc);               \
> >>+        (dst)[0] = tmpval & 0xFFFF;                     \
> >>+        (dst)[1] = tmpval >> 16;                        \
> >>+    }
> >>+#else
> >>+#define LE_CENTRIC_MUL(dst, src, scale, dc) \
> >>+    { \
> >>+        *((uint32_t *) (dst)) = AV_RL32(src) * (scale) + (dc); \
> >>+    }
> >>+#endif
> >
> >
> >Hmm.. Isn't this the same as
> >uint32_t tmp = AV_RN32(src);
> >#if HAVE_BIGENDIAN
> >tmp = (tmp << 16) | (tmp >> 16);
> >#endif
> >tmp = tmp * scale + dc;
> >#if HAVE_BIGENDIAN
> >tmp = (tmp << 16) | (tmp >> 16);
> >#endif
> >AV_RN32A(dst, tmp);
> >
> >Note that the two things under #if should compile
> >into a single rotate instruction.
> >Which one is faster overall depends on whether unaligned
> >accesses are fast or not...
> 
> Looks pretty much equivalent to my patch to me, and I have no particular
> preference for either. I also have no idea which is faster.

First, the last line should of course be AV_WN32A, not AV_RN32A.
Second, the write part is definitely faster.
For the read part it depends on whether fast unaligned reads are available,
but it seems easy enough to add a special case if someone actually
cares about the performance when they aren't.
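
For anyone who wants to convince themselves of the equivalence, here is a
minimal standalone sketch, not the actual 4xm.c code. The function names
(host_is_big_endian, swap_halves, mul_halfwords, mul_rotate) are made up
for illustration, memcpy stands in for AV_RN32/AV_WN32A, and a run-time
check stands in for the compile-time HAVE_BIGENDIAN. It assumes src and
dst point at pairs of native-endian 16-bit values.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical run-time stand-in for the HAVE_BIGENDIAN macro. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;
    memcpy(&first, &probe, 1);   /* first byte in memory */
    return first == 0x01;
}

/* Swap the two 16-bit halves; compilers usually emit a single rotate. */
static uint32_t swap_halves(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* Variant 1: build the value from two 16-bit reads, store with two
 * 16-bit writes (the big-endian branch of the patch). This form is
 * independent of the host byte order. */
static void mul_halfwords(uint16_t *dst, const uint16_t *src,
                          uint32_t scale, uint32_t dc)
{
    uint32_t tmp = ((uint32_t)src[1] << 16) + src[0];
    tmp = tmp * scale + dc;
    dst[0] = tmp & 0xFFFF;
    dst[1] = tmp >> 16;
}

/* Variant 2: one native 32-bit read, swap halves on big-endian,
 * multiply, swap back, one native 32-bit write. */
static void mul_rotate(uint16_t *dst, const uint16_t *src,
                       uint32_t scale, uint32_t dc)
{
    uint32_t tmp;
    memcpy(&tmp, src, sizeof(tmp));      /* plays the role of AV_RN32 */
    if (host_is_big_endian())
        tmp = swap_halves(tmp);
    tmp = tmp * scale + dc;
    if (host_is_big_endian())
        tmp = swap_halves(tmp);
    memcpy(dst, &tmp, sizeof(tmp));      /* plays the role of AV_WN32A */
}
```

Both functions should leave the same two 16-bit values in dst on either
byte order; the difference is only in how many memory accesses are issued
and whether the 32-bit read has to be aligned.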


