[FFmpeg-devel] [PATCH 0/9] DCA (DTS) decoder optimisations for ARMv6

Ben Avison bavison at riscosopen.org
Tue Jul 16 14:07:06 CEST 2013

Hi Michael,

Thanks very much for your testing. I think the problem with the files
that you had trouble building was that I didn't anticipate that anyone
would be trying to build them as Thumb code. Perhaps you could try making
the following alterations - if they work, then I'll merge them into the
full patch series and repost it:

diff --git a/libavcodec/arm/dcadsp_vfp.S b/libavcodec/arm/dcadsp_vfp.S
index 2a86209..671c8a2 100644
--- a/libavcodec/arm/dcadsp_vfp.S
+++ b/libavcodec/arm/dcadsp_vfp.S
@@ -470,7 +470,9 @@ VFP     vldr    SCALE, [sp, #3*4]
          subs    COUNT, COUNT, #1
          bne     7b

-        sub     sp, fp, #(8+8)*4
+A       sub     sp, fp, #(8+8)*4
+T       subs    fp, fp, #(8+8)*4
+T       mov     sp, fp
          vpop    {s16-s23}
  VFP     pop     {a3-a4,v1-v3,v5,fp,pc}
  NOVFP   pop     {a4,v1-v5,fp,pc}
diff --git a/libavcodec/arm/mdct_vfp.S b/libavcodec/arm/mdct_vfp.S
index 8a924fc..9dbffb1 100644
--- a/libavcodec/arm/mdct_vfp.S
+++ b/libavcodec/arm/mdct_vfp.S
@@ -151,7 +151,9 @@ J3      .req    lr
  function ff_imdct_half_vfp, export=1
          ldr     ip, [CONTEXT, #5*4]         @ mdct_bits
          teq     ip, #6
-        bne     ff_imdct_half_c             @ only case currently accelerated is the one used by DCA
+A       bne     ff_imdct_half_c             @ only case currently accelerated is the one used by DCA
+T       ldrne   ip, =ff_imdct_half_c
+T       bxne    ip

   .set n, 1<<6
   .set n2, n/2

> some part of this patchset causes
> fate-acodec-dca2 and fate-dts
> to fail, they don't fail without the patchset for me

Well, this is strange. Those both pass for me. The ones I'm having
trouble with are


but it sounds like they're OK for you! They fail for me even without my
patches applied.

> (test done with qemu and without the patches that failed to build)

The way I see it, the two most likely causes of your problems are:
* softfp/hardfp ABI differences (I've tested using hardfp, do you know
   what you were using?)
* perhaps qemu doesn't emulate short vectors?
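
One way to answer the ABI question is to ask the compiler itself. The following is only a sketch and not part of the patch series: it assumes a GCC- or Clang-style toolchain that predefines `__arm__`, `__ARM_PCS_VFP` (hard-float calling convention), `__SOFTFP__` (no FP instructions) and `__thumb__`, and the function names `float_abi_name` and `insn_set_name` are made up for illustration. It has to be compiled with the same flags as the FFmpeg build for the answer to mean anything.

```c
/* Hypothetical helper: names the float ABI this file was compiled for.
 * Relies on GCC/Clang predefined macros, which is an assumption about
 * the toolchain rather than something taken from the patch series. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not ARM";
#elif defined(__ARM_PCS_VFP)
    return "hardfp";   /* FP arguments passed in VFP registers */
#elif defined(__SOFTFP__)
    return "soft";     /* no FP instructions emitted at all */
#else
    return "softfp";   /* VFP instructions, FP arguments in core registers */
#endif
}

/* Hypothetical helper: reports whether the compiler is emitting ARM or
 * Thumb code, which is what decides between the A and T lines in the
 * diff above. */
const char *insn_set_name(void)
{
#if !defined(__arm__)
    return "not ARM";
#elif defined(__thumb__)
    return "Thumb";
#else
    return "ARM";
#endif
}
```

Calling both functions from a small `main` and printing the two strings should be enough to tell a hardfp ARM-mode build from a softfp or Thumb one.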

I have put together a quick test program below to see if short vectors
are working. I would appreciate it if you could give it a quick try for
me. It prints "Success" on my physical ARM11.


#include <stdio.h>
#include <stdlib.h>

extern void test(const float *, const float *, float *);

int main(void)
{
     /* Pass arguments and results via pointers to sidestep
      * softfp/hardfp ABI issues */
     float in0 = 0;
     float in1 = 1;
     float result;
     test(&in0, &in1, &result);
     if (result == 1)
         printf("Success\n");
     else
         printf("Failure: returned %g\n", result);
     return EXIT_SUCCESS;
}

.global test
.func test
test:
         vldr    s0, [a1]
         vldr    s1, [a2]
         vmrs    a1, FPSCR
         ldr     a2, =0x03010000 @ short vectors of length 2, stride 1
         vmsr    FPSCR, a2
         vmov    s9, s0 @ should set s9 and s10 to 0
         vmov    s8, s1 @ should set s8 and s9 to 1
         vstr    s9, [a3]
         vmsr    FPSCR, a1
         bx      lr
.endfunc
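
For anyone who wants to see why the value stored from s9 tells the two cases apart, here is a portable C model of the two VMOVs. It is a sketch under stated assumptions rather than a description of any particular emulator: it uses the FPSCR layout from the ARM architecture manual (LEN in bits 18:16, STRIDE in bits 21:20) and the usual short-vector rules (s0-s7 is the scalar bank, s8-s31 form three banks of eight registers, and vector accesses wrap within a bank). The names `fpscr_len`, `fpscr_stride`, `vmov_model` and `run_test_model` are hypothetical.

```c
#include <stdint.h>

/* LEN holds the vector length minus 1. */
unsigned fpscr_len(uint32_t fpscr)
{
    return ((fpscr >> 16) & 7u) + 1u;
}

/* STRIDE is 0 for stride 1 and 3 for stride 2; the other two encodings
 * are reserved and are simply treated as stride 1 in this model. */
unsigned fpscr_stride(uint32_t fpscr)
{
    return ((fpscr >> 20) & 3u) == 3u ? 2u : 1u;
}

/* Model of a single-precision "vmov Sd, Sm" under short-vector rules. */
void vmov_model(float s[32], unsigned d, unsigned m, uint32_t fpscr)
{
    unsigned len = fpscr_len(fpscr);
    unsigned stride = fpscr_stride(fpscr);
    unsigned i;

    if (d < 8)
        len = 1;                    /* destination in bank 0: scalar op */
    for (i = 0; i < len; i++) {
        /* step through the destination bank, wrapping after 8 registers */
        unsigned di = (d & ~7u) | ((d + i * stride) & 7u);
        /* a source in bank 0 is a scalar copied to every element */
        unsigned mi = m < 8 ? m : (m & ~7u) | ((m + i * stride) & 7u);
        s[di] = s[mi];
    }
}

/* Replays the two VMOVs from the test routine and returns the final s9. */
float run_test_model(float in0, float in1, uint32_t fpscr)
{
    float s[32] = { 0 };

    s[0] = in0;
    s[1] = in1;
    vmov_model(s, 9, 0, fpscr);     /* vmov s9, s0 */
    vmov_model(s, 8, 1, fpscr);     /* vmov s8, s1 */
    return s[9];
}
```

With the FPSCR value 0x03010000 the model decodes a length of 2 and a stride of 1, and `run_test_model(0, 1, 0x03010000)` returns 1 because the second VMOV overwrites s9. With the LEN field cleared, which is how an implementation that ignores short vectors would behave, the same call returns 0.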
