[FFmpeg-devel] [PATCH 0/9] DCA (DTS) decoder optimisations for ARMv6

Tue Jul 16 14:07:06 CEST 2013

Hi Michael,

Thanks very much for your testing. I think the files you had trouble
building failed because I didn't anticipate that anyone would try to
build them as Thumb code. Could you try the following alterations? If
they work, I'll merge them into the full patch series and repost it:

diff --git a/libavcodec/arm/dcadsp_vfp.S b/libavcodec/arm/dcadsp_vfp.S
index 2a86209..671c8a2 100644
--- a/libavcodec/arm/dcadsp_vfp.S
+++ b/libavcodec/arm/dcadsp_vfp.S
@@ -470,7 +470,9 @@ VFP     vldr    SCALE, [sp, #3*4]
          subs    COUNT, COUNT, #1
          bne     7b

-        sub     sp, fp, #(8+8)*4
+A       sub     sp, fp, #(8+8)*4
+T       subs    fp, fp, #(8+8)*4
+T       mov     sp, fp
          vpop    {s16-s23}
  VFP     pop     {a3-a4,v1-v3,v5,fp,pc}
  NOVFP   pop     {a4,v1-v5,fp,pc}
diff --git a/libavcodec/arm/mdct_vfp.S b/libavcodec/arm/mdct_vfp.S
index 8a924fc..9dbffb1 100644
--- a/libavcodec/arm/mdct_vfp.S
+++ b/libavcodec/arm/mdct_vfp.S
@@ -151,7 +151,9 @@ J3      .req    lr
  function ff_imdct_half_vfp, export=1
          ldr     ip, [CONTEXT, #5*4]         @ mdct_bits
          teq     ip, #6
-        bne     ff_imdct_half_c             @ only case currently accelerated is the one used by DCA
+A       bne     ff_imdct_half_c             @ only case currently accelerated is the one used by DCA
+T       ldrne   ip, =ff_imdct_half_c
+T       bxne    ip

   .set n, 1<<6
   .set n2, n/2


> some part of this patchset causes
> fate-acodec-dca2 and fate-dts
> to fail, they don't fail without the patchset for me

Well, this is strange. Those both pass for me. The ones I'm having
trouble with are

fate-aac-al07_96
fate-aac-al15_44
fate-aac-aref-encode
fate-aac-ln-encode

but it sounds like they're OK for you! They fail for me even without my
patches applied.

> (test done with qemu and without the patches that failed to build)

The way I see it, the two most likely causes of your problems are:
* softfp/hardfp ABI differences (I've tested using hardfp, do you know
   what you were using?)
* perhaps qemu doesn't emulate short vectors?

I have put together a quick test program below to see if short vectors
are working. I would appreciate it if you could give it a quick try for
me. It prints "Success" on my physical ARM11.

Thanks,
Ben


::::::::::::::
testshvec.c
::::::::::::::
#include <stdio.h>
#include <stdlib.h>

extern void test(const float *, const float *, float *);

int main(void)
{
     /* Pass arguments and results via pointers to sidestep
      * softfp/hardfp ABI issues */
     float in0 = 0;
     float in1 = 1;
     float result;
     test(&in0, &in1, &result);
     if (result == 1)
         printf("Success\n");
     else
         printf("Failure: returned %g\n", result);
     return EXIT_SUCCESS;
}
::::::::::::::
testshvec_asm.S
::::::::::::::
.global test
.func test
test:
         vldr    s0, [a1]
         vldr    s1, [a2]
         vmrs    a1, FPSCR
         ldr     a2, =0x03010000 @ DN+FZ set; short vectors of length 2, stride 1
         vmsr    FPSCR, a2
         vmov    s9, s0 @ should set s9 and s10 to 0
         vmov    s8, s1 @ should set s8 and s9 to 1
         vstr    s9, [a3]
         vmsr    FPSCR, a1
         bx      lr
.endfunc