[FFmpeg-devel] [PATCH 2/5] truehd: break out part of rematrix_channels into platform-specific callback.

Thu Mar 20 23:03:36 CET 2014

On Thu, 20 Mar 2014 19:10:13 -0000, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Mar 20, 2014 at 05:59:54PM -0000, Ben Avison wrote:
>> That would only work for a coefficient where the 14 lsbs were zero, so
>> only applies to 0x4000, 0x8000 and 0xC000 (assuming 0 is already special
>
> if samples are 16 or 24 bit then there's 8 or 16bit left for the matrix
> not 2

Ah, I see where you're coming from now. When output_data() comes to look
at the sample array that rematrix_channels() outputs, it ignores the top
8 bits of each 32-bit word, so we could allow rematrix_channels() to
corrupt them. (For the record, TrueHD always uses 24-bit output, and even
if you have a 16-bit non-TrueHD MLP stream, the data are assumed to be
positioned in bits 8..23.)

So, yes, you could simplify the multiplies to 32x32->32 instructions if
the (14-8)=6 lsbs of all the coefficients in one row are zero. Looking at
my example matrix to see how common that is:

/ F880, 05C0, 0000, FE40, C000, 0000 \  yes
| 08E0, F8E0, 00C0, FF80, 1040, C000 |  no
| D900, C600, C000, FD00, DB00, CF00 |  yes
| 0000, C000, D2B0, 0000, 0000, C000 |  no
\ C000, 0CD4, DBC4, 0000, C000, 0CD4 /  no
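
For anyone who wants to repeat the check, here is a rough C sketch of the
per-row test applied to the matrix above. It assumes the coefficients are
held as the raw 16-bit patterns shown; row_has_zero_lsbs() and
example_matrix are names made up for illustration and are not part of the
patch or of libavcodec.

```c
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): returns 1 if the 6 least
 * significant bits of every coefficient in the row are zero, else 0. */
static int row_has_zero_lsbs(const uint16_t *coeffs, int n)
{
    for (int i = 0; i < n; i++)
        if (coeffs[i] & 0x3F)   /* 0x3F selects the (14-8)=6 lsbs */
            return 0;
    return 1;
}

/* The example matrix from this mail, one row per output channel. */
static const uint16_t example_matrix[5][6] = {
    { 0xF880, 0x05C0, 0x0000, 0xFE40, 0xC000, 0x0000 },
    { 0x08E0, 0xF8E0, 0x00C0, 0xFF80, 0x1040, 0xC000 },
    { 0xD900, 0xC600, 0xC000, 0xFD00, 0xDB00, 0xCF00 },
    { 0x0000, 0xC000, 0xD2B0, 0x0000, 0x0000, 0xC000 },
    { 0xC000, 0x0CD4, 0xDBC4, 0x0000, 0xC000, 0x0CD4 },
};
```

Calling row_has_zero_lsbs(example_matrix[r], 6) for r = 0..4 gives the
same yes/no/yes/no/no pattern as marked beside the rows.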

Hmm, might be frequent enough to be worthwhile after all. To get a better
survey, I programmatically counted the fraction of the first 1000000
matrix rows that pass the test, for a collection of streams. It varies a
lot from stream to stream: I measured 0%, 3%, 56%, 58% and 75% on 5
different streams.
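
To illustrate why a qualifying row can get away with 32x32->32 multiplies,
here is a minimal C model of the row dot product only. It is a sketch under
assumptions, not the code in the patch: it assumes coefficients are signed
16-bit values with 14 fractional bits, that only the low 24 bits of each
output word matter, and it leaves out everything else the real
rematrix_channels() does. Both function names are invented for this
example.

```c
#include <stdint.h>

/* Reference model: wide accumulate, shift down by the 14 fractional
 * bits, keep the 24 bits that the output stage looks at. */
static uint32_t rematrix_row_wide(const int32_t *samples,
                                  const int16_t *coeffs, int n)
{
    uint64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (uint64_t)((int64_t)samples[i] * coeffs[i]);
    return (uint32_t)(acc >> 14) & 0xFFFFFF;
}

/* Narrow model: only valid when the 6 lsbs of every coefficient in the
 * row are zero. Each coefficient is pre-divided by 64 (exact in that
 * case), so the final shift is 14-6=8 and the wanted bits 8..31 all
 * live in the low 32 bits of the sum. */
static uint32_t rematrix_row_narrow(const int32_t *samples,
                                    const int16_t *coeffs, int n)
{
    uint32_t acc = 0;
    for (int i = 0; i < n; i++) {
        int32_t c = coeffs[i] / 64;
        /* Unsigned arithmetic wraps modulo 2^32, which is the same as
         * keeping the low word of the full product. */
        acc += (uint32_t)samples[i] * (uint32_t)c;
    }
    return (acc >> 8) & 0xFFFFFF;
}
```

For a row that passes the test, the two functions return the same 24-bit
value; for a row that fails it, the division drops bits and the results
can differ.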

I doubt I'll be able to have a go at an ARM implementation of this within
the next few days though, so it might make sense for it to be the subject
of a later patch series (if it turns out to be a measurable improvement).

Ben