[FFmpeg-devel] [PATCH] E-AC-3 spectral extension
Tue Jun 2 05:37:08 CEST 2009
Michael Niedermayer wrote:
> On Sat, May 30, 2009 at 11:43:12PM -0400, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>> On Sun, May 17, 2009 at 02:23:34PM -0400, Justin Ruggles wrote:
>>>> I was recently made aware that some French TV station(s) will soon (if
>>>> not already) start using E-AC-3 streams in their broadcasts which
>>>> utilize spectral extension. I was also given some samples (thanks j-b
>>>> and Anthony), which I uploaded to mphq:
>>>> So I decided to revisit my SPX patch. The previous version was done
>>>> with all integer arithmetic, but it turns out that it's really not
>>>> accurate enough for spectral extension processing. The resulting decoded
>>>> output had a max bandwidth of about 2kHz less when using 24-bit fixed
>>>> point vs. floating point, and was only slightly higher than without any
>>>> SPX processing at all. Making just the square roots floating point
>>>> raised the bandwidth about 1kHz, and making the rest (noise/signal
>>>> scaling, spx coords, and notch filters) floating point added about
>>>> another 1kHz.
>>>> I was able to compare the output to Nero's E-AC-3 decoder (thanks
>>>> madshi), and the results are very close considering that AC-3 uses
>>>> random noise for zero-bit mantissas:
>>>> stddev: 131.16 PSNR: 53.96
>>> I wouldn't call 131.16 close
>> Well, considering I don't know how the Nero decoder differs, it's not
>> bad. I don't know how the Nero decoder ends up with higher bandwidth
>> than it should. It very likely uses a different random noise generator,
>> and it could do dithering in the float-to-int16 conversion.
> Dither in float2int might account for ~1.0 stddev maybe, but we are 2
> orders of magnitude above that.
> About the PRNG: just decode an AC-3 stream with 2 different PRNGs and
> compare by how much they differ.
LFG vs. MLFG
stddev: 14.99 PSNR: 72.80 bytes: 3999744/ 3999744
Nero vs. FFAC3
stddev: 19.75 PSNR: 70.41 bytes: 3998720/ 3999744
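
The stddev and PSNR figures above are consistent with the usual definitions
for 16-bit PCM: stddev is the root-mean-square of the sample-wise difference,
and PSNR = 20*log10(65535/stddev), which gives about 72.8 for 14.99 and about
70.4 for 19.75. A minimal Python sketch under that assumption follows. The
function name and the peak value of 65535 are illustrative and not taken from
the thread.

```python
import math


def stddev_psnr(a, b, peak=65535):
    """Return (stddev, psnr_db) for two sequences of integer PCM samples.

    Only the overlapping part is compared, so inputs of different length
    (as in the Nero vs. FFAC3 case) are truncated to the shorter one.
    The peak of 65535 is an assumption for 16-bit audio.
    """
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("need at least one sample in each input")
    # Sum of squared sample-wise differences over the overlap.
    sse = sum((x - y) ** 2 for x, y in zip(a, b))
    dev = math.sqrt(sse / n)
    # Identical inputs have no error, so the PSNR is unbounded.
    psnr = float("inf") if dev == 0 else 20 * math.log10(peak / dev)
    return dev, psnr
```

Calling stddev_psnr(nero_samples, ffac3_samples) on the decoded sample lists
of two files would give numbers comparable to the ones quoted here, provided
both decodes are time-aligned.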
> Also, you can take Nero's output and ours and create a wav file with the
> sample-wise differences.
> looking at that / listening to it might provide a hint about what is that
The difference signal is very strange. With the raw difference, all I
can hear is a little bit of soft high-pitched chirping. But when it is
normalized to 0dB (increased by 31dB), 2 sections are very clear: the
word "focus" and the phrase "all these exotic". Everything else is
high-pitched noise and distortion, which is similar to the difference
signal between the 2 RNGs.
So there is something odd going on. I may need to isolate those frames
and analyse them to see if there is anything obviously different about them.
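
For reference, a 31dB gain to reach 0dB means the raw difference peaks at
10^(-31/20), or roughly 2.8% of full scale. A small Python sketch of that
normalization arithmetic follows. The function names and the full-scale value
of 32768 for 16-bit audio are assumptions for illustration.

```python
import math


def gain_to_full_scale_db(samples, full_scale=32768):
    """Gain in dB that would bring the largest sample up to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("a silent signal has no finite normalization gain")
    return 20 * math.log10(full_scale / peak)


def apply_gain_db(samples, gain_db):
    """Scale samples by gain_db and clip the result to the int16 range."""
    factor = 10 ** (gain_db / 20)
    return [max(-32768, min(32767, round(s * factor))) for s in samples]
```

With these helpers, a difference signal whose largest sample is around 923
would need about 31dB of gain.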
In case anyone else wants to do these types of difference tests, the
easiest way I found to do it was with sox.
sox -m input0.wav -v -1 input1.wav difference.wav
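
The same sample-wise subtraction can be sketched in Python with only the
standard library, assuming both inputs are 16-bit PCM WAV files with the same
channel count and sample rate. The function names here are hypothetical, and
this is not a claim about how sox implements mixing internally.

```python
import array
import sys
import wave


def sample_difference(a, b):
    """Return a[i] - b[i] for the overlapping samples, clipped to int16."""
    return [max(-32768, min(32767, x - y)) for x, y in zip(a, b)]


def read_pcm16(path):
    """Read a 16-bit PCM WAV file and return (params, samples)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        params = w.getparams()
        samples = array.array("h")
        samples.frombytes(w.readframes(w.getnframes()))
    # WAV data is little-endian; swap on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()
    return params, samples


def write_difference(path0, path1, out_path):
    """Write the difference path0 - path1 as a 16-bit PCM WAV file."""
    params, a = read_pcm16(path0)
    _, b = read_pcm16(path1)
    diff = array.array("h", sample_difference(a, b))
    if sys.byteorder == "big":
        diff.byteswap()
    with wave.open(out_path, "wb") as w:
        w.setnchannels(params.nchannels)
        w.setsampwidth(2)
        w.setframerate(params.framerate)
        w.writeframes(diff.tobytes())
```

For example, write_difference("input0.wav", "input1.wav", "difference.wav")
mirrors the file names used in the sox command above.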