[Libav-user] Audio quality loss while encoding

Brad O'Hearne brado at bighillsoftware.com
Thu Apr 25 03:18:25 CEST 2013

On Apr 24, 2013, at 5:32 PM, Paul B Mahol <onemda at gmail.com> wrote:

> On 4/25/13, Brad O'Hearne <brado at bighillsoftware.com> wrote:
>> Here is the information on the sample buffer received from QTKit which is
>> being used to fill the source data array:
>> 2013-04-24 17:06:58.653 QTFFmpeg[2732:d407] Linear PCM, 32 bit little-endian
>> floating point, 2 channels, 44100 Hz
>> 2013-04-24 17:06:58.654 QTFFmpeg[2732:d407] Bytes per frame:  4
>> 2013-04-24 17:06:58.654 QTFFmpeg[2732:d407] Frames per packet:  1
>> 2013-04-24 17:06:58.655 QTFFmpeg[2732:d407] Bits per channel: 32
>> 2013-04-24 17:06:58.655 QTFFmpeg[2732:d407] Is packed? YES
>> 2013-04-24 17:06:58.655 QTFFmpeg[2732:d407] Is high aligned? NO
>> 2013-04-24 17:06:58.656 QTFFmpeg[2732:d407] Channels per frame: 2
>> So to answer your question, provided that I've understood it correctly,
>> samples are packed, therefore should be aligned on the full 32-bit boundary,
> Its byte and not bit.

Actually, I did mean bits. Each frame is 4 bytes (4 * 8 bits = 32 bits), there are 32 bits per channel, and the samples are packed, so there is no offset alignment within that 32-bit (4-byte) slot. I presumed that was what you meant by a 0 or 1 byte boundary. If not, let me know.
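
To illustrate the arithmetic, here is a minimal sketch, not code from the project. The function name bytes_per_frame is made up for this example, and it assumes that for a non-interleaved format the reported bytes-per-frame value describes a single channel's buffer. Under that assumption, 4 bytes per frame with 2 channels at 32 bits is what a planar layout would report, while an interleaved layout would report 8.

```c
#include <assert.h>

/* Hypothetical helper: bytes occupied by one frame in a single buffer.
 * Interleaved: every channel's sample sits in the same buffer.
 * Non-interleaved (planar): each buffer holds one channel only. */
static unsigned bytes_per_frame(unsigned bits_per_channel,
                                unsigned channels,
                                int interleaved)
{
    /* Packed samples are assumed to occupy a whole number of bytes. */
    assert(bits_per_channel % 8 == 0);
    unsigned bytes_per_sample = bits_per_channel / 8;
    return interleaved ? bytes_per_sample * channels : bytes_per_sample;
}
```

Calling bytes_per_frame(32, 2, 0) models the planar case from the log above, and bytes_per_frame(32, 2, 1) models what an interleaved stereo buffer would need.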

> Another thing, you overwrite sourceData[x] pointers, causing memory leak.

Good catch...I'm aware of a few missing memory frees, as I've profiled it using Instruments. As I've been repeatedly trying various things and adding code here and there to try to fix this issue, I resolved not to worry about that stuff until the thing actually worked. But yes, you are correct. 

There is one thing I probably should clarify. In past posts, I stated that my source sample format was AV_SAMPLE_FMT_FLTP. That might have caused confusion with my mention above of the captured sample buffer having the following format:

>> 2013-04-24 17:06:58.653 QTFFmpeg[2732:d407] Linear PCM, 32 bit little-endian
>> floating point, 2 channels, 44100 Hz

The "Linear PCM" part might have led you to believe I didn't have a planar format. In QTKit, the QTSampleBuffer object has a reference to the data buffer which indeed follows this format. However, it also maintains a reference to an AudioBufferList, which is defined as:

struct AudioBufferList {
   UInt32      mNumberBuffers;
   AudioBuffer mBuffers[1];
};
typedef struct AudioBufferList AudioBufferList;

This struct allows you to access each buffer contained within the sample, where each entry in mBuffers holds a separate channel's data buffer, i.e. each plane of data. So if you've looked at my source code, you may notice this block of code:

                // assign source data
                AudioBufferList *tempAudioBufferList = [sampleBuffer audioBufferListWithOptions:0];
                for (uint i = 0; i < tempAudioBufferList->mNumberBuffers; i++)
                    sourceData[i] = tempAudioBufferList->mBuffers[i].mData;

That is essentially assigning sourceData with each channel's data plane. I know that this works, because when resampling from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16 with the ADPCM_SWF codec, I get perfect audio. (Interestingly, on this first go-round I never could get the pointer to the linear buffer to work; changing to this planar format and using the channel data references from the AudioBufferList is what fixed my previous resampling problem.)
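
For anyone following along, here is a rough sketch of what the layout change from planar float (as in AV_SAMPLE_FMT_FLTP) to interleaved signed 16-bit (as in AV_SAMPLE_FMT_S16) amounts to. It is plain C written for illustration only. It is not libswresample's implementation and not code from the project, and the function names float_to_s16 and planar_float_to_interleaved_s16 are hypothetical. The scale-by-2^15, round and clip step is an assumption about a typical conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one float sample (nominally in [-1.0, 1.0]) to signed 16-bit:
 * scale by 2^15, round to nearest, and clip to the int16 range. */
static int16_t float_to_s16(float x)
{
    float scaled = x * 32768.0f;
    if (scaled != scaled) return 0;               /* NaN: emit silence */
    if (scaled >= 32767.0f) return INT16_MAX;     /* clip positive peak */
    if (scaled <= -32768.0f) return INT16_MIN;    /* clip negative peak */
    return (int16_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* planes[c][i] is sample i of channel c (one pointer per plane, like the
 * sourceData array above). out receives frames * channels samples in
 * interleaved order, e.g. L R L R ... for stereo. */
static void planar_float_to_interleaved_s16(const float *const *planes,
                                            int channels, size_t frames,
                                            int16_t *out)
{
    assert(planes != NULL && out != NULL && channels > 0);
    for (size_t i = 0; i < frames; i++)
        for (int c = 0; c < channels; c++)
            out[i * (size_t)channels + (size_t)c] = float_to_s16(planes[c][i]);
}
```

The point of the sketch is only that the converter needs one pointer per channel plane on the input side, which is why handing it the per-channel mData pointers works where a single linear buffer pointer would be read as the wrong layout.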

So I just wanted to mention that, so that when you saw the "Linear PCM" mentioned, it didn't seem as if the wrong source sample format was being used. It isn't ... 

