[Libav-user] QTKit -> Libav: has it ever been done?

Wed Mar 27 22:03:39 CET 2013

On Mar 27, 2013, at 1:08 PM, Paul B Mahol <onemda at gmail.com> wrote:
> Then use AV_SAMPLE_FMT_FLTP, you do not need to manually interleave samples.
> Each channel's samples are put into a separate frame->data[X], where X is the
> channel number starting from 0.

SHAZZZAMMM!!!!

Paul, you are brilliant. Rather than rewrite the linear sample data as interleaved, I took your advice: I switched to the planar AV_SAMPLE_FMT_S16P format, pulled each channel's data out of the AudioBufferList, and set those buffer pointers in my uint8_t **sourceData structure, as in:

            sourceData[0] = audioBufferList->mBuffers[0].mData;
            sourceData[1] = audioBufferList->mBuffers[1].mData;
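
A minimal sketch of that pointer hand-off, written with stand-in types so it builds without CoreAudio or Libav. SketchBuffer, SketchBufferList and collect_planes are hypothetical names used only for illustration; the struct fields mirror the AudioBuffer fields referenced above. It assumes the capture delivers one channel per buffer (non-interleaved), which is the layout a planar sample format expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BUFFERS 2

/* Hypothetical stand-ins for CoreAudio's AudioBuffer / AudioBufferList,
   reduced to the fields used here. */
typedef struct {
    uint32_t mNumberChannels;  /* channels stored in this buffer */
    uint32_t mDataByteSize;    /* size of mData in bytes */
    void    *mData;            /* sample data */
} SketchBuffer;

typedef struct {
    uint32_t     mNumberBuffers;
    SketchBuffer mBuffers[SKETCH_MAX_BUFFERS];
} SketchBufferList;

/* Store one data pointer per channel in planes[]; no samples are copied.
   Returns the number of planes set, or -1 if the list does not fit or is
   not laid out as one channel per buffer. */
static int collect_planes(const SketchBufferList *list,
                          uint8_t **planes, int max_planes)
{
    assert(list != NULL && planes != NULL);

    if (list->mNumberBuffers > SKETCH_MAX_BUFFERS ||
        (int)list->mNumberBuffers > max_planes)
        return -1;

    for (uint32_t i = 0; i < list->mNumberBuffers; i++) {
        if (list->mBuffers[i].mNumberChannels != 1)
            return -1;  /* interleaved data: a planar format would be wrong */
        planes[i] = (uint8_t *)list->mBuffers[i].mData;
    }
    return (int)list->mNumberBuffers;
}
```

The check on mNumberChannels is the point of the sketch: if a buffer carries more than one channel, the data is already interleaved and the pointers should not be passed off as separate planes.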

The audio is perfect (sounding, that is). For getting past that wall, I thank *everyone* on the list who replied... dialog is progressive, and every comment can raise new avenues to look into. I am now scratching my head a bit as to what VLC was up to by manually interleaving -- it seems like unnecessary work now -- but I suppose they have a common interleaved format that QTKit captures need to be converted to for universal downstream libVLC processing.

Unfortunately, I cannot pop the champagne corks and set off the fireworks quite yet. While the audio sounds great, the video timing is not aligned with the audio, and the video now freezes a short way in. While I enjoy the occasional Kung Fu Theater movie (probably dating myself a little there; that's a reference to the English-dubbed karate movies I watched on Saturday afternoons as a kid), I'm guessing my customers won't find the humor in it.

I am posting a link to the encoded FLV file: 

https://www.dropbox.com/s/wsol1pd9vv3adrz/Output.flv

If any of you experts could lend guidance on how to iron out these timing issues, I would be much obliged. My initial hunch is that the problem lies in how I set the AVPacket dts and pts values and/or in my use of av_interleaved_write_frame vs av_write_frame. I posted another message about this a while back which was never replied to, and I'm still not completely clear as to when to use either. I've also had interesting results using each with only video, only audio, or both video and audio, so unless someone just happens to know off the top of their head what the problem is, I'll create videos for each scenario and pursue that discussion in a different thread, as this one has gotten lengthy and the topic is shifting somewhat.
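
To make the pts/dts hunch concrete, here is a rough sketch of the arithmetic involved, not a diagnosis of the file above. It assumes audio timestamps are counted in samples (time base 1/sample_rate), video timestamps in frames (time base 1/fps), and that both must be converted to the output stream's time base before writing; the 1/1000 target used in the example is an assumption about the FLV muxer. SketchRational and sketch_rescale are hypothetical stand-ins for AVRational and av_rescale_q so the snippet builds without Libav.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct { int num, den; } SketchRational;

/* Convert timestamp ts from time base `from` to time base `to`, rounding
   to nearest. Stand-in for av_rescale_q; handles non-negative values only
   and has no overflow protection. */
static int64_t sketch_rescale(int64_t ts, SketchRational from, SketchRational to)
{
    assert(ts >= 0);
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);

    int64_t num = ts * from.num * to.den;      /* ts * from, scaled by 1/to */
    int64_t den = (int64_t)from.den * to.num;
    return (num + den / 2) / den;
}

/* pts of an audio packet that starts after `samples_so_far` samples. */
static int64_t audio_pts(int64_t samples_so_far, int sample_rate,
                         SketchRational stream_tb)
{
    SketchRational tb = { 1, sample_rate };
    return sketch_rescale(samples_so_far, tb, stream_tb);
}

/* pts of video frame number `frame_index` at a constant frame rate. */
static int64_t video_pts(int64_t frame_index, int fps, SketchRational stream_tb)
{
    SketchRational tb = { 1, fps };
    return sketch_rescale(frame_index, tb, stream_tb);
}
```

If both streams derive their timestamps from a shared origin like this, one second of audio and one second of video land on the same pts. On the write call itself: av_interleaved_write_frame buffers packets and orders them by dts across streams before writing, while av_write_frame writes each packet as given and leaves correct interleaving to the caller.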

But to conclude this thread -- a tremendous thank you to everyone who has contributed to the discussion. 

Cheers, 

Brad