[Libav-user] Resample frame to specified number of samples

Thu Jul 20 23:17:25 EEST 2017

2017-07-20 22:48 GMT+03:00 Kerry Loux <louxkr at gmail.com>:

>
> On Thu, Jul 20, 2017 at 1:19 PM, Anton Shekhovtsov <shekh.anton at gmail.com>
> wrote:
>
>>
>>
>> 2017-07-19 20:54 GMT+03:00 Kerry Loux <louxkr at gmail.com>:
>>
>>> Hello all,
>>>
>>> I have an application where I am opening an audio file that was sampled
>>> at 44100 Hz, decoding it, resampling to 16000 Hz, encoding it again (AAC)
>>> then broadcasting it on an RTSP stream.  On the receiving end, I decode the
>>> incoming AAC packets and render them.
>>>
>>> The rendered audio is very slow.
>>>
>>> It appears to me that the problem is related to the AVFrame.nb_samples
>>> field.  When I read a packet from file (using av_read_frame()), the packet
>>> size is 1024 samples (at 44100 Hz).  After I resample to 16000 Hz, I have
>>> ~1/3 the samples that I had in the original frame (as expected).  Then, the
>>> frame gets encoded, streamed and decoded.  After decoding, the
>>> AVFrame.nb_samples is 1024 when I expect it to be 372 or so.  The
>>> AVCodecContext passed to avcodec_receive_frame() has frame_size = 1024, so
>>> I assume that the decoder is setting the number of samples of the decoded
>>> frame to 1024 regardless of the number of samples actually contained in the
>>> input packet?  Or maybe it's my job to ensure that the input packets always
>>> contain 1024 samples?
>>>
>>> I'm not entirely sure what's going on.  My thoughts include:
>>> - Try buffering 3x number of input frames prior to resampling so the
>>> resulting frame will be ~1024 samples
>>> - Calculate the number of samples manually (how to do this is unclear)
>>> and override the number of samples assigned by the decoder (this seems
>>> wrong...)
>>>
>>> Any recommendations?  Can I just stick multiple frames together in a
>>> larger buffer prior to resampling (i.e. calling swr_convert())?
>>>
>>> Thanks,
>>>
>>> Kerry
>>>
>>> _______________________________________________
>>> Libav-user mailing list
>>> Libav-user at ffmpeg.org
>>> http://ffmpeg.org/mailman/listinfo/libav-user
>>>
>>>
>> Try studying the examples (resampling_audio, transcoding_audio; I don't
>> remember which is most relevant).
>> You are not supposed to resample individual frames in isolation. You must
>> feed the resampler continuously. AFAIK this is clearly explained in the
>> swr docs.
>> AAC wants frames of a fixed size (1024 samples).
>>
>>
>>
>>
> Yes, I am feeding it continuously.  I am doing this:
>
> AVPacket* ADTSEncoderInterface::EncodeAudio(const AVFrame& inputFrame)
> {
>     if (avcodec_send_frame(encoderContext, &inputFrame) != 0)
>         return nullptr;
>
>     AVPacket* lastOutputPacket, *nextOutputPacket(nullptr);
>     bool nextPacketIsA(true);
>     int returnCode;
>     do
>     {
>         lastOutputPacket = nextOutputPacket;
>         nextPacketIsA = !nextPacketIsA;
>         if (nextPacketIsA)
>             nextOutputPacket = &outputPacketA;
>         else
>             nextOutputPacket = &outputPacketB;
>
>         returnCode = avcodec_receive_packet(encoderContext, nextOutputPacket);
>     } while (returnCode == 0);
>
>     if (returnCode != AVERROR(EAGAIN) || !lastOutputPacket)
>         return nullptr;
>
>     return lastOutputPacket;
> }
>
> I assumed (possibly incorrectly) that if AAC requires packets containing
> 1024 samples, I would get AVERROR(EAGAIN) back from
> avcodec_receive_packet() if not enough input samples were available.
> That seems not to be the case; instead, I need to make sure myself that
> the encoder has at least 1024 samples before I call
> avcodec_receive_packet().
>
> I haven't found anything in the documentation to suggest that it is the
> caller's responsibility to do this.  Maybe this wouldn't be found in the
> FFmpeg docs, but in documentation describing the AAC format?  If that is
> the case, it would have been helpful if the call to avcodec_send_frame()
> had failed with some kind of "wrong number of input samples" error.
>
> I did find a solution, although it seems rather inefficient.  I introduced
> an additional AVFrame object, fullSizeFrame, and prior to calling the
> encoder (my EncodeAudio method pasted above), I do this:
>
> // packetSampleCount == 1024
> while (fullSizeFrame->nb_samples < packetSampleCount)
> {
>     assert(!dataQueue.empty());
>     nextFrame = dataQueue.front();
>     if (!nextFrame)
>     {
>         dataQueue.pop();// drop the null entry so the loop cannot spin on it
>         continue;
>     }
>
>     // Copy as many samples as still fit into fullSizeFrame
>     const int samplesToCopy(std::min(
>         packetSampleCount - fullSizeFrame->nb_samples, nextFrame->nb_samples));
>     memcpy(fullSizeFrame->data[0] + fullSizeFrame->nb_samples * sampleSize,
>         nextFrame->data[0], samplesToCopy * sampleSize);
>     fullSizeFrame->nb_samples += samplesToCopy;
>     pendingSamples -= samplesToCopy;
>
>     if (samplesToCopy == nextFrame->nb_samples)
>     {
>         // The queued frame was consumed completely
>         dataQueue.pop();
>         av_frame_free(&nextFrame);
>     }
>     else
>     {
>         // Keep the leftover samples at the start of the queued frame
>         memmove(nextFrame->data[0],
>             nextFrame->data[0] + samplesToCopy * sampleSize,
>             (nextFrame->nb_samples - samplesToCopy) * sampleSize);
>         nextFrame->nb_samples -= samplesToCopy;
>     }
> }
>
> Thanks for your help.
>
> -Kerry
>
>
>
I am not an FFmpeg expert by any means, but I managed to figure out these
details.
Look at encode_audio.c:
...
    frame->nb_samples     = c->frame_size;
...
This should give you some idea: frame_size is indeed 1024 for AAC.
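
To illustrate the idea, here is a minimal sketch of regrouping resampled
chunks of arbitrary length into frames of exactly frame_size samples before
they go to the encoder. It is only an illustration: FrameRegrouper and its
methods are hypothetical names, not FFmpeg API, and it assumes mono 16-bit
samples. As far as I know, libavutil's AVAudioFifo (av_audio_fifo_write() /
av_audio_fifo_read()) serves this purpose in real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical helper (not part of FFmpeg): collects mono 16-bit samples
// delivered in chunks of any length and returns them in fixed-size frames.
class FrameRegrouper
{
public:
    explicit FrameRegrouper(std::size_t samplesPerFrame)
        : frameSize(samplesPerFrame)
    {
        assert(samplesPerFrame > 0);
    }

    // Append one resampled chunk of any length.
    void Push(const std::vector<std::int16_t>& chunk)
    {
        pending.insert(pending.end(), chunk.begin(), chunk.end());
    }

    // True once at least one full frame can be taken out.
    bool HasFrame() const { return pending.size() >= frameSize; }

    // Number of samples currently waiting.
    std::size_t Pending() const { return pending.size(); }

    // Remove and return exactly frameSize samples; call only if HasFrame().
    std::vector<std::int16_t> Pop()
    {
        assert(HasFrame());
        const auto end = pending.begin() + static_cast<std::ptrdiff_t>(frameSize);
        std::vector<std::int16_t> frame(pending.begin(), end);
        pending.erase(pending.begin(), end);
        return frame;
    }

private:
    std::size_t frameSize;
    std::deque<std::int16_t> pending;
};
```

With a frame size of 1024, pushing chunks of 372, 371 and 372 samples yields
one full frame and leaves 91 samples pending for the next one.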

My comment about "feed it continuously" was about calling swr_convert.
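
As a rough illustration of why the calls should be treated as one continuous
stream, the sketch below counts how many output samples each 1024-sample
input chunk would produce when going from 44100 Hz to 16000 Hz, if the
fractional remainder is carried over from one call to the next. This is an
idealised model, not what swr_convert() does internally: the real resampler
also buffers samples for its filter, so the actual counts per call will
differ. OutputCountsPerChunk is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Idealised model only: ignores the resampler's internal filter delay.
// For each input chunk, returns the number of output samples it would yield
// when the fractional remainder is carried over to the following chunk.
std::vector<std::int64_t> OutputCountsPerChunk(
    const std::vector<std::int64_t>& chunkSizes,
    std::int64_t inRate, std::int64_t outRate)
{
    assert(inRate > 0 && outRate > 0);

    std::vector<std::int64_t> counts;
    std::int64_t totalIn = 0;   // input samples consumed so far
    std::int64_t totalOut = 0;  // output samples produced so far
    for (std::int64_t n : chunkSizes)
    {
        totalIn += n;
        // Integer division rounds down; the remainder stays "inside" the
        // model and shows up in a later chunk.
        const std::int64_t cumulativeOut = totalIn * outRate / inRate;
        counts.push_back(cumulativeOut - totalOut);
        totalOut = cumulativeOut;
    }
    return counts;
}
```

For four 1024-sample chunks this model gives 371, 372, 371 and 372 samples
(1486 in total), so no single output chunk lines up with the 1024 samples
the AAC encoder expects.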