[Libav-user] Resample frame to specified number of samples

Kerry Loux louxkr at gmail.com
Thu Jul 20 22:48:51 EEST 2017

On Thu, Jul 20, 2017 at 1:19 PM, Anton Shekhovtsov <shekh.anton at gmail.com> wrote:

> 2017-07-19 20:54 GMT+03:00 Kerry Loux <louxkr at gmail.com>:
>> Hello all,
>> I have an application where I am opening an audio file that was sampled
>> at 44100 Hz, decoding it, resampling to 16000 Hz, encoding it again (AAC)
>> then broadcasting it on an RTSP stream.  On the receiving end, I decode the
>> incoming AAC packets and render them.
>> The rendered audio is very slow.
>> It appears to me that the problem is related to the AVFrame.nb_samples
>> field.  When I read a packet from file (using av_read_frame()), the packet
>> size is 1024 samples (at 44100 Hz).  After I resample to 16000 Hz, I have
>> ~1/3 the samples that I had in the original frame (as expected).  Then, the
>> frame gets encoded, streamed and decoded.  After decoding, the
>> AVFrame.nb_samples is 1024 when I expect it to be 372 or so.  The
>> AVCodecContext passed to avcodec_receive_frame() has frame_size = 1024, so
>> I assume that the decoder is setting the number of samples of the decoded
>> frame to 1024 regardless of the number of samples actually contained in the
>> input packet?  Or maybe it's my job to ensure that the input packets always
>> contain 1024 samples?
>> I'm not entirely sure what's going on.  My thoughts include:
>> - Try buffering 3x number of input frames prior to resampling so the
>> resulting frame will be ~1024 samples
>> - Calculate the number of samples manually (how to do this is unclear)
>> and override the number of samples assigned by the decoder (this seems
>> wrong...)
>> Any recommendations?  Can I just stick multiple frames together in a
>> larger buffer prior to resampling (i.e. calling swr_convert())?
>> Thanks,
>> Kerry
>> _______________________________________________
>> Libav-user mailing list
>> Libav-user at ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/libav-user
> Try to study examples (resampling_audio, transcoding_audio, don't remember
> which is most relevant).
> You are not supposed to resample individual frames. You must feed it
> continuously. AFAIK this is clearly explained in swr docs.
> AAC wants packets of fixed size (1024).
Yes, I am feeding it continuously.  I am doing this:

AVPacket* ADTSEncoderInterface::EncodeAudio(const AVFrame& inputFrame)
{
    if (avcodec_send_frame(encoderContext, &inputFrame) != 0)
        return nullptr;

    // Drain the encoder, alternating between two packets so that the last
    // successfully received packet survives the final (failing) receive call.
    AVPacket* lastOutputPacket, *nextOutputPacket(nullptr);
    bool nextPacketIsA(true);
    int returnCode;
    do
    {
        lastOutputPacket = nextOutputPacket;
        nextPacketIsA = !nextPacketIsA;
        if (nextPacketIsA)
            nextOutputPacket = &outputPacketA;
        else
            nextOutputPacket = &outputPacketB;

        returnCode = avcodec_receive_packet(encoderContext, nextOutputPacket);
    } while (returnCode == 0);

    // Anything other than "needs more input" is an error
    if (returnCode != AVERROR(EAGAIN) || !lastOutputPacket)
        return nullptr;

    return lastOutputPacket;
}

I assumed (possibly incorrectly) that if AAC requires packets containing
1024 samples, I would get AVERROR(EAGAIN) from avcodec_receive_packet()
whenever there were not enough input samples available.  It seems that this
is not the case, however.  Instead, I need to do something myself to ensure
that the encoder has at least 1024 samples before I call
avcodec_receive_packet().
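
To make the symptom concrete, here is a small sketch of the arithmetic
behind the numbers above.  It assumes (this is a guess based on the decoded
nb_samples, not something confirmed in this thread) that every short
resampled frame ends up occupying a full 1024-sample packet.  The function
names are made up for illustration.

```cpp
#include <cassert>

// Approximate number of output samples when a block of inSamples is
// resampled from inRate to outRate (resampler delay and rounding ignored).
double ResampledCount(int inSamples, int inRate, int outRate)
{
    assert(inRate > 0 && outRate > 0);
    return static_cast<double>(inSamples) * outRate / inRate;
}

// Factor by which playback is stretched if each block of actualSamples
// is rendered by the receiver as a full frame of frameSize samples.
double StretchFactor(double actualSamples, int frameSize)
{
    assert(actualSamples > 0.0);
    return frameSize / actualSamples;
}
```

ResampledCount(1024, 44100, 16000) gives about 371.5 samples, and
StretchFactor() of that value against 1024 gives about 2.76, which would be
consistent with the rendered audio sounding very slow.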

I haven't found anything in the documentation to suggest that it is the
caller's responsibility to do this.  Maybe this wouldn't be found in FFmpeg
docs, but in documentation describing the AAC format?  If that is the
case, it would have been helpful if the call to avcodec_send_frame() failed
with some kind of "wrong number of input samples" error.
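
A check of this kind can also be done on the caller's side.  The sketch
below is a hypothetical helper, not an FFmpeg function.  It assumes the
caller passes in AVFrame.nb_samples and AVCodecContext.frame_size, and it
treats a frame size of zero as "no restriction", which is my understanding
of how frame_size is documented for codecs with variable frame size.

```cpp
#include <cassert>
#include <string>

// Hypothetical pre-flight check (not part of FFmpeg): returns an empty
// string when sampleCount is acceptable for an encoder with the given fixed
// frame size, otherwise a human-readable description of the mismatch.
std::string DescribeFrameSizeMismatch(int sampleCount, int encoderFrameSize)
{
    assert(sampleCount >= 0);

    // Assumption: a non-positive frame size means the encoder does not
    // require a fixed number of samples per frame.
    if (encoderFrameSize <= 0 || sampleCount == encoderFrameSize)
        return std::string();

    return "wrong number of input samples: got " + std::to_string(sampleCount)
        + ", encoder expects " + std::to_string(encoderFrameSize);
}
```

Calling it with (inputFrame.nb_samples, encoderContext->frame_size) before
avcodec_send_frame() would flag the 372-sample frames described above.  The
last frame of a stream may legitimately be shorter, so a real check would
probably need to exempt it.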

I did find a solution, although it seems rather inefficient.  I introduced
an additional AVFrame object, fullSizeFrame, and prior to calling the
encoder (my EncodeAudio method pasted above), I do this:

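while (fullSizeFrame->nb_samples < packetSampleCount)  // packetSampleCount == 1024
{
    nextFrame = dataQueue.front();
    if (!nextFrame)
        break;  // statement missing from the archived copy; presumably stop when nothing is queued

    // Copy as many samples as still fit into fullSizeFrame
    const int samplesToCopy(std::min(packetSampleCount -
        fullSizeFrame->nb_samples, nextFrame->nb_samples));
    memcpy(fullSizeFrame->data[0] + fullSizeFrame->nb_samples * sampleSize,
        nextFrame->data[0], samplesToCopy * sampleSize);
    fullSizeFrame->nb_samples += samplesToCopy;
    pendingSamples -= samplesToCopy;

    if (samplesToCopy == nextFrame->nb_samples)
    {
        // branch body missing from the archived copy; presumably the fully
        // consumed frame is removed from dataQueue here
    }
    else
    {
        // Shift the unconsumed samples to the front of the partially used frame
        memmove(nextFrame->data[0], nextFrame->data[0] + samplesToCopy *
            sampleSize, (nextFrame->nb_samples - samplesToCopy) * sampleSize);
        nextFrame->nb_samples -= samplesToCopy;
    }
}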
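
The same regrouping idea can be sketched without the memmove() step by
keeping leftover samples in a queue.  This is only an illustration under
stated assumptions: SampleChunker and its methods are hypothetical names,
and the samples are assumed to be mono, interleaved, 16-bit.  libavutil
also provides AVAudioFifo (av_audio_fifo_write() / av_audio_fifo_read())
for this purpose, which may be the more natural fit in FFmpeg code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical helper: accumulates samples in blocks of arbitrary size and
// hands them back in blocks of exactly frameSize samples.
class SampleChunker
{
public:
    explicit SampleChunker(std::size_t size) : frameSize(size)
    {
        assert(size > 0);
    }

    // Append a block of any length (e.g. the output of one resampler call).
    void Push(const std::vector<std::int16_t>& samples)
    {
        pending.insert(pending.end(), samples.begin(), samples.end());
    }

    // True when at least one full frame can be popped.
    bool HasFullFrame() const { return pending.size() >= frameSize; }

    // Remove and return exactly frameSize samples.
    // Call only when HasFullFrame() is true.
    std::vector<std::int16_t> Pop()
    {
        assert(HasFullFrame());
        const auto end = pending.begin() + static_cast<std::ptrdiff_t>(frameSize);
        std::vector<std::int16_t> frame(pending.begin(), end);
        pending.erase(pending.begin(), end);
        return frame;
    }

    // Number of samples still waiting for a full frame.
    std::size_t PendingSamples() const { return pending.size(); }

private:
    std::size_t frameSize;
    std::deque<std::int16_t> pending;
};
```

With a frame size of 1024, pushing three 372-sample blocks makes one full
frame available and leaves 92 samples queued for the next one.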

Thanks for your help.
