[Libav-user] Resample frame to specified number of samples
shekh.anton at gmail.com
Thu Jul 20 23:17:25 EEST 2017
2017-07-20 22:48 GMT+03:00 Kerry Loux <louxkr at gmail.com>:
> On Thu, Jul 20, 2017 at 1:19 PM, Anton Shekhovtsov <shekh.anton at gmail.com>
>> 2017-07-19 20:54 GMT+03:00 Kerry Loux <louxkr at gmail.com>:
>>> Hello all,
>>> I have an application where I am opening an audio file that was sampled
>>> at 44100 Hz, decoding it, resampling to 16000 Hz, encoding it again (AAC)
>>> then broadcasting it on an RTSP stream. On the receiving end, I decode the
>>> incoming AAC packets and render them.
>>> The rendered audio is very slow.
>>> It appears to me that the problem is related to the AVFrame.nb_samples
>>> field. When I read a packet from file (using av_read_frame()), the packet
>>> size is 1024 samples (at 44100 Hz). After I resample to 16000 Hz, I have
>>> ~1/3 the samples that I had in the original frame (as expected). Then, the
>>> frame gets encoded, streamed and decoded. After decoding, the
>>> AVFrame.nb_samples is 1024 when I expect it to be 372 or so. The
>>> AVCodecContext passed to avcodec_receive_frame() has frame_size = 1024, so
>>> I assume that the decoder is setting the number of samples of the decoded
>>> frame to 1024 regardless of the number of samples actually contained in the
>>> input packet? Or maybe it's my job to ensure that the input packets always
>>> contain 1024 samples?
>>> I'm not entirely sure what's going on. My thoughts include:
>>> - Try buffering 3x number of input frames prior to resampling so the
>>> resulting frame will be ~1024 samples
>>> - Calculate the number of samples manually (how to do this is unclear)
>>> and override the number of samples assigned by the decoder (this seems
>>> Any recommendations? Can I just stick multiple frames together in a
>>> larger buffer prior to resampling (i.e. calling swr_convert())?
>>> Libav-user mailing list
>>> Libav-user at ffmpeg.org
>> Try to study examples (resampling_audio, transcoding_audio, don't
>> remember which is most relevant).
>> You are not supposed to resample individual frames. You must feed it
>> continuously. AFAIK this is clearly explained in swr docs.
>> AAC wants packets of fixed size (1024).
> Yes, I am feeding it continuously. I am doing this:
> AVPacket* ADTSEncoderInterface::EncodeAudio(const AVFrame& inputFrame)
> {
>     if (avcodec_send_frame(encoderContext, &inputFrame) != 0)
>         return nullptr;
>
>     AVPacket* lastOutputPacket, *nextOutputPacket(nullptr);
>     bool nextPacketIsA(true);
>     int returnCode;
>     // Drain the encoder, alternating between two packets so the most
>     // recent successful one is still available when EAGAIN is returned.
>     do
>     {
>         lastOutputPacket = nextOutputPacket;
>         nextPacketIsA = !nextPacketIsA;
>         if (nextPacketIsA)
>             nextOutputPacket = &outputPacketA;
>         else
>             nextOutputPacket = &outputPacketB;
>         returnCode = avcodec_receive_packet(encoderContext, nextOutputPacket);
>     } while (returnCode == 0);
>
>     if (returnCode != AVERROR(EAGAIN) || !lastOutputPacket)
>         return nullptr;
>     return lastOutputPacket;
> }
> I assumed (possibly incorrectly) that if AAC requires packets containing
> 1024 samples, I would get AVERROR(EAGAIN) returned from
> avcodec_receive_packet() if there were not enough input samples available.
> It seems that this is not the case, however. Instead, I need to do something
> myself in order to ensure the encoder has at least 1024 samples before I
> call avcodec_receive_packet().
> I haven't found anything in the documentation to suggest that it is the
> caller's responsibility to do this. Maybe this wouldn't be found in FFmpeg
> docs, but in documentation describing the AAC format? If that were the
> case, it may have been helpful if the call to avcodec_send_frame() failed
> with some kind of "wrong number of input samples" error.
> I did find a solution, although it seems rather inefficient. I introduced
> an additional AVFrame object, fullSizeFrame, and prior to calling the
> encoder (my EncodeAudio method pasted above), I do this:
> while (fullSizeFrame->nb_samples < packetSampleCount) // packetSampleCount == 1024
> nextFrame = dataQueue.front();
> if (!nextFrame)
> const int samplesToCopy(std::min(packetSampleCount -
> fullSizeFrame->nb_samples, nextFrame->nb_samples));
> memcpy(fullSizeFrame->data + fullSizeFrame->nb_samples * sampleSize,
> nextFrame->data, samplesToCopy * sampleSize);
> fullSizeFrame->nb_samples += samplesToCopy;
> pendingSamples -= samplesToCopy;
> if (samplesToCopy == nextFrame->nb_samples)
> memmove(nextFrame->data, nextFrame->data + samplesToCopy *
> sampleSize, (nextFrame->nb_samples - samplesToCopy) * sampleSize);
> nextFrame->nb_samples -= samplesToCopy;
> Thanks for your help.
I am not an FFmpeg expert by any means, but I was able to figure these details
out. Look at encode_audio.c:
frame->nb_samples = c->frame_size;
this should give some idea. frame_size is indeed 1024 for AAC.
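
The fixed-frame requirement can be illustrated with a small re-chunking buffer. This is only a sketch and not FFmpeg code: FFmpeg provides AVAudioFifo (av_audio_fifo_write() / av_audio_fifo_read(), as used in the transcode_aac.c example) for this job, while the SampleRechunker class below is a hypothetical stand-in built on the standard library alone. It assumes mono, 16-bit samples; a real implementation would have to handle the channel count and sample format of the encoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical helper (not part of FFmpeg): collects sample chunks of any
// length and returns them in frames of exactly frameSize samples.
class SampleRechunker
{
public:
    explicit SampleRechunker(std::size_t frameSize) : frameSize_(frameSize)
    {
        assert(frameSize > 0);
    }

    // Append one chunk of resampler output; any length is accepted.
    void Push(const std::vector<int16_t>& chunk)
    {
        pending_.insert(pending_.end(), chunk.begin(), chunk.end());
    }

    // True once at least one full frame is buffered.
    bool HasFrame() const { return pending_.size() >= frameSize_; }

    // Remove and return exactly frameSize samples. Call only if HasFrame().
    std::vector<int16_t> PopFrame()
    {
        assert(HasFrame());
        const auto end = pending_.begin() + static_cast<std::ptrdiff_t>(frameSize_);
        std::vector<int16_t> frame(pending_.begin(), end);
        pending_.erase(pending_.begin(), end);
        return frame;
    }

    // Samples left over, waiting for more input.
    std::size_t PendingSamples() const { return pending_.size(); }

private:
    std::size_t frameSize_;
    std::deque<int16_t> pending_;
};
```

With a frame size of 1024 and resampled chunks of about 372 samples each, the first full frame becomes available after the third chunk, and the leftover samples stay buffered for the next frame. Only full frames would be passed to avcodec_send_frame(), with nb_samples equal to the encoder's frame_size.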
My comment about "feed it continuously" was about calling swr_convert.
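
To show why the resampler has to be treated as a continuous stream, here is a rough sketch of the sample-count arithmetic only. OutputSampleCounter is a hypothetical helper, not an FFmpeg API, and it models an ideal converter with no filter delay. The real swr_convert() also buffers samples internally, which is why the resampling_audio.c example sizes its output with av_rescale_rnd(swr_get_delay(...) + src_nb_samples, dst_rate, src_rate, AV_ROUND_UP), so actual per-call counts will differ from this model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: tracks how many output samples an ideal rate
// converter owes after each input chunk, carrying the fractional
// remainder forward instead of rounding each chunk on its own.
class OutputSampleCounter
{
public:
    OutputSampleCounter(int64_t inRate, int64_t outRate)
        : inRate_(inRate), outRate_(outRate)
    {
        assert(inRate > 0 && outRate > 0);
    }

    // Returns the number of whole output samples due for this input chunk.
    int64_t Feed(int64_t inSamples)
    {
        totalIn_ += inSamples;
        // Integer division truncates, so the fraction stays "owed".
        const int64_t totalOutDue = totalIn_ * outRate_ / inRate_;
        const int64_t produced = totalOutDue - totalOut_;
        totalOut_ = totalOutDue;
        return produced;
    }

    // Total output samples produced so far.
    int64_t TotalOut() const { return totalOut_; }

private:
    int64_t inRate_;
    int64_t outRate_;
    int64_t totalIn_ = 0;
    int64_t totalOut_ = 0;
};
```

For 44100 Hz to 16000 Hz with 1024-sample input chunks, this model yields 371, 372, 371 samples for the first three chunks, since 1024 * 16000 / 44100 is about 371.5. The per-chunk output is therefore neither constant nor 1024, so the chunks have to be accumulated until a full encoder frame is available.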