[Libav-user] Encoding FLOAT PCM to OGG using libav

Thu Jul 31 23:33:20 CEST 2014

On Thu, 31 Jul 2014 09:24:38 +0200
Charles-Henri DUMALIN <dumalin.ch.maillist at gmail.com> wrote:

> I am currently trying to convert a raw PCM float buffer to an OGG-encoded
> file. I tried several libraries for the encoding process and finally
> chose libavcodec.
> 
> What I want to do, precisely, is take the float buffer ([-1;1]) provided by
> my audio library and turn it into a char buffer of encoded OGG data.
> 
> I managed to encode the float buffer to a buffer of encoded MP2 with this
> (proof of concept) code:
> 
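> #include <limits.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <libavcodec/avcodec.h>
> 
> static AVCodec *codec;
> static AVCodecContext *c;
> static AVFrame *frame;
> static AVPacket pkt;
> static int16_t *samples; /* interleaved S16 input for the encoder */
> static int frameEncoded;
> 
> FILE *file;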
> 
> int main(int argc, char *argv[])
> {
>     file = fopen("file.ogg", "wb"); /* binary mode: the output is encoded data */
> 
>     long ret;
> 
>     avcodec_register_all();
> 
>     codec = avcodec_find_encoder(AV_CODEC_ID_MP2);
>     if (!codec) {
>         fprintf(stderr, "codec not found\n");
>         exit(1);
>     }
> 
>     c = avcodec_alloc_context3(codec);
>     if (!c) {
>         fprintf(stderr, "Could not allocate codec context\n");
>         exit(1);
>     }
> 
>     c->bit_rate = 256000;
>     c->sample_rate = 44100;
>     c->channels = 2;
>     c->sample_fmt = AV_SAMPLE_FMT_S16;
>     c->channel_layout = AV_CH_LAYOUT_STEREO;
> 
>     /* open it */
>     if (avcodec_open2(c, codec, NULL) < 0) {
>         fprintf(stderr, "Could not open codec\n");
>         exit(1);
>     }
> 
> 
>     /* frame containing input raw audio */
>     frame = av_frame_alloc();
>     if (!frame) {
>         fprintf(stderr, "Could not allocate audio frame\n");
>         exit(1);
>     }
> 
>     frame->nb_samples     = c->frame_size;
>     frame->format         = c->sample_fmt;
>     frame->channel_layout = c->channel_layout;
> 
>     /* the codec gives us the frame size, in samples,
>      * we calculate the size of the samples buffer in bytes */
>     int buffer_size = av_samples_get_buffer_size(NULL, c->channels,
>                                                  c->frame_size,
>                                                  c->sample_fmt, 0);
>     if (buffer_size < 0) {
>         fprintf(stderr, "Could not get sample buffer size\n");
>         exit(1);
>     }
>     samples = av_malloc(buffer_size);
>     if (!samples) {
>         fprintf(stderr, "Could not allocate %d bytes for samples buffer\n",
>                 buffer_size);
>         exit(1);
>     }
>     /* setup the data pointers in the AVFrame */
>     ret = avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt,
>                                    (const uint8_t*)samples, buffer_size, 0);
>     if (ret < 0) {
>         fprintf(stderr, "Could not setup audio frame\n");
>         exit(1);
>     }
> }
> 
> void  myLibraryCallback(float *inbuffer, unsigned int length)
> {
>     for(int j = 0; j < (2 * length); j++) {
>         if(frameEncoded >= (c->frame_size *2)) {
>             int avret, got_output;
> 
>             av_init_packet(&pkt);
>             pkt.data = NULL; // packet data will be allocated by the encoder
>             pkt.size = 0;
> 
>             avret = avcodec_encode_audio2(c, &pkt, frame, &got_output);
>             if (avret < 0) {
>                 fprintf(stderr, "Error encoding audio frame\n");
>                 exit(1);
>             }
>             if (got_output) {
>                 fwrite(pkt.data, 1, pkt.size, file);
>                 av_free_packet(&pkt);
>             }
> 
>             frameEncoded = 0;
>         }
> 
>         samples[frameEncoded] = inbuffer[j] * SHRT_MAX;
>         frameEncoded++;
>     }
> }
> 
> 
> The code is really simple: I initialize libavcodec the usual way, then my
> audio library sends me processed interleaved float PCM in [-1;1] at 44.1 kHz,
> along with the number of floats (usually 1024) in inbuffer for each channel
> (2 for stereo). So inbuffer usually contains 2048 floats.
> 
> That was easy since I just needed here to convert my PCM to 16P, both
> interleaved. Moreover it is possible to code a 16P sample on a single char.
> 
> Now I would like to apply this to OGG, which needs a sample format of
> AV_SAMPLE_FMT_FLTP. Since my native format is AV_SAMPLE_FMT_FLT, it should
> only take some deinterleaving, which is really easy to do.
> 
> The points I don't get are:
> 
>    1. How can you put float samples in a char buffer? Do we treat them
>    as-is (float* floatSamples = (float*) samples)? If so, what does the
>    sample number avcodec gives you mean? Is it the number of floats or chars?

The sample number is the number of floats per channel. So if you have a
stereo non-interleaved (aka planar) float stream with 32 bytes per
channel (aka plane in this case), ffmpeg will think of it as 8 samples
(32 bytes / 4 bytes per float), regardless of the channel count.
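
As a rough illustration of that arithmetic (plain C, no libav calls; the
helper name samples_per_plane is made up for this example), the sample
count is the plane size in bytes divided by the size of one float:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libav: number of samples held by one
 * plane (one channel) of planar float audio, given the plane size in bytes. */
size_t samples_per_plane(size_t bytes_per_plane)
{
    /* A plane of float audio must hold a whole number of floats. */
    assert(bytes_per_plane % sizeof(float) == 0);
    return bytes_per_plane / sizeof(float);
}
```

Assuming 4-byte floats, a 32-byte plane gives 8, and that is the kind of
value that goes into frame->nb_samples. The number of channels does not
enter into it.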

>    2. How can you send data in two buffers (one for left, one for right)
>    when avcodec_fill_audio_frame only takes a (uint8_t*) parameter and not a
>    (uint8_t**) for multiple channels? Does it completely change the previous
>    sample code?

I think avcodec_fill_audio_frame() should be considered legacy. The
best way is to create an AVFrame with:

frame = av_frame_alloc();
frame->format = ...;
... set other parameters ...
av_frame_get_buffer(frame, 32); // allocates frame data, using the params

And then you copy in your source data.
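
For interleaved float input (AV_SAMPLE_FMT_FLT) and a planar target
(AV_SAMPLE_FMT_FLTP), the copy step might look roughly like the sketch
below. It is plain C so that it stands alone: the planes array stands in
for the per-channel pointers of the frame (for planar audio, that would be
frame->extended_data[ch] cast to float *), and the function name
deinterleave_float is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libav: split interleaved float samples
 * (L R L R ...) into one plane per channel.
 *
 * in          interleaved input, nb_samples * channels floats
 * planes      one destination pointer per channel, each with room for
 *             nb_samples floats
 * channels    number of channels (2 for stereo)
 * nb_samples  number of samples per channel
 */
void deinterleave_float(const float *in, float *const *planes,
                        size_t channels, size_t nb_samples)
{
    assert(in != NULL && planes != NULL);

    for (size_t i = 0; i < nb_samples; i++)
        for (size_t ch = 0; ch < channels; ch++)
            planes[ch][i] = in[i * channels + ch];
}
```

With a real frame, nb_samples here would be the same value as
frame->nb_samples, and the planes would be the buffers that
av_frame_get_buffer() allocated.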

> I tried to find some answers myself and I have made a LOT of experiments so
> far, but I failed on these points. Since documentation on them is sorely
> lacking, I would be very grateful if you had answers.
> 
> Thank you !