<div dir="ltr"><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
I am currently trying to convert a raw PCM Float buffer to an OGG encoded file. I tried several library to do the encoding process and I finally chose libavcodec.</p><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
What I precisely want to do is get the float buffer ([-1;1]) provided by my audio library and turn it to a char buffer of encoded ogg data.</p><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
I managed to encode the float buffer to a buffer of encoded MP2 with this (proof of concept) code:</p><pre style="margin-top:0px;margin-bottom:10px;padding:5px;border:0px;font-size:14px;vertical-align:baseline;font-family:Consolas,Menlo,Monaco,'Lucida Console','Liberation Mono','DejaVu Sans Mono','Bitstream Vera Sans Mono','Courier New',monospace,serif;overflow:auto;width:auto;max-height:600px;word-wrap:normal;color:rgb(0,0,0);line-height:17.804800033569336px;background:rgb(238,238,238)">
<code style="margin:0px;padding:0px;border:0px;vertical-align:baseline;font-family:Consolas,Menlo,Monaco,'Lucida Console','Liberation Mono','DejaVu Sans Mono','Bitstream Vera Sans Mono','Courier New',monospace,serif;white-space:inherit;background-image:initial;background-repeat:initial">static int frameEncoded;
FILE *file;
int main(int argc, char *argv[])
{
file = fopen("file.ogg", "w+");
long ret;
avcodec_register_all();
codec = avcodec_find_encoder(AV_CODEC_ID_MP2);
if (!codec) {
fprintf(stderr, "codec not found\n");
exit(1);
}
c = avcodec_alloc_context3(NULL);
c->bit_rate = 256000;
c->sample_rate = 44100;
c->channels = 2;
c->sample_fmt = AV_SAMPLE_FMT_S16;
c->channel_layout = AV_CH_LAYOUT_STEREO;
/* open it */
if (avcodec_open2(c, codec, NULL) < 0) {
fprintf(stderr, "Could not open codec\n");
exit(1);
}
/* frame containing input raw audio */
frame = av_frame_alloc();
if (!frame) {
fprintf(stderr, "Could not allocate audio frame\n");
exit(1);
}
frame->nb_samples = c->frame_size;
frame->format = c->sample_fmt;
frame->channel_layout = c->channel_layout;
/* the codec gives us the frame size, in samples,
* we calculate the size of the samples buffer in bytes */
int buffer_size = av_samples_get_buffer_size(NULL, c->channels, c->frame_size,
c->sample_fmt, 0);
if (buffer_size < 0) {
fprintf(stderr, "Could not get sample buffer size\n");
exit(1);
}
samples = av_malloc(buffer_size);
if (!samples) {
fprintf(stderr, "Could not allocate %d bytes for samples buffer\n",
buffer_size);
exit(1);
}
/* setup the data pointers in the AVFrame */
ret = avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt,
(const uint8_t*)samples, buffer_size, 0);
if (ret < 0) {
fprintf(stderr, "Could not setup audio frame\n");
exit(1);
}
}
void myLibraryCallback(float *inbuffer, unsigned int length)
{
for(int j = 0; j < (2 * length); j++) {
if(frameEncoded >= (c->frame_size *2)) {
int avret, got_output;
av_init_packet(&pkt);
pkt.data = NULL; // packet data will be allocated by the encoder
pkt.size = 0;
avret = avcodec_encode_audio2(c, &pkt, frame, &got_output);
if (avret < 0) {
fprintf(stderr, "Error encoding audio frame\n");
exit(1);
}
if (got_output) {
fwrite(pkt.data, 1, pkt.size, file);
av_free_packet(&pkt);
}
frameEncoded = 0;
}
samples[frameEncoded] = inbuffer[j] * SHRT_MAX;
frameEncoded++;
}
}</code></pre><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
<br></p><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
The code is really simple, I initialize libavencode the usual way, then my audio library sends me processed PCM FLOAT [-1;1] interleaved at 44.1Khz and the number of floats (usually 1024) in the inbuffer for each channel (2 for stereo). So usually, inbuffer contains 2048 floats.</p>
<p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
That was easy since I just needed here to convert my PCM to 16P, both interleaved. Moreover it is possible to code a 16P sample on a single char.</p><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
Now I would like to apply this to OGG which needs a sample format of AV_SAMPLE_FMT_FLTP. Since my native format is AV_SAMPLE_FMT_FLT, it should only be some desinterleaving. Which is really easy to do.</p><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
The points I don't get are:</p><ol style="margin:0px 0px 1em 30px;padding:0px;border:0px;font-size:14px;vertical-align:baseline;list-style-position:initial;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
<li style="margin:0px;padding:0px;border:0px;vertical-align:baseline;background:transparent">How can you send a float buffer on a char buffer ? Do we treat them as-is (float* floatSamples = (float*) samples) ? If so, what means the sample number avcodec gives you ? Is it the number of floats or chars ?</li>
<li style="margin:0px;padding:0px;border:0px;vertical-align:baseline;background:transparent">How can you send datas on two buffers (one for left, one for right) when avcodec_fill_audio_frame only takes a (uint8_t*) parameter and not a (uint8_t**) for multiple channels ? Does-it completely change the previous sample code ?</li>
</ol><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
I tried to find some answers myself and I made a LOT of experiments so far but I failed on theses points. Since there is a huge lack of documentation on these, I would be very grateful if you had answers.</p><p style="margin:0px 0px 1em;padding:0px;border:0px;font-size:14px;vertical-align:baseline;clear:both;color:rgb(0,0,0);font-family:Arial,'Liberation Sans','DejaVu Sans',sans-serif;line-height:17.804800033569336px;background-image:initial;background-repeat:initial">
Thank you !</p></div>