[FFmpeg-devel] [PATCH] speex in ogg muxer

Sun Jun 28 21:43:28 CEST 2009

Baptiste Coudurier wrote:
> Justin Ruggles wrote:
>> Baptiste Coudurier wrote:
>>> Justin Ruggles wrote:
>>>> Baptiste Coudurier wrote:
>>>>> Justin Ruggles wrote:
>>>>>> Baptiste Coudurier wrote:
>>>>>>> Justin Ruggles wrote:
>>>>>>>> Baptiste Coudurier wrote:
>>>>>>>>> Hi Justin,
>>>>>>>>>
>>>>>>>>> Justin Ruggles wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This patch adds speex support to the ogg muxer.  It basically does the
>>>>>>>>>> same thing as Ogg/FLAC, in that the 1st packet is a global header from
>>>>>>>>>> extradata and the 2nd packet is vorbiscomment metadata.
>>>>>>>>>>
>>>>>>>>>> This seems to work just fine for speex-to-speex stream copy, but
>>>>>>>>>> probably would not work for flv-to-speex because flv doesn't to have any
>>>>>>>>>> speex extradata from what I can tell.  I guess a header could be
>>>>>>>>>> constructed, but that would be a separate patch to the flv demuxer.
>>>>>>>>>>
>>>>>>>>>> This patch is a precursor to libspeex encoding support, which I'll be
>>>>>>>>>> sending shortly.
>>>>>>>>>>
>>>>>>>>>> -Justin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> Index: libavformat/oggenc.c
>>>>>>>>>> ===================================================================
>>>>>>>>>> --- libavformat/oggenc.c	(revision 19244)
>>>>>>>>>> +++ libavformat/oggenc.c	(working copy)
>>>>>>>>>> @@ -104,17 +125,39 @@
>>>>>>>>>>      bytestream_put_byte(&p, 0x00); // streaminfo
>>>>>>>>>>      bytestream_put_be24(&p, 34);
>>>>>>>>>>      bytestream_put_buffer(&p, streaminfo, FLAC_STREAMINFO_SIZE);
>>>>>>>>>> -    oggstream->header_len[1] = 1+3+4+strlen(vendor)+4;
>>>>>>>>>> -    oggstream->header[1] = av_mallocz(oggstream->header_len[1]);
>>>>>>>>>> -    p = oggstream->header[1];
>>>>>>>>>> +    p = ogg_write_vorbiscomment(4, bitexact, &oggstream->header_len[1]);
>>>>>>>>>> +    if (!p)
>>>>>>>>>> +        return -1;
>>>>>>>>> AVERROR(ENOMEM)
>>>>>>>> fixed.
>>>>>>>>
>>>>>>>>>> @@ -144,6 +188,12 @@
>>>>>>>>>>                  av_log(s, AV_LOG_ERROR, "Extradata corrupted\n");
>>>>>>>>>>                  av_freep(&st->priv_data);
>>>>>>>>>>              }
>>>>>>>>>> +        } else if (st->codec->codec_id == CODEC_ID_SPEEX) {
>>>>>>>>>> +            if (ogg_build_speex_headers(st->codec, oggstream,
>>>>>>>>>> +                                        st->codec->flags & CODEC_FLAG_BITEXACT) < 0) {
>>>>>>>>>> +                av_log(s, AV_LOG_ERROR, "error writing Speex headers\n");
>>>>>>>>>> +                av_freep(&st->priv_data);
>>>>>>>>>> +            }
>>>>>>>>> return error here with the return code of the func :>
>>>>>>>>> Yes, it seems flac miss it too, this needs a fix.
>>>>>>>>>
>>>>>>>>> patch fine otherwise, maybe a micro bump for avformat would be nice.
>>>>>>>> fixed. new patch attached. the new patch also differs in that it
>>>>>>>> overrides the extra_headers field in the Speex header to be 0 since only
>>>>>>>> the 2 required headers are written.
>>>>>>>>
>>>>>>> patch ok if it works :>
>>>>>> Hmm... I've done several more tests and it does not quite work as-is for
>>>>>> all samples.  Here is what I have run into.  The tests so far are for
>>>>>> ogg-to-ogg stream copy.
>>>>>>
>>>>>> - When the source has more than 1 frame per packet, the resulting copy
>>>>>> plays fine with ffmpeg/ffplay but is quick and choppy with speexdec.  I
>>>>>> was able to fix this by modifying the ogg/speex demuxer to set
>>>>>> avctx->frame_size to the number of samples in a packet instead of in a
>>>>>> frame.  I also had to update the libspeex decoder accordingly.  Maybe
>>>>>> this is the wrong way to go about it though.  I'm guessing it is a
>>>>>> timestamp/granulepos issue, but I don't know enough about Ogg to tell
>>>>>> more than that.
>>>>> Sounds like a parser is needed.
>>>> ah, indeed. that shouldn't be too difficult. i'll give it a try.
>>>>
>>>> once the frames are separated into individual AVPackets, what should i
>>>> do in the ogg muxer if i want it to put multiple frames in a single ogg
>>>> packet?
>>> No you don't :)
>> That's not good.  Speex supports it for a reason.  With 1 frame per
>> packet there is something like 30% container overhead.
> 
> Humm, well, I guess your problem is to set pkt->pts correctly in the
> encoder and make sure pkt->duration is correctly set.

One of the issues here is that speex frames do not have to end on a byte
boundary.  When they are grouped together in a packet, the whole packet
is padded at the end with a terminator code to end on a byte boundary.
So a parser would not be able to split frames unless it adds padding to
each frame, which I don't think is desirable.  So the best solution
seems to be to treat the group of frames as a single frame and set the
duration correctly.

In order to set pkt->duration correctly, one needs to know how many
speex frames are in the packet.  This can be found in the extradata if
there is any, but if not, then it could still be found.  My first
thought is a parser that does not split or merge, but only analyzes the
speex frames in the packet and sets avctx->frame_size accordingly.  This
parser would only be used by demuxers that do not have speex extradata
(e.g. flv).

Does that sound like a good place to start?

-Justin