[FFmpeg-devel] [PATCH] Set correct frame_size for Speex decoding

Michael Niedermayer michaelni
Sun Aug 16 01:38:40 CEST 2009

On Sat, Aug 15, 2009 at 07:20:29PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> > On Sat, Aug 15, 2009 at 01:30:10PM -0400, Justin Ruggles wrote:
> >> Justin Ruggles wrote:
> >>
> >>> Michael Niedermayer wrote:
> >>>> On Sat, Aug 01, 2009 at 06:53:11AM -0400, Justin Ruggles wrote:
> >>>>> Michael Niedermayer wrote:
> >>>>>> On Fri, Jul 31, 2009 at 07:34:00PM -0400, Justin Ruggles wrote:
> >>>>>>> Michael Niedermayer wrote:
> >>>>>>>> On Fri, Jul 31, 2009 at 06:48:59PM -0400, Justin Ruggles wrote:
> >>>>>>>>> Michael Niedermayer wrote:
> >>>>>>>>>> On Thu, Jul 30, 2009 at 07:25:17PM -0400, Justin Ruggles wrote:
> >>>>>>>>>>> Justin Ruggles wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Currently AVCodecContext.frame_size is not set correctly for Speex.
> >>>>>>>>>>>> Since the Ogg and FLV demuxers and the libspeex decoder handle a full
> >>>>>>>>>>>> packet as a single frame, frame_size should be set to the Speex
> >>>>>>>>>>>> frame_size * frames_per_packet.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If frames_per_packet is not specified in the Speex header, or if there
> >>>>>>>>>>>> is no header, it can be determined after decoding the first packet.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Stream copy is not implemented yet for Speex, but once it is, a parser
> >>>>>>>>>>>> will be able to set all the stream parameters instead of the decoder
> >>>>>>>>>>>> when the header is missing or incomplete.
> >>>>>>>>>>> ping.
> >>>>>>>>>> it might be helpfull if you say who you expect to review this
> >>>>>>>>> In general, I'm looking for an ok on having lavf and lavc treat a whole
> >>>>>>>>> Speex packet as a single frame.  After considering and trying to code
> >>>>>>>>> the split-then-join idea it did not seem like a very clean solution, and
> >>>>>>>>> it is not really necessary.  This is my general plan:
> >>>>>>>> which values of frames_per_packet does each container allow?
> >>>>>>>> that is each container that supports speex
> >>>>>>> AFAIK, Ogg is only limited by the Speex header, which supports up to
> >>>>>>> INT32_MAX. (libspeex reads the 4-byte value in the header as a signed int32)
> >>>>>>>
> >>>>>>> For FLV, it's unclear.  Here is a quote from Art:
> >>>>>>>> Here's what I found.  Set the speex frames per packet all the way from 1 up
> >>>>>>>> to 8, and it appears they all now work with Flash Player (I erroneously
> >>>>>>>> reported that 1 would not work before, but at least with the latest version
> >>>>>>>> that is not the case).  Setting 9 frames per packet causes flash player to
> >>>>>>>> start stuttering.  Set 10 or more frames per packet causes flash player to
> >>>>>>>> crash, bringing down the browser with it.
> >>>>>>> The speexenc commandline program allows encoding between 1 and 10 frames
> >>>>>>> per packet.
> >>>>>> so if you have a ogg with 11 -acodec copy to flv will not work, IMO thats a
> >>>>>> bug. How should that be fixed?
> >>>>>> if the parser splits the packets as it should, it should work ...
> >>>>> -acodec copy from Ogg to FLV with 11 frames per packet would work just
> >>>>> fine in FFmpeg, but Flash Player would crash when playing the file.
> >>>>> That seems like a bug in Flash Player to me.
> >>>> sure a crash is a bug but with propriatary formats the official tools are
> >>>> more or less what define the format. One can argue about it of course but
> >>>> de facto ffmpeg generating flv files that flash player cant play is a
> >>>> disadvantage for ffmpeg and would be preceived as a bug in ffmpeg by most.
> >>>> also even if fixed in the next version there still will be many people
> >>>> around who use the old player that doesnt handle it, ffmpeg definitly
> >>>> should not create files that are known to be problematic to 99% of the
> >>>> used players, at least not by default.
> >>> Well, we could not allow creation of FLV files with more than 8 frames
> >>> per packet.  The only downside would be that stream copy would fail when
> >>> the source has more than 8 frames per packet and the destination is FLV.
> > 
> > its not the only downside
> > having a random number of frames packed into one frame (thats actually
> > alraedy odd) means the internal format of speex is pretty much undefined.
> > what meaning do frame size or duration variables have then?
> It's not random.  The header defines how many frames (and also how many
> samples) are in each packet.  In the absence of a header, a parser can
> determine the same information easily.  It is constant for the entire
> stream.
> If a whole speex packet is treated as a single frame, it would behave
> just the same as other codecs.  For FLV there would be a limit on the
> number of samples in a frame, but for other containers there would not be.

i dont think the speex decoder will succeed as treating it like that

> > what exactly is the argument you have that speex should not be handled like
> > every other codec?!
> > split it in a parser, the muxer muxes ONLY a single speex packet per
> > container packet. Any extension from that is low priority and "patch welcome"
> > IMHO ...
> The downside for Speex is the container overhead since individual frames
> are very small.

this is true for many (most?) speech codecs

also if we write our own speex encoder, it will only return one frame at a

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090816/28882fe3/attachment.pgp>

More information about the ffmpeg-devel mailing list