[FFmpeg-devel] FLAC parser inefficiency

Thu Dec 9 19:30:34 CET 2010

On Thu, 2010-12-09 at 12:29 -0500, Justin Ruggles wrote:
> On 12/08/2010 11:22 PM, Uoti Urpala wrote:
> > BTW how is client code supposed to avoid trying to decode the junk
> > packets produced (the "/* Output a junk frame. */" part)? Normally when
> > parsing before decoding you'd want to throw away that data instead.
> > Check for avctx->frame_size being 0 after the parse call? But is that
> > supposed to behave the same way with other parsers?
> 
> 
> All parsers are supposed to output all data they receive.  The only job
> for them is to split it into frames, not filter anything out that it
> thinks are not frames.  For example, damaged frames may not pass the
> parser's test, but if it can find the next valid frame boundary, the
> frame splitting may still be correct and the decoder might be able to
> handle the damaged frame.  It's not up to the parser to decide that.

IMO that's too restrictive a definition of the parser's job. Consider use
cases where you do a byte-based seek or otherwise _expect_ the parsed
stream to start at an arbitrary position which can be in the middle of a
packet. Here trying to decode the initial junk is clearly a bad
decision: it's more likely to cause problems than give any extra audio,
and even if such extra audio could be decoded it would normally be of
very little value. And even if you trust the decoder not to produce any
bad effects when fed junk, FFmpeg decoders at least _do_ produce error
messages, which already makes this solution unsatisfactory.

"Turn a data stream starting at an arbitrary point into one that can be
decoded without errors" is a use case that should be supported by
parsers (at least where reasonably possible - for formats like video
with inter-frame dependencies, dependency errors could be hard to
avoid completely). If the parsers fail to indicate whether the initial
packet is considered junk or not, then the best available solution is
probably to always discard the first packet output by a parser. But
that's clearly not optimal (when considering what the parsers _could_
do), as it can lose valid audio when there was no junk.
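
To illustrate the difference between the two client policies, here is a
minimal sketch using a made-up toy format in which every frame starts
with a single sync byte. Nothing here is the libavcodec parser API or
FLAC: ToyPacket, its is_junk flag, toy_parse() and the 0xA5 sync value
are all hypothetical names invented for the example. It only shows that
"always drop the first packet" loses a valid frame on a clean stream,
while a parser-provided junk indication does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy format: every frame starts with the byte 0xA5.
 * This is an illustration only, not FLAC and not libavcodec. */
#define TOY_SYNC 0xA5

typedef struct ToyPacket {
    const uint8_t *data;
    size_t size;
    int is_junk; /* hypothetical flag: 1 if the chunk does not start at a sync byte */
} ToyPacket;

/* Split off the next chunk: everything from buf[0] up to, but not
 * including, the next sync byte. Returns the number of bytes consumed. */
static size_t toy_parse(const uint8_t *buf, size_t len, ToyPacket *pkt)
{
    size_t end = 1;

    assert(buf != NULL && len > 0);
    while (end < len && buf[end] != TOY_SYNC)
        end++;

    pkt->data    = buf;
    pkt->size    = end;
    pkt->is_junk = buf[0] != TOY_SYNC;
    return end;
}

/* Policy A: the parser says which chunks are junk; skip only those. */
static int count_decodable_with_flag(const uint8_t *buf, size_t len)
{
    int n = 0;

    while (len > 0) {
        ToyPacket pkt;
        size_t used = toy_parse(buf, len, &pkt);
        if (!pkt.is_junk)
            n++;
        buf += used;
        len -= used;
    }
    return n;
}

/* Policy B: no junk indication, so always discard the first chunk. */
static int count_decodable_drop_first(const uint8_t *buf, size_t len)
{
    int n = 0, first = 1;

    while (len > 0) {
        ToyPacket pkt;
        size_t used = toy_parse(buf, len, &pkt);
        if (!first)
            n++;
        first = 0;
        buf += used;
        len -= used;
    }
    return n;
}
```

On a stream that starts mid-packet both policies pass the same frames to
the decoder; on a stream that starts exactly at a frame boundary, policy
B throws away one good frame that policy A keeps.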