[FFmpeg-devel] [PATCH] FLAC parser
Sun Mar 29 04:24:04 CEST 2009
Michael Niedermayer wrote:
> On Sat, Mar 28, 2009 at 06:41:02PM -0400, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>> On Fri, Mar 27, 2009 at 01:05:47AM -0400, Justin Ruggles wrote:
>>>> I finally got a working FLAC parser without resorting to buffering
>>>> max_frame_size bytes like the FLAC decoder does. It requires a slight
>>>> change to ff_combine_frame() since the header can be up to 16 bytes long
>>>> and ff_combine_frame() currently only supports up to 8 bytes of overread
>>>> data (FF_INPUT_BUFFER_PADDING_SIZE).
>>>> This works with all samples I've tested, but it would be great to have
>>>> more tested as well. There are quite a few corner cases, and while I've
>>>> tried to think of everything I can, I might have missed something.
>>> If i understand this correctly,
>>> this is a probabilistic parser, that is it will fail once in 4tb of
>>> random data at least, but due to the max crc stuff sooner.
>>> and as data is not random it could fail more frequently
>> Yes. The only alternatives I can see would be decoding twice (although
>> inverse prediction, and interleaving could be skipped in the parser) or
>> not having a parser, which prevents stream copy to most other containers.
> You can do better without decoding twice
> first find all valid headers (sync + header crc + a few bits in the header)
When you say "all valid headers" do you mean to buffer the whole file?
Would it be ok to check sequences in a fixed number of headers or a
fixed number of bytes (this would have to be like pretty large to be useful)
> that would give you something like:
> H H H H H H
> then calculate the frame crc between each 2 headers
> H H H H H H
> Then find the longest connected chain of valid CRCs, if there is another
> non overlapping chain of valid crcs before it pick it instead.
> Return the first frame from the chain.
I just want to make sure I understand you correctly.
H0-->H2 = 1
H1-->H3-->H4-->H5 = 3
H2 = 0
H3-->H4-->H5 = 2
H4-->H5 = 1
This would select the chain starting with H1, and the data between the
start of the buffer and H1 should be returned in this case.
Then the CRCs starting at H0 can be tossed out, and after the next
header is found, calculate 2 new CRCs, H4-->H6 and H5-->H6 and
re-evaluate the sequences.
> also you should favor chains which dont change samplerate or other such
Do you mean to pick a sequence that doesn't change values over an
earlier sequence of the same size that does change values? Or something
with more weight than that?
More information about the ffmpeg-devel