[FFmpeg-devel] [PATCH] FLAC parser

Sun Mar 29 03:00:48 CEST 2009

Michael Niedermayer wrote:
> On Sat, Mar 28, 2009 at 06:41:02PM -0400, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>> On Fri, Mar 27, 2009 at 01:05:47AM -0400, Justin Ruggles wrote:
>>>> Hi,
>>>>
>>>> I finally got a working FLAC parser without resorting to buffering
>>>> max_frame_size bytes like the FLAC decoder does.  It requires a slight
>>>> change to ff_combine_frame() since the header can be up to 16 bytes long
>>>> and ff_combine_frame() currently only supports up to 8 bytes of overread
>>>> data (FF_INPUT_BUFFER_PADDING_SIZE).
>>>>
>>>> This works with all samples I've tested, but it would be great to have
>>>> more tested as well.  There are quite a few corner cases, and while I've
>>>> tried to think of everything I can, I might have missed something.
>>> If I understand this correctly, this is a probabilistic parser; that is,
>>> it will fail at least once in 4 TB of random data, and sooner because of
>>> the max crc stuff. And as data is not random, it could fail more
>>> frequently.
>> Yes. The only alternatives I can see would be decoding twice (although
>> inverse prediction, and interleaving could be skipped in the parser) or
>> not having a parser, which prevents stream copy to most other containers.
>>
>> I came up with a different probability of false positive frame detection
>> for random data, but I might be calculating incorrectly...  For
>> simplicity, given the smallest size frame header, there are 180 valid
>> combinations in 48 bits, which is about 1 in 1.56e12.  Then if you take
>> into account the CRC-16 of the previous frame, that makes it around 1 in
>> 1.02e17.  But given that the sync code is mostly a string of 1's, the
>> probability is likely higher than that with non-random data.
> 
> Well, let's first check where our differences come from:
> 15bit for the sync
> about 2-3 bits checked in the header
> 8bit header CRC
> 16bit frame CRC
> 15+3+8+16= 42
> 2^42 ~ 4.4e12, i.e. about 4 TB
> 
> are you maybe counting the sync twice?

I think we were counting different things. Your way seems more correct.
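
For reference, both counts can be reproduced with a few lines of Python.
This is only a sketch of the arithmetic quoted above, assuming uniformly
random input; the variable names are made up for the example and are not
from the patch.

```python
# Sketch only: odds of a false positive frame detection on random data.
# All bit counts are the ones quoted in this thread.

checked_bits = {
    "sync": 15,
    "header fields": 3,   # "about 2-3 bits", taking the upper figure
    "header CRC-8": 8,
    "frame CRC-16": 16,
}
total_bits = sum(checked_bits.values())
odds_by_bits = 2 ** total_bits          # 1 in this many positions

# The earlier count: 180 valid patterns among all 48-bit headers,
# then the CRC-16 of the previous frame on top of that.
odds_header_only = 2 ** 48 / 180
odds_with_crc16 = odds_header_only * 2 ** 16

print(f"bit count:     {total_bits} bits, 1 in {odds_by_bits:.2e}")
print(f"header only:   1 in {odds_header_only:.2e}")
print(f"header + crc:  1 in {odds_with_crc16:.2e}")
```

The two results differ by several orders of magnitude, which is consistent
with the two of us counting different things.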

How do you calculate how many bits are effectively checked when 1 value
of a 4-bit code is invalid? Or 2 values of a 3-bit code, and so on?
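
One possible way to count it, which may not be how you arrived at your
"2-3 bits", is to take the base-2 log of (all values / valid values).
This assumes uniformly random bits and treats each field independently.
The function name below is hypothetical, just for illustration.

```python
import math

def effective_bits(code_bits, invalid_values):
    """Bits of checking provided by a field of code_bits bits in which
    invalid_values of the possible values are rejected.
    Assumes every value is equally likely (random data)."""
    total = 2 ** code_bits
    valid = total - invalid_values
    return math.log2(total / valid)

# 1 invalid value in a 4-bit code, 2 invalid values in a 3-bit code
for bits, invalid in [(4, 1), (3, 2)]:
    print(bits, invalid, round(effective_bits(bits, invalid), 3))
```

By this measure a field with only one or two reserved values contributes
well under one bit, so the per-field contributions would have to be summed
over the whole header to get a total.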

I do like the idea of using header sequences, although the buffer will
be much larger.  I'll respond separately to that email.

-Justin