[FFmpeg-devel] FFV1 Specification
michaelni at gmx.at
Sat Apr 7 14:57:15 CEST 2012
On Sat, Apr 07, 2012 at 10:44:45AM +0930, Rodney Baker wrote:
> On Sat, 7 Apr 2012 07:26:47 Michael Niedermayer wrote:
> > On Fri, Mar 30, 2012 at 11:53:58AM +0200, Michael Niedermayer wrote:
> > > Hi
> > >
> > > Just wanted to announce that I've moved the FFV1 spec to GitHub and
> > > I am working on cleaning it up and updating it to match the existing
> > > implementation.
> > >
> > > see: https://github.com/FFmpeg/FFV1
> > >
> > > patches, pull requests and comments are like always, welcome
> > latest draft at github and at:
> > http://ffmpeg.org/~michael/ffv1-draft/ffv1.html
> > If someone could read through it and point out where it's unclear or
> > incomplete, that would be very helpful!
> > I imagine I can easily miss gaps given that I know the
> > codec pretty well ...
> > Also spellcheck/grammar/formatting tips are welcome too!
> > [...]
> Comments re spelling/grammar/style (I'll leave the technical review to others
> who know what they're talking about). :-)
> Section 3:
> >"In the case of the JPEG2000-RCT colorspace the lines are interleaved to
> >reduce cache trashing as most likely the RCT will be immedeatly converted to
> >RGB during decoding, the order of the lines in the interleaving is again
> >Y,Cb,Cr. "
> In the case of the JPEG2000-RCT colorspace the lines are interleaved to reduce
> cache trashing since it is most likely that the RCT will immediately be
> converted to RGB during decoding; the interleaved coding order is also
> [Not sure about "cache trashing" - sounds too "colloquial" for a technical
> document - is there a better way to say this? Perhaps, "to improve caching
Yes, "to improve caching efficiency" works great.
> >Samples within a plane are coded in raster scan order (left->right, top-
> >bottom), each sample is predicted by the median predictor from samples in the
> same plane and the difference is stored
> s/bottom), each/bottom). Each/ OR s/bottom), each/bottom); each/
> s/stored/stored./ (Apparently missing full-stops in many other places, too).
> Is this sentence incomplete? How is the difference stored?
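
For readers following along, here is a minimal Python sketch of the median prediction step that the quoted sentence describes. It assumes the usual three-input form, median(L, T, L + T - TL), where L, T and TL are the left, top and top-left neighbours; the quoted text does not spell this out, and the function names are made up for the example.

```python
def median3(a, b, c):
    """Return the middle value of three integers."""
    return sorted((a, b, c))[1]


def predict(left, top, top_left):
    """Median prediction from three neighbours in the same plane.

    Assumption: the third candidate is the gradient left + top - top_left.
    """
    return median3(left, top, left + top - top_left)


def residual(sample, left, top, top_left):
    """The difference between the sample and its prediction."""
    return sample - predict(left, top, top_left)
```

The encoder would then hand `residual(...)` to the entropy coder; how that difference is stored is exactly the question raised above.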
> Section 3.1:
> > For the purpose of the predictior and context samples above the coded
> picture are assumed to be 0, right of the coded picture are identical to the
> closest left sample. And left of the coded picture are identical to the top
> right one if such exist or 0.
> s/0, right/0; samples to the right/
> s/left sample. And/left sample; samples to the left/
> s/top right one if such exist or 0/top right sample (if there is one),
> otherwise 0./
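
The border rule in Section 3.1 is easier to check against code than against prose. The sketch below is one possible reading of it, not the reference implementation: `plane` is assumed to be a list of rows, and `sample_at` is a hypothetical helper name.

```python
def sample_at(plane, x, y):
    """Fetch a sample for prediction/context, applying the border rules.

    Assumptions: plane is a non-empty list of equal-length rows and
    y never exceeds the last row (only causal neighbours are requested).
    """
    width = len(plane[0])
    if y < 0:
        # Samples above the coded picture are assumed to be 0.
        return 0
    if x < 0:
        # Samples to the left equal their top-right neighbour,
        # which is 0 when that neighbour lies above the picture.
        return sample_at(plane, x + 1, y - 1)
    if x >= width:
        # Samples to the right equal the closest sample to the left.
        return plane[y][width - 1]
    return plane[y][x]
```

With this reading, the left neighbour of the first sample in a row is the first sample of the row above it.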
> Section 3.6:
> >Instead of coding the n+1 (or n+2 in the case of RCT) bits of the sample
> difference with huffman or range coding only the n (or n+1) least significant
> bits are used as thats enough the recover the original sample. bits in the
> equation below is bits_per_raw_sample+1 for RCT and bits_per_raw_sample
> Instead of coding the n+1 bits of the sample difference with huffman or range
> coding (or n+2 bits, in the case of RCT), only the n (or n+1) least
> significant bits are used, since this is sufficient to recover the original
> In the equation below, bits represents bits_per_raw_sample+1 for RCT or
> bits_per_raw_sample otherwise.
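
Why the low bits are sufficient may not be obvious from the sentence alone, so here is a small Python sketch. The names `fold`, `encode_diff` and `decode_sample` are hypothetical; `bits` plays the role described in the quoted text (bits_per_raw_sample, or one more for RCT).

```python
def fold(diff, bits):
    """Keep the low `bits` bits of diff, read back as a signed value."""
    half = 1 << (bits - 1)
    mask = (1 << bits) - 1
    return ((diff + half) & mask) - half


def encode_diff(sample, pred, bits):
    """Difference actually coded: only `bits` bits instead of bits + 1."""
    return fold(sample - pred, bits)


def decode_sample(pred, diff, bits):
    """Recover the sample; the wraparound undoes the folding."""
    return (pred + diff) & ((1 << bits) - 1)
```

The decoder works modulo 2^bits, so any two differences that agree in their low `bits` bits give back the same sample.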
> s/H.264. But/H.264, but/
> s/situation as well as its slightly worse performance CABAC/situation (as well
> as its slightly worse performance) CABAC/
That makes it look like we care more about patents than performance.
> Non binary values:
> s/integers, we could simply encode/ integers it would be possible to encode/
> s/context, but /context, however/ OR s/context, but/context but/ (I like the
> first option better).
> s/symbol which is not only a waste of memory but also requires more past
> data/symbol which requires both more memory and more past data
> s/reasonable/a reasonably/
> s/Alternatively simply assuming/Alternatively, assuming/
> s/mean like we do in huffman coding mode would be another possibility/
> mean (as in huffman coding) would also be possible/
> s/but due to flexibility and simplicity, another method was chosen, which
> simply/ however, for maximum flexibility and simplicity, the chosen method/
> s/mantisse and sign, the exact contexts which are used/mantissa and sign. The
> exact contexts used/
> s/can probably better be described by the following code then by some english
> text/are best described by the following code, followed by some comments./
> Need definitions in the definitions/glossary section for VLC and ESC (and if
> we are to be pedantic MSB and any other acronyms used, unless they are
> considered to be so commonly in use among the target audience as to be
> completely unambiguous - MSB may well fall into this category).
> Fix spacing between Suffix/non ESC and non ESC/Examples.
> run mode/run length coding/level coding - capitalisation?
> s/mode, and/mode and/
> s/difference, on/ difference. On/
> s/improved the compression rate a bit/slightly improved the compression rate./
> (unless you meant literally "one bit").
> 4.2 Header:
> >version 0 or 1
> >coder_type Coder used, 0 (Golomb Rice), 1 (Range coder), 2 (Range coder with
> custom state transition table)
> >state_transition_delta The range coder custom state transition table. If it
> is not coded, all its elements are assumed to be 0.
> >colorspace_type 0 (YCbCr), 1 (JPEG2000_RCT)
> >chroma_planes 1 for color, 0 for grayscale
> >bits_per_raw_sample The number of bits for each sample, commonly 8, 9, 10 or
> >h_chroma_subsample The subsample factor between luma and chroma width
> (chroma_width = 2^(-log2_h_chroma_subsample) * luma_width)
> >v_chroma_subsample The subsample factor between luma and chroma height
> (chroma_height = 2^(-log2_v_chroma_subsample) * luma_height)
> >alpha_plane 1 if a transparency plane is stored, 0 otherwise
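
As a quick illustration of the subsampling fields, the relation chroma_width = 2^(-log2_h_chroma_subsample) * luma_width can be sketched in Python as below. Rounding up for luma dimensions that are not a multiple of the factor is an assumption of this sketch; the quoted header description does not say how they are handled. `chroma_dim` is a hypothetical helper name.

```python
def chroma_dim(luma_dim, log2_subsample):
    """Chroma width or height for a given luma dimension.

    Assumption: the result is rounded up (ceiling division by
    2**log2_subsample), done here with a negated right shift.
    """
    return -((-luma_dim) >> log2_subsample)
```

For example, a 1920-wide luma plane with a log2 factor of 1 would give a 960-wide chroma plane under this sketch.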
> Need delimiters between value names and descriptions. Might be better in a
This issue seems specific to the HTML output. Maybe it can be fixed by
changing the configuration somehow; tips welcome ...
Other changes integrated.
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The misfortune of the wise is better than the prosperity of the fool.