[Ffmpeg-devel] [PATCH] increase max numbers of B frames

Tue Feb 21 22:07:08 CET 2006

On Tue, 21 Feb 2006, Erik Slagter wrote:
> On Tue, 2006-02-21 at 13:44 +0100, skal wrote:
>
>>> PBBBPBBBP: again, each frame is bigger than it would be in 2 consecutive,
>>> and now there's a decent chance that 67%->75% B-frames isn't enough to
>>> pay off. Though it's a bit more useful than in mpeg4, due to B-pyramid
>>> (which I won't detail here) and miscellaneous other improvements in the
>>> codec.
>>
>> Maybe this point is worth some details: h264 permits use the middle
>> B-slice as reference, making the reference anchors not so far from
>> the remaining inbetween b-slices (non-refs). So, you're back to
>> deciding PbBbP over PbPbP for the refs, which is quite similar to
>> the PBP vs. PPP decision, as far as refs placement is concerned.
>> Well, this is quite an uncharted (and slippery) land, but some
>> are advocating use of GOPs as big as 32 (!):
>
> A very crude and non-representative test ;-) reveals that
>
> - without b_pyramid: 3 b frames is optimal (1784 kb/s)
> - with b_pyramid: 5 b frames is optimal (1636 kb/s)
>
> Both show no significant improvement with larger amounts of B frames
> inserted. Nice to know :-)
>
> For this test I took a short clip, with forced B frames, otherwise the
> encoder would not select more than 1 B frame at a time...

Do not compare just bitrate. Remember that one of the reasons B-frames 
improve compression is that they allow asymmetric allocation of quality, 
whereby the B-frames are quantized more and the P-frames less. But if you 
encode with constant QP, then it quantizes B-frames more and doesn't 
change the P-frames to match, thus decreasing overall quality a little. 
You could encode with a different constant QP, but QP's granularity is too 
coarse for such a comparison.
So run 2pass encodes with the same target bitrate, and compare PSNR or 
some other quality metric. (That being easier than encoding with a target 
PSNR and comparing bitrate.) Keeping in mind that PSNR sucks, it's still 
better than ignoring quality entirely.
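
For reference, a minimal sketch of the PSNR calculation for such a comparison, assuming 8-bit samples given as flat lists of integers. The function name and input format are made up for illustration and do not come from any encoder's API.

```python
import math

def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(peak * peak / mse)

# Hypothetical usage: two encodes at the same target bitrate, each compared
# against the same source samples. The higher PSNR wins under this metric.
source = [100, 100, 100, 100]
encode_a = [101, 99, 101, 99]   # every sample off by 1
encode_b = [102, 98, 102, 98]   # every sample off by 2
better = "a" if psnr(source, encode_a) > psnr(source, encode_b) else "b"
```

In practice you would average this over all frames of both encodes, and only compare encodes whose actual bitrates ended up close to each other.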

> What is the suggested (default) gop size for h264 anyway? I was under
> the impression something like 250 was sort of optimal for (old style)
> mpeg4.

Confusing choice of terms. Skal's 32 was the repetition period, i.e.
IbBbBbBbBbBbBbBbBbBbBbBbBbBbBbBbPbBbBbBbBbBbBbBbBbBbBbBbBbBbBbBbP
... though I'm sure that in practice you wouldn't use a constant period, 
just like a constant number of conventional B-frames isn't optimal. (Ok, 
so some sources really do suggest using 32-frame GOPs with no P-frames. 
They're just wrong.)
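
To make "repetition period" concrete, here is a small sketch that builds such a display-order pattern. It assumes the frames between two anchors simply alternate non-reference b and reference B, starting and ending with b; a real encoder would not use a fixed pattern, and the function name is hypothetical.

```python
def pyramid_pattern(period, num_periods=1):
    """Display-order frame types: one I, then num_periods groups ending in P.

    Each group holds period - 1 frames alternating b (non-reference) and
    B (reference), followed by the P anchor.
    """
    if period < 1 or num_periods < 0:
        raise ValueError("period must be >= 1 and num_periods >= 0")
    between = "".join("b" if i % 2 == 0 else "B" for i in range(period - 1))
    return "I" + (between + "P") * num_periods

short = pyramid_pattern(4, 2)    # the PbBbP case, repeated twice
long_ = pyramid_pattern(32, 2)   # anchors 32 frames apart
```

With a period of 4 this gives the PbBbP arrangement discussed above; a period of 32 puts 31 b/B frames between consecutive anchors.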

There is no optimal GOP size in h264. Bigger GOP = better compression, up 
until 1 GOP = 1 movie scene, at which point increasing the allowed GOP 
size won't affect the encode at all, because the codec will put an 
I-frame at the scenecut no matter how big the GOP is allowed to be.
The only tradeoff is compression vs seeking and error resilience. There 
are other ways to deal with error resilience that are much better than 
extra I-frames, so I consider only compression vs seeking. The default is 
GOP=250 because 10 seconds (at 25 fps) is a reasonable seeking 
granularity, and it's not worth the interface complexity to make it depend 
on framerate. Keep in mind that this is only a worst case; if scenes are 
shorter than 10 seconds, you'll still be able to seek to each scenecut.
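
A rough sketch of that behaviour, assuming the scenecut positions are already known and ignoring things like a minimum keyframe interval. The function names are invented for illustration and this is not how any particular encoder is implemented.

```python
def keyframe_positions(num_frames, scenecuts, max_gop=250):
    """Frame indices that get an I-frame.

    An I-frame is placed at frame 0, at every scenecut, and whenever
    max_gop frames have passed since the previous I-frame.
    """
    cuts = set(scenecuts)
    keyframes = []
    last = None
    for n in range(num_frames):
        if n == 0 or n in cuts or n - last >= max_gop:
            keyframes.append(n)
            last = n
    return keyframes

def worst_case_seek_seconds(max_gop, fps):
    """Longest possible distance between seek points, in seconds."""
    return max_gop / fps

with_cuts = keyframe_positions(600, [100, 180])   # scenes shorter than the limit
no_cuts = keyframe_positions(600, [])             # only the limit applies
```

Raising max_gop beyond the longest scene changes nothing in the output, which is the point made above: the limit only matters for scenes longer than it.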

This is not the same as in mpeg*, where there is an additional factor 
arguing for smaller GOPs: DCT drift. The mpeg4 standard recommends no more 
than about 100 P-frames (actually they give a specific number, I don't 
know where they got it from), and mpeg2 recommends 12-18 frames, or 4-6 
P-frames. As long as you use the same implementation of DCT in both the 
encoder and decoder, the length doesn't matter. But if they differ (and 
many mpeg2 codecs are pretty sloppy about this), then you get accumulated 
error. (For some reason, the most common symptom I've seen of this is 
horizontal green and purple stripes.)
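
A toy model of how the mismatch accumulates, reduced to a single sample value per frame. The two "IDCT" functions below are stand-ins that differ only in rounding; they are assumptions for illustration, not real transforms from any codec.

```python
import math

def idct_reference(x):
    """Stand-in for the encoder's inverse transform: rounds to nearest."""
    return math.floor(x + 0.5)

def idct_sloppy(x):
    """Stand-in for a mismatched decoder: truncates instead of rounding."""
    return math.floor(x)

def drift_per_frame(residuals, start=128):
    """Absolute encoder/decoder difference after each P-frame in a chain.

    Both sides start from the same I-frame value, then each P-frame adds
    its residual to the previous reconstruction, so any mismatch carries
    over into the next frame.
    """
    enc = dec = start
    drift = []
    for r in residuals:
        enc += idct_reference(r)
        dec += idct_sloppy(r)
        drift.append(abs(enc - dec))
    return drift

short_gop = drift_per_frame([0.6] * 4)     # 4 P-frames, then a new I-frame
long_gop = drift_per_frame([0.6] * 100)    # 100 P-frames in a row
```

In this model the error grows with every P-frame and only resets at the next I-frame, which is why a shorter GOP bounds it. With identical functions on both sides the drift stays at zero regardless of length.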

--Loren Merritt