[FFmpeg-devel] [PATCH] libvpx: alt reference frame / lag

On Thu, Jun 17, 2010 at 11:42:28AM -0400, John Koleszar wrote:
> On Thu, Jun 17, 2010 at 12:24 AM, Reimar Döffinger
> <Reimar.Doeffinger at gmx.de> wrote:
> > On Wed, Jun 16, 2010 at 10:30:50PM -0400, John Koleszar wrote:
> >> I still think this multiple packet approach is very much KISS, and
> >> it's not just libvpx that it's simple for. The other part of that rule
> >> is "make it as simple as possible, but no simpler."
> >
> > No it really isn't. It introduces data packets that produce no data.
> > The fact that no other format has this means this needs extensive
> > testing and honestly won't work for quite a few applications
> > (won't work with MEncoder, won't work with MPlayer if you try to
> > override -fps to some fixed value, probably won't work with ffmpeg
> > if you try to use -fps, I doubt it will work with any Video for Windows
> > application, for which VirtualDub is an example, and I think that
> > is not possible to fix).
> >
> Yes, it doesn't work if you only support fixed frame rates. But VFR is
> a valuable feature, and it's required to be supported in WebM
> independent of invisible ARFs, so this isn't a new requirement.

There is a new requirement even for VFR.
For almost all VFR content (to be honest anything except WMV) it does
not really matter if your pts/dts are one frame off.
With your scheme, however, being one frame off would cause judder for
each ARF.
That is no problem if you just pack it with the next frame.
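
A toy model of that judder, assuming a player that forces a fixed rate
and therefore spaces every packet one frame interval apart. The packet
lists and the function names are made up for illustration, this is not
how any particular player is implemented:

```python
from fractions import Fraction

def display_times(packets, fps):
    """Display times of the visible frames when the player assigns one
    frame interval to every packet, whether it is visible or not."""
    interval = Fraction(1, fps)
    return [index * interval
            for index, visible in enumerate(packets)
            if visible]

def gaps(times):
    """Time between each pair of consecutive displayed frames."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

# True = packet yields a displayed frame, False = invisible ARF packet.
separate = [True, True, False, True, True]  # ARF sent as its own packet
packed = [True, True, True, True]           # ARF packed with the next frame

print(gaps(display_times(separate, 25)))  # one gap is twice as long as the rest
print(gaps(display_times(packed, 25)))    # all gaps are equal
```

With the ARF as its own packet, the gap around it is two intervals long.
Packed with the next frame, the packet count matches the frame count and
the spacing stays even.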

> >> >> There are existing
> >> >> applications that very much care about the contents of each reference
> >> >> buffer and what's in each packet, this isn't a hypothetical like
> >> >> decoding a single slice.
> >> >
> >> > Which applications exactly? What exactly are they doing? And why exactly
> >> > do they absolutely need to have things in a separate packet?
> >>
> >> I'm not going to name names, but I'm talking specifically about video
> >> conferencing applications. I should have been more precise here --
> >> these applications aren't using invisible frames today (though they do
> >> use the alt-ref buffer) but I called them out because they're the type
> >> of applications that are *very* concerned with what's going on in the
> >> encoder, they will want to use invisible frames in the future, and
> >> they'll need access to the frames in the most fine-grained way
> >> possible.
> >
> > Do you have any argument except repeating that they will need that?
> > I can only assume they will want to drop frames behind anyone's back.
> > In that case the questions are
> > 1) Do they really need to be able to drop the frame after ARF?
> > 2) Do they really have to be able to do that without parsing the ARF?
> > 3) Do they really need to be able to do that before having received
> >    the full data of the frame following the ARF?
> >
> > Even if all of them are true, it would be possible to append extra data
> > to help this case after each frame that would allow splitting of ARF
> > data, which should be backwards-compatible with all existing decoders
> > and even for any other application it should only impede their ability
> > to drop the frame right after an ARF.
> >
> Here's another example: consider a VNC type application. You could use
> the ARF buffer to keep a low-quality version of parts of the screen
> that are currently hidden. You might go a long time without ever
> updating a visible part of the screen, but still do updates to the
> hidden part of the screen. This could give a better experience when
> dragging windows around. In this case, there would be many invisible
> packets, and there wouldn't be a visible packet to pack them with.

Well, at least you still won't have to reuse the same time stamp for all
of them, so the main problem does not really exist.
Also I would assume that VP8 has some code that basically just says:
100% non-coded frame, just re-display the last one, which could be used
if really wanted.
And even AVI has always (?) supported 0-size packets to indicate "nothing
new here", so having packets that do not really change anything (even
non-0 size ones I think) isn't anything that new and has a good chance
of working.
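
For reference, whether a VP8 packet produces any output can be read
without decoding it: the 3-byte frame tag described in RFC 6386 carries
a show_frame bit. A minimal sketch of reading that tag follows. The
function name and the returned dict layout are only illustrative, they
are not part of libvpx or FFmpeg:

```python
def parse_vp8_frame_tag(data):
    """Decode the 3-byte little-endian frame tag at the start of a VP8 frame."""
    if len(data) < 3:
        raise ValueError("a VP8 frame tag needs at least 3 bytes")
    tag = data[0] | (data[1] << 8) | (data[2] << 16)
    return {
        "key_frame": (tag & 1) == 0,          # bit 0 is 0 for key frames
        "version": (tag >> 1) & 0x7,          # bits 1-3
        "show_frame": bool((tag >> 4) & 1),   # bit 4, 0 means invisible
        "first_part_size": (tag >> 5) & 0x7FFFF,  # bits 5-23
    }

# An inter frame with show_frame cleared, i.e. an invisible frame.
print(parse_vp8_frame_tag(bytes([0x41, 0x01, 0x00]))["show_frame"])
```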

> >> I've been through a lot of the advantages of keeping the data
> >> separate, but it mostly boils down to staying out of the way of
> >> applications that know what they're doing
> >
> > I think you are arguing that we should make things more complex for
> > everyone for the sake of simplifying things for a use-case that
> > currently does not exist and you don't specify well enough so we
> > could suggest an alternative.
> >
> I don't want to be in the business of saying what people can and can't
> do with the codec, so I'm looking to provide flexibility. Yes, ARFs
> are a new feature that other codecs don't have. No, I don't know all
> the ways that people will find to use them yet.

Packing ARF does not make anything impossible, it just might make some
cases (potential ones that nobody in reality may ever care about) more
difficult, and even that can easily be "fixed" with just a tiny bit of
out-of-band data.
If there is no place anywhere at all for a bit of user-specific data,
neither in VP8 nor in the chosen container, then things might get a bit
annoying.
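
To make the out-of-band idea concrete, here is a rough sketch, assuming
the only side information needed is the byte length of the ARF and that
the container has somewhere to carry it. Both helper names are
hypothetical, and nothing like this exists in libvpx or WebM as far as
this thread says:

```python
def pack_arf(arf, frame):
    """Concatenate an invisible ARF and the following visible frame.

    Returns the packed packet plus the ARF length, which would travel
    out of band (hypothetically as container-level user data).
    """
    return arf + frame, len(arf)

def split_arf(packet, arf_len):
    """Recover the ARF and the visible frame from a packed packet."""
    if not 0 <= arf_len <= len(packet):
        raise ValueError("ARF length does not fit the packet")
    return packet[:arf_len], packet[arf_len:]

packet, arf_len = pack_arf(b"ARF-DATA", b"FRAME-DATA")
arf, frame = split_arf(packet, arf_len)
```

An application that wants to drop the visible frame but keep the ARF
would pass only the first part on to the decoder. Everyone else can
ignore the side information and feed the packed packet through
unchanged.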

