[FFmpeg-devel] [PATCH] libvpx: alt reference frame / lag

Thu Jun 17 06:24:12 CEST 2010

On Wed, Jun 16, 2010 at 10:30:50PM -0400, John Koleszar wrote:
> I still think this multiple packet approach is very much KISS, and
> it's not just libvpx that it's simple for. The other part of that rule
> is "make it as simple as possible, but no simpler."

No, it really isn't. It introduces data packets that produce no data.
The fact that no other format has this means this needs extensive
testing and honestly won't work for quite a few applications
(won't work with MEncoder, won't work with MPlayer if you try to
override -fps to some fixed value, probably won't work with ffmpeg
if you try to use -fps, I doubt it will work with any Video for Windows
application such as VirtualDub, and I do not think that can be
fixed).

> >> There are existing
> >> applications that very much care about the contents of each reference
> >> buffer and what's in each packet, this isn't a hypothetical like
> >> decoding a single slice.
> >
> > Which applications exactly? What exactly are they doing? And why exactly
> > do they absolutely need to have things in a separate packet?
> 
> I'm not going to name names, but I'm talking specifically about video
> conferencing applications. I should have been more precise here --
> these applications aren't using invisible frames today (though they do
> use the alt-ref buffer) but I called them out because they're the type
> of applications that are *very* concerned with what's going on in the
> encoder, they will want to use invisible frames in the future, and
> they'll need access to the frames in the most fine-grained way
> possible.

Do you have any argument except repeating that they will need that?
I can only assume they will want to drop frames behind anyone's back.
In that case the questions are
1) Do they really need to be able to drop the frame after ARF?
2) Do they really have to be able to do that without parsing the ARF?
3) Do they really need to be able to do that before having received
   the full data of the frame following the ARF?

Even if the answer to all of them is yes, it would be possible to append
extra data after each frame that allows the ARF data to be split off.
That should be backwards-compatible with all existing decoders, and for
any other application it should only impede the ability to drop the
frame right after an ARF.
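
To make the idea concrete, here is a rough sketch of such a trailer.
Everything in it is hypothetical: the "ARFS" marker, the 8-byte layout
and the function names are made up for illustration and are not part of
libvpx or FFmpeg. It assumes the ARF and the frame following it are
stored back to back in one packet, with the ARF length appended at the
end so that an application can split the two without parsing the ARF.

```python
import struct

# Hypothetical trailer: 32-bit little-endian ARF length, then a 4-byte marker.
MAGIC = b"ARFS"
TRAILER = struct.Struct("<I4s")


def pack_combined(arf: bytes, visible: bytes) -> bytes:
    """Store the ARF and the following frame in one packet, plus a trailer."""
    return arf + visible + TRAILER.pack(len(arf), MAGIC)


def split_combined(packet: bytes):
    """Return (arf, visible); arf is None when no valid trailer is found."""
    if len(packet) >= TRAILER.size:
        payload_len = len(packet) - TRAILER.size
        arf_len, magic = TRAILER.unpack_from(packet, payload_len)
        if magic == MAGIC and arf_len <= payload_len:
            return packet[:arf_len], packet[arf_len:payload_len]
    # No trailer: treat the whole packet as a single ordinary frame.
    return None, packet
```

An application that wants the ARF separately calls split_combined();
everything else passes the packet on unchanged. A real design would need
a more robust marker than this, since an ordinary frame could end in the
same four bytes by chance.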

> I've been through a lot of the advantages of keeping the data
> separate, but it mostly boils down to staying out of the way of
> applications that know what they're doing

I think you are arguing that we should make things more complex for
everyone for the sake of simplifying things for a use-case that
currently does not exist and that you don't specify well enough for us
to suggest an alternative.