[FFmpeg-devel] Understanding motion vectors from ff_print_debug_info()

Wed Aug 1 07:55:51 CEST 2007

On Sun, 29 Jul 2007, Pravin Bhat wrote:
> Loren Merritt wrote:
>>
>> Sub-sampling is where you delete some of the samples, leaving lower
>> resolution. That is not happening here.
>> Motion vectors are specified to fractional-sample precision, e.g. a vector
>> of (0.5, 0) means translate the video by half a pixel (interpolating
>> between samples in the reference frame).
>> quarter_sample=1 means the precision is 0.25, and the alternative is 0.5.
>> The numbers in motion_val[] are fixed-point representations of those
>> fractional vectors.
>
> Great. That explains a lot!
> How does one obtain the precision of the motion vectors for a given
> codec. For example, AVFrame.motion_subsample_log2 is set (hopefully)
> based on the codec used to decode the frame. Is there a similar field
> that tells you the precision value used by the codec?

As you said, s->quarter_sample. But that's not in the public API, and it
can't even be derived from the codec name since some codecs make qpel
optional.
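
A minimal sketch of the fixed-point conversion described above, assuming
you have obtained the quarter_sample setting by some means of your own.
mv_to_pixels() is a hypothetical helper, not a libavcodec function.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libavcodec: convert one fixed-point
 * motion vector component from motion_val[] into pixels.
 * quarter_sample != 0 means units of 1/4 pixel, otherwise 1/2 pixel. */
static double mv_to_pixels(int16_t mv, int quarter_sample)
{
    const int shift = quarter_sample ? 2 : 1;  /* log2(units per pixel) */
    return (double)mv / (double)(1 << shift);
}
```

With quarter_sample=0 a stored value of 1 is half a pixel; with
quarter_sample=1 the same stored value is a quarter of a pixel.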

>>> What worries me is that the addition of that '1' is conditional and I
>>> don't understand when not to add that '1'. For example
>>> ff_print_debug_info() uses the following formula:
>>> (s->mb_width << mv_sample_log2) + (s->codec_id == CODEC_ID_H264 ? 0 : 1)
>>
>> otoh, ffh264 uses some optimizations that require the addresses of
>> individual motion vectors to be aligned, and it doesn't benefit from
>> the dummy values.
>
> Is ffh264 the only codec that doesn't use padding? In other words, is
> the formula being used by ff_print_debug_info() general enough to handle
> most codecs?

yes
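
For illustration, the stride expression quoted above can be wrapped up
like this. Both function names are hypothetical, is_h264 stands in for
(s->codec_id == CODEC_ID_H264), and the row-major indexing in mv_index()
is my assumption about how the array is laid out, so check it against
ff_print_debug_info() before relying on it.

```c
/* Hypothetical helper mirroring the stride expression from
 * ff_print_debug_info(): vectors per row, plus one dummy entry
 * for every codec except ffh264. */
static int mv_stride(int mb_width, int mv_sample_log2, int is_h264)
{
    return (mb_width << mv_sample_log2) + (is_h264 ? 0 : 1);
}

/* Assumed row-major layout: index of the vector at column vx, row vy
 * of the motion vector grid. */
static int mv_index(int vx, int vy, int stride)
{
    return vx + vy * stride;
}
```

E.g. with mb_width=45 and mv_sample_log2=1, a padded codec has a stride
of 91 entries, while ffh264 would have 90.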

>>> # Finally, is there any way to populate the AVFrame.motion_val array
>>> with the motion vectors without having the motion vectors visualized on
>>> the decoded video frame? I'm setting AVCodecContext.debug_mv = 1 to
>>> obtain the motion vectors but as a side effect that flag also draws the
>>> motion vectors. It looks like ff_print_debug_info() always draws the
>>> motion vectors when the debug_mv flag is set. It seems wasteful to have
>>> to decode the video twice in order to obtain the motion vectors and the
>>> video frames without the visualization.
>>
>> AVFrame.motion_val is always populated, if the codec has motion vectors
>> at all. The motion vectors have to be stored somewhere, so there's no
>> penalty for exporting them.
>
> Great! Is there a field that gets set if the codec has no motion
> vectors? Also, I was hoping there was a way to tell which motion vectors
> are wrong or not used in the frame construction. For example, certain
> blocks in B-frames are not constructed using the previous frame. It
> would be nice to be able to determine that the motion vectors for such
> blocks are invalid.

AVFrame.motion_val should be NULL if it's not populated.
AVFrame.mb_type contains the block types, each of which is a bitmask of 
some of the MB_TYPE_* flags. So it's a bit complicated to determine which 
mvs are valid, but it should be codec independent.
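
A rough sketch of such a check. The EX_ constants are stand-ins whose
values I am assuming mirror the corresponding MB_TYPE_* flags in
avcodec.h, and the two helper functions are hypothetical, so verify both
against your headers. The idea is that a vector in motion_val[list] only
means something if the block's type says it is predicted from that list.

```c
#include <stdint.h>

/* Stand-in constants. The values are assumed to mirror the MB_TYPE_*
 * flags in avcodec.h and should be checked against your headers. */
enum {
    EX_MB_TYPE_INTRA4x4   = 0x0001,
    EX_MB_TYPE_INTRA16x16 = 0x0002,
    EX_MB_TYPE_INTRA_PCM  = 0x0004,
    EX_MB_TYPE_P0L0       = 0x1000, /* partition 0 predicted from list 0 */
    EX_MB_TYPE_P1L0       = 0x2000, /* partition 1 predicted from list 0 */
    EX_MB_TYPE_P0L1       = 0x4000, /* partition 0 predicted from list 1 */
    EX_MB_TYPE_P1L1       = 0x8000  /* partition 1 predicted from list 1 */
};

/* Nonzero if the block is intra coded, i.e. has no usable vectors. */
static int ex_is_intra(uint32_t mb_type)
{
    return (mb_type & (EX_MB_TYPE_INTRA4x4 |
                       EX_MB_TYPE_INTRA16x16 |
                       EX_MB_TYPE_INTRA_PCM)) != 0;
}

/* Nonzero if any partition of the block is predicted from the given
 * reference list (0 = forward, 1 = backward). */
static int ex_uses_list(uint32_t mb_type, int list)
{
    const uint32_t mask =
        (uint32_t)(EX_MB_TYPE_P0L0 | EX_MB_TYPE_P1L0) << (2 * list);
    return (mb_type & mask) != 0;
}
```

A block in a B-frame that is predicted only from the following frame
would then report ex_uses_list(type, 0) == 0, and its list 0 vectors
should be ignored.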

> Also, I'm working on an algorithm that tries to deblock highly compressed
> videos. In the paper I would like to compare the algorithm against
> existing state-of-the-art deblocking algorithms. I was wondering:
> - Have you done any quantitative testing of the deblocking performance of
>   x264 (say, by comparing the psnr of the compressed versus the deblocked
>   video)?
> - Have you compared the deblocking performance of x264 to other
>   algorithms?

Are you designing a postprocessor, or modifying the codec itself? I ask
because h264's deblocking algorithm only works as an in-loop filter, not
as a postprocessor. In particular, it assumes the reference frame was
already deblocked, so regions of the frame that are only motion
compensated without a residual don't need to be deblocked again. The
exception is a border between two motion blocks with different mvs, which
it still deblocks, but with less strength than places with a residual.

Designing a postprocessor is harder. Blocking artifacts that you dealt
with in one frame might still be present in the next, and might no longer
be aligned to block positions due to motion compensation.

Also note that in h264 there is no significant difference in psnr
between any given frame before and after deblocking (although there is a
difference in perceived artifacts). It only improves the
psnr-per-bitrate of subsequent frames that are predicted from the
deblocked frame.

--Loren Merritt