[FFmpeg-devel] Understanding motion vectors from ff_print_debug_info()

Sun Jul 29 11:37:00 CEST 2007

Loren Merritt wrote:
> On Sat, 28 Jul 2007, Pravin Bhat wrote:
>
>   
>> # It looks like you have to divide by 2 (or a higher power of 2) the
>> values in the motion_val array
>> to obtain the true motion vectors. Why is this so?
>>
>> For example ff_print_debug_info() processes the motion_val vectors in
>> the following manner:
>> motion_val[direction][xy][0]>>shift
>> where shift = 1 + s->quarter_sample.
>> The 'quarter_sample' variable makes sense. It's accounting for some sort
>> of mode where the video is
>> sub-sampled by 2 in the X and Y axis. But I don't understand why the '1'
>> is added to 'shift' in effect
>> dividing all motion vectors by 2.
>>     
>
> Sub-sampling is where you delete some of the samples, leaving lower 
> resolution. That is not happening here.
> Motion vectors are specified to fractional-sample precision, e.g. a vector 
> of (0.5, 0) means translate the video by half a pixel (interpolating 
> between samples in the reference frame).
> quarter_sample=1 means the precision is 0.25, and the alternative is 0.5.
> The numbers in motion_val[] are fixed-point representations of those 
> fractional vectors.
>   
Great. That explains a lot!
How does one obtain the precision of the motion vectors for a given 
codec? For example, AVFrame.motion_subsample_log2 is set (hopefully) 
based on the codec used to decode the frame. Is there a similar field 
that tells you the precision value used by the codec?
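
To check my understanding of the fixed-point representation, here is a 
minimal sketch of the conversion as I now read it. The helper name 
mv_component_to_pixels() is my own and is not part of libavcodec; the 
only thing taken from the source is shift = 1 + quarter_sample. Note that 
ff_print_debug_info() uses an integer right shift, which drops the 
fractional part, whereas this sketch keeps it.

```c
#include <assert.h>

/* Hypothetical helper (not a libavcodec function): convert one component
 * of a fixed-point motion_val entry into a displacement in pixels.
 * quarter_sample == 0 -> units of 1/2 pixel
 * quarter_sample == 1 -> units of 1/4 pixel */
static double mv_component_to_pixels(int mv, int quarter_sample)
{
    assert(quarter_sample == 0 || quarter_sample == 1);

    /* Same shift that ff_print_debug_info() applies. */
    int shift = 1 + quarter_sample;

    /* Divide instead of shifting so the fractional part survives. */
    return (double)mv / (double)(1 << shift);
}
```

With this reading, a stored value of 1 means 0.5 pixel when 
quarter_sample is 0 and 0.25 pixel when it is 1.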

>   
>> # For computing the 'mv_stride' variable avcodec.h suggests the
>> following formula:
>> mv_stride= (mb_width << mv_sample_log2) + 1
>> I don't understand why that '1' is added to mv_stride.
>>     
>
> Some internal code is simplified if we put a ring of dummy values around 
> the edge of the array, such that the left neighbor of the leftmost 
> macroblock lands in the dummy values rather than landing on the right edge 
> of the frame.
>
>   
>> What worries me is that the addition of that '1' is conditional and I
>> don't understand when not to add that '1'. For example
>> ff_print_debug_info() uses the following formula:
>> (s->mb_width << mv_sample_log2) + (s->codec_id == CODEC_ID_H264 ? 0 : 1)
>>     
>
> otoh, ffh264 uses some optimizations that require the addresses of 
> individual motion vectors to be aligned, and it doesn't benefit from 
> the dummy values.
>   

Is ffh264 the only codec that doesn't use padding? In other words, is 
the formula being used by ff_print_debug_info() general enough to handle 
most codecs?
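
For reference, this is how I would write the stride computation 
in my own code, mirroring the expression in ff_print_debug_info(). It is 
only a sketch: mv_stride_for() and mv_index() are hypothetical names, the 
is_h264 flag stands in for the s->codec_id == CODEC_ID_H264 test, and the 
row-major indexing is my assumption about how the array is laid out.

```c
#include <assert.h>

/* Hypothetical helper: number of motion_val entries per row.
 * Non-H.264 decoders get one extra padding entry per row; ffh264 does not. */
static int mv_stride_for(int mb_width, int mv_sample_log2, int is_h264)
{
    assert(mb_width > 0 && mv_sample_log2 >= 0);
    return (mb_width << mv_sample_log2) + (is_h264 ? 0 : 1);
}

/* Hypothetical helper: index of the vector at block column x, row y,
 * assuming a row-major layout with the stride computed above. */
static int mv_index(int x, int y, int mv_stride)
{
    return x + y * mv_stride;
}
```

For a 720-pixel-wide frame (45 macroblocks) with mv_sample_log2 = 1, this 
gives a stride of 91 for the padded case and 90 for H.264.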

>   
>> # Finally, is there any way to populate the AVFrame.motion_val array
>> with the motion vectors without having the motion vectors visualized on
>> the decoded video frame? I'm setting AVCodecContext.debug_mv = 1 to
>> obtain the motion vectors but as a side effect that flag also draws the
>> motion vectors. It looks like ff_print_debug_info() always draws the
>> motion vectors when the debug_mv flag is set. It seems wasteful to have
>> to decode the video twice in order to obtain the motion vectors and the
>> video frames without the visualization.
>>     
>
> AVFrame.motion_val is always populated, if the codec has motion vectors 
> at all. The motion vectors have to be stored somewhere, so there's no 
> penalty for exporting them.
>   

Great! Is there a field that gets set if the codec has no motion 
vectors? Also, I was hoping there was a way to tell which motion vectors 
are wrong or not used in the frame construction. For example, certain 
blocks in B-frames are not constructed using the previous frame. It 
would be nice to be able to determine that the motion vectors for such 
blocks are invalid.

Thanks Loren. Your reply really helped.
-pro