[FFmpeg-devel] [PATCH 01/N] RV30/40 Decoder - Core

Michael Niedermayer michaelni
Sun Dec 9 14:47:36 CET 2007


On Sun, Dec 09, 2007 at 09:46:22AM +0200, Kostya wrote:
> Just some comments on implementation for now.
> 
> On Sun, Dec 09, 2007 at 12:00:04AM +0100, Michael Niedermayer wrote:
> > On Sat, Dec 08, 2007 at 12:10:17PM +0200, Kostya wrote:
> > > On Sat, Dec 08, 2007 at 05:32:07AM +0100, Michael Niedermayer wrote:
> > > > On Fri, Dec 07, 2007 at 09:04:07AM +0200, Kostya wrote:
> > > [...]
> > > > > > please check if the following are identical (+-1 doesnt count as
> > > > > > different)
> > > > > > 
> > > > > > svq3_get_ue_golomb() and 
> > > > > > svq3_get_se_golomb()
> > > > > > 
> > > > > > and if they are identical post START/STOP_TIMER scores
> > > > > > also id like to see START/STOP_TIMER scores in that case for the
> > > > > > original vlc based code
> > > > > > 
> > > > > > the fastest should be placed in golomb.c/h
> > > > > > and used
> > > > > 
> > > > > I don't know when I can do benchmarks but switched to svq3_get_[su]e_golomb()
> > > > > anyway.
> > > > 
> > > > ok, just dont forget about the benchmarks please
> > > 
> > > Hmm, I don't have large and long enough clip to show timer values.
> > 
> > no? it shows nothing? just a blank screen? or all have the same value? but
> > then you can show them ...
> > 
> > you can also put the START/STOP_TIMER over a block or mb decode function
> > or over the for loop in svq3.c after
> > /* decode prediction codes for luma blocks */
> > that contains always 8 svq3_get_ue_golomb()
> > 
> > 
> > > I guess it is fast enough. For the reference, rv34_inv_transform() takes 50 dezicycles.
> > 
> > its not the question if its fast enough i want to know which is the fastest
> 
> RV30 uses 8 of such codes for intra types and 2 for motion vectors.
> I tested it with rv34_decode_mb_header() on VC-RV30/simpsons-clip.rm
> Current implementation
> 1111 dezicycles in rv34_decode_mb_header, 1048018 runs, 558 skips
> My implementation
> 1126 dezicycles in rv34_decode_mb_header, 1047952 runs, 624 skips

ok


[...]
> [...]
> > 
> > 
> > [...]
> > >     if(!r->avail[0])
> > >         has_A = 0;
> > >     else{
> > >         int A_type = r->mb_type[mb_pos - 1];
> > >         has_A = A_type == RV34_MB_B_BIDIR || A_type == (dir ? RV34_MB_B_BACKWARD : RV34_MB_B_FORWARD);
> > > 
> > >         if(has_A){
> > >             A[0] = s->current_picture_ptr->motion_val[dir][mv_pos - 1][0];
> > >             A[1] = s->current_picture_ptr->motion_val[dir][mv_pos - 1][1];
> > >         }
> > >     }
> > >     if(!r->avail[1]){
> > >         has_B = 0;
> > >     }else{
> > >         int B_type = r->mb_type[mb_pos - s->mb_stride];
> > >         has_B = B_type == RV34_MB_B_BIDIR || B_type == (dir ? RV34_MB_B_BACKWARD : RV34_MB_B_FORWARD);
> > > 
> > >         if(has_B){
> > >             B[0] = s->current_picture_ptr->motion_val[dir][mv_pos - s->b8_stride][0];
> > >             B[1] = s->current_picture_ptr->motion_val[dir][mv_pos - s->b8_stride][1];
> > >         }
> > >     }
> > >     if(r->avail[2]){
> > >         int C_type = r->mb_type[mb_pos - s->mb_stride + 1];
> > >         has_C = C_type == RV34_MB_B_BIDIR || C_type == (dir ? RV34_MB_B_BACKWARD : RV34_MB_B_FORWARD);
> > > 
> > >         if(has_C){
> > >             C[0] = s->current_picture_ptr->motion_val[dir][mv_pos - s->b8_stride + 1][0];
> > >             C[1] = s->current_picture_ptr->motion_val[dir][mv_pos - s->b8_stride + 1][1];
> > >         }
> > >     }else if(r->avail[3]){
> > >         int C_type = r->mb_type[mb_pos - s->mb_stride - 1];
> > >         has_C = C_type == RV34_MB_B_BIDIR || C_type == (dir ? RV34_MB_B_BACKWARD : RV34_MB_B_FORWARD);
> > > 
> > >         if(has_C){
> > >             C[0] = s->current_picture_ptr->motion_val[dir][mv_pos - s->b8_stride - 1][0];
> > >             C[1] = s->current_picture_ptr->motion_val[dir][mv_pos - s->b8_stride - 1][1];
> > >         }
> > >     }else
> > >         has_C = 0;
> > 
> > the code above is quadrupled
> > also with the normal lavc mb_types the direction would be just a bit and
> > wouldnt need long and obfuscated checks
> 
> You mean something block_type & MB_TYPE_L0 ? 

yes


>  
> > [...]
> > > /**
> > >  * generic motion compensation function
> > >  *
> > >  * @param r decoder context
> > >  * @param block_type type of the current block
> > >  * @param xoff horizontal offset from the start of the current block
> > >  * @param yoff vertical offset from the start of the current block
> > >  * @param mv_off offset to the motion vector information
> > >  * @param width width of the current partition in 8x8 blocks
> > >  * @param height height of the current partition in 8x8 blocks
> > >  */
> > > static inline void rv34_mc(RV34DecContext *r, const int block_type,
> > >                           const int xoff, const int yoff, int mv_off,
> > >                           const int width, const int height, int dir,
> > >                           qpel_mc_func (*qpel_mc)[16],
> > >                           h264_chroma_mc_func (*chroma_mc))
> > > {
> [skipped body]
> > > }
> > 
> > how does this differ from the equivalent functions in h264.c?
> > that is mc_dir_part() IIRC
> 
> It will also handle 1/3pel for RV30

and svq3_mc_dir_part() does 1/3 pel ...


> 
> > [...]
> > >         for(j = 0; j < 2; j++){
> > >             for(i = 0; i < 2; i++){
> > >                 s->current_picture_ptr->motion_val[0][s->mb_x * 2 + s->mb_y * 2 * s->b8_stride + i + j*s->b8_stride][0] = 0;
> > >                 s->current_picture_ptr->motion_val[0][s->mb_x * 2 + s->mb_y * 2 * s->b8_stride + i + j*s->b8_stride][1] = 0;
> > >             }
> > >         }
> > 
> > fill_rectangle() should be used here, yes please
> > move fill_rectangle() into a file from where it can be used, its a very
> > usefull function
> 
> I guess it should be moved to h264.h. I will try to use it where appropriate.

i dont think it should be moved to h264.h its not h264 specific


> 
> > >         return 0;
> > 
> > >     case RV34_MB_SKIP:
> > >         if(s->pict_type == P_TYPE){
> > >             rv34_pred_mv(r, block_type, 0);
> > 
> > makes no sense, the predictor is always 0,0
> 
> Still, motion vectors need zeroing. Will replace with fill_rectangle() 

ok

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us
and endeavor to excel them, by avoiding what is faulty, and imitating what
is excellent in them. -- Plutarch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20071209/fbded2ff/attachment.pgp>



More information about the ffmpeg-devel mailing list