[FFmpeg-devel] [PATCH] h264: integrate clear_blocks calls with IDCT.

Michael Niedermayer michaelni at gmx.at
Fri Feb 8 18:53:54 CET 2013


On Fri, Feb 08, 2013 at 09:09:41AM -0800, Ronald S. Bultje wrote:
> Hi,
> 
> On Fri, Feb 8, 2013 at 4:51 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Thu, Feb 07, 2013 at 10:20:52PM -0800, Ronald S. Bultje wrote:
> >> From: "Ronald S. Bultje" <rsbultje at gmail.com>
> >>
> >> In case of no-transform, integrate it with put_pixels4/8(). In case
> >> of intra PCM, do an explicit memset(0) call. Together, this makes
> >> the H264 decoder almost-independent of dsputil.
> >>
> >> (PPC and Arm assembly not yet ported.)
> >
> > breaks fate-h264-lossless
> 
> Will check, I thought it worked locally.
> 
> >> diff --git a/libavcodec/h264_mb_template.c b/libavcodec/h264_mb_template.c
> >> index a841555..4fe2030 100644
> >> --- a/libavcodec/h264_mb_template.c
> >> +++ b/libavcodec/h264_mb_template.c
> >> @@ -133,6 +133,7 @@ static av_noinline void FUNC(hl_decode_mb)(H264Context *h)
> >>                      }
> >>                  }
> >>              }
> >> +            memset(h->mb, 0, ff_h264_mb_sizes[h->sps.chroma_format_idc] * 2);
> >>          } else {
> >>              for (i = 0; i < 16; i++)
> >>                  memcpy(dest_y + i * linesize, (uint8_t *)h->mb + i * 16, 16);
> >> @@ -151,6 +152,7 @@ static av_noinline void FUNC(hl_decode_mb)(H264Context *h)
> >>                      }
> >>                  }
> >>              }
> >> +            memset(h->mb, 0, ff_h264_mb_sizes[h->sps.chroma_format_idc]);
> >>          }
> >>      } else {
> >>          if (IS_INTRA(mb_type)) {
> >
> > why should these (and other cases) use memset instead of (the faster)
> > clear_block* ?
> 
> The first is init code, so speed is irrelevant.
> 
> The second remaining use of memset(0) is intra PCM. Two comments on
> that. First, from h264_cabac.c:
>         // We assume these blocks are very rare so we do not optimize it.
>         // FIXME The two following lines get the bitstream position in the cabac
>         // decode, I think it should be done by a function in cabac.h
> (or cabac.c).
> I like that concept if it removes dsputil dependencies.
> 
> Then second:
>         // The pixels are stored in the same order as levels in h->mb array.
>         if ((int) (h->cabac.bytestream_end - ptr) < mb_size)
>             return -1;
>         memcpy(h->mb, ptr, mb_size); ptr+=mb_size;
> There is no reason to memcpy here, the source data remains valid. We
> can just store the pointer and read from there directly. If we do
> that, we don't need to clear h->mb for intraPCM anymore either. That
> was my longer-term plan, I can do that in the next patch iteration.

i dont object to replacing clear_block* where it has no meassureable
speed effect
but iam not sure if making only specific dsp util contexts but not
a misc (what doesnt fit in the others) context available to
h264 is really a good idea.
It might work out now but the first function that one wants to
optimize that doesnt fit in the existing specific dsp contexts would
need to be put somewhere else.
And if a misc context is available, clear_block probably would be
in it and could then still be used instead of memset even if the
speed difference is negligible

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In fact, the RIAA has been known to suggest that students drop out
of college or go to community college in order to be able to afford
settlements. -- The RIAA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20130208/5b114f0b/attachment.asc>


More information about the ffmpeg-devel mailing list