[FFmpeg-devel] [PATCH 2/5] fate: avoid framemd5, use framecrc its faster

Fri May 17 20:49:19 CEST 2013

2013/5/17 Reimar Döffinger <Reimar.Doeffinger at gmx.de>:
> On Fri, May 17, 2013 at 07:58:54PM +0200, Giorgio Vazzana wrote:
>> diff --git a/libavutil/md5.c b/libavutil/md5.c
>> index f8f08f1..a212150 100644
>> --- a/libavutil/md5.c
>> +++ b/libavutil/md5.c
>> @@ -146,8 +146,12 @@ void av_md5_update(AVMD5 *ctx, const uint8_t
>> *src, const int len)
>>      j = ctx->len & 63;
>>      ctx->len += len;
>>
>> -    for (i = 0; i < len; i++) {
>> -        ctx->block[j++] = src[i];
>> +    i = 0;
>> +    while (i < len) {
>> +        int l = (len-i <= 64-j) ? len-i : 64-j;
>
> FFMIN(len - i, 64 - j);

Of course.

>> +        memcpy(ctx->block + j, src + i, l);
>> +        i += l;
>> +        j += l;
>
> Personally I think it would make much more sense
> to change the code to copy at most 63 bytes (i.e.
> if there were bytes left over from last time)
> and use the rest directly from src.

I thought about this this morning, in fact I even tried a quick&dirty
test to comment out the memcpy() like this:

diff --git a/libavutil/md5.c b/libavutil/md5.c
index f8f08f1..95e506a 100644
--- a/libavutil/md5.c
+++ b/libavutil/md5.c
@@ -146,10 +146,14 @@ void av_md5_update(AVMD5 *ctx, const uint8_t
*src, const int len)
     j = ctx->len & 63;
     ctx->len += len;

-    for (i = 0; i < len; i++) {
-        ctx->block[j++] = src[i];
+    i = 0;
+    while (i < len) {
+        int l = (len-i <= 64-j) ? len-i : 64-j;
+//        memcpy(ctx->block + j, src + i, l);
+        i += l;
+        j += l;
         if (j == 64) {
-            body(ctx->ABCD, (uint32_t *) ctx->block);
+            body(ctx->ABCD, (uint32_t *) (src+i));
             j = 0;
         }
     }

of course the sum will be wrong now, but what's interesting is that I
still got 1.85s, so memcpy or no memcpy the difference was indeed very
small at least on this machine (old Sempron 64bit).
Anyway, I'm glad I was able to help spot the problem.

> In "body()" you'd then have to use AV_RL32.
> That might cost performance on architectures
> that do not support unaligned reads, but they are
> so rare, and I doubt even there the cost will be
> much higher than for the memcpy.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel