[Ffmpeg-devel] Re: [PATCH] Tiertex .SEQ files support

Mon Sep 25 09:48:35 CEST 2006

Hi

On Sun, Sep 24, 2006 at 10:59:10PM +0200, Gregory Montoir wrote:
[...]
> >[...]
> >>+static unsigned char *seq_decode_op2(SeqVideoContext *seq, unsigned char 
> >>*src, unsigned char *dst)
> >>+{
> >>+    int b, i, len;
> >>+    GetBitContext gb;
> >>+
> >>+    len = *src++;
> >>+    if (len & 0x80) {
> >>+        switch (len & 15) {
> >>+        case 1:
> >>+            src = seq_unpack_rle_frame(src, seq->unpack_buffer, 
> >>sizeof(seq->unpack_buffer));
> >>+            for (b = 0; b < 8; ++b) {
> >>+                memcpy(dst, &seq->unpack_buffer[b * 8], 8);
> >>+                dst += seq->frame.linesize[0];
> >>+            }
> >>+            break;
> >>+        case 2:
> >>+            src = seq_unpack_rle_frame(src, seq->unpack_buffer, 
> >>sizeof(seq->unpack_buffer));
> >>+            for (i = 0; i < 8; ++i) {
> >>+                for (b = 0; b < 8; ++b)
> >>+                    dst[b * seq->frame.linesize[0]] = 
> >>seq->unpack_buffer[i * 8 + b];
> >>+                ++dst;
> >>+            }
> >>+            break;
> >>+        }
> >
> >i would move the call to seq_unpack_rle_frame before the switch so that 
> >there
> >just a single call, that way the cost for inlinng the function would be
> >smaller for the compiler (if it does inline it ...)
> 
> If you move that call before the switch, you should also handle the case
> where (len & 3) == 0 (this probably never happens, I need to check the
> sample files, but I would prefer being "compatible" with the original
> playback code).

if((len&15) == 1 || (len&15) == 2)
    src = seq_unpack_rle_frame(src, seq->unpack_buffer, sizeof(seq->unpack_buffer));

> 
> 
> >>+    } else {
> >>+        unsigned char *color_table = src;
> >>+        src += len;
> >>+        if (len > 8) {
> >>+            init_get_bits(&gb, src, 32 * 8); src += 32;
> >>+            for (b = 0; b < 8; ++b) {
> >>+                for (i = 0; i < 8; ++i)
> >>+                    dst[i] = color_table[get_bits(&gb, 4)];
> >>+                dst += seq->frame.linesize[0];
> >>+            }
> >>+        } else if (len > 4) {
> >>+            init_get_bits(&gb, src, 24 * 8); src += 24;
> >>+            for (b = 0; b < 8; ++b) {
> >>+                for (i = 0; i < 8; ++i)
> >>+                    dst[i] = color_table[get_bits(&gb, 3)];
> >>+                dst += seq->frame.linesize[0];
> >>+            }
> >>+        } else if (len > 2) {
> >>+            init_get_bits(&gb, src, 16 * 8); src += 16;
> >>+            for (b = 0; b < 8; ++b) {
> >>+                for (i = 0; i < 8; ++i)
> >>+                    dst[i] = color_table[get_bits(&gb, 2)];
> >>+                dst += seq->frame.linesize[0];
> >>+            }
> >>+        } else {
> >>+            init_get_bits(&gb, src, 8 * 8); src += 8;
> >>+            for (b = 0; b < 8; ++b) {
> >>+                for (i = 0; i < 8; ++i)
> >>+                    dst[i] = color_table[get_bits(&gb, 1)];
> >>+                dst += seq->frame.linesize[0];
> >>+            }
> >>+        }
> >>+    }
> >
> >why not: (assuming its not slow)
> >
> >if(len > 8) bits= 4;
> >else        bits= foobar_table[len]; (or some av_log2 based stuff)
> >(init_get_bits(&gb, src, 8*8*bits); src += 8*bits)
> >for (b = 0; b < 8; ++b) {
> >    for (i = 0; i < 8; ++i)
> >        dst[i] = color_table[get_bits(&gb, bits)];
> >    dst += seq->frame.linesize[0];
> >}
> 
> Looks nice, I rewrote that part using a custom log2 table (I haven't
> been able to re-use the existing av_log2 table, as I need to "round up"
> the result rather that rounding down).

what about av_log2(a-1)+1

> 
> 
[...]
> >>+        seq_fill_buffer(seq, seq->current_frame_data[5],
> >>+          seq->current_frame_data + offset,
> >>+          end_offset - offset);
> >>+    }
> >>+    offset = LE_16(&seq->current_frame_data[10]);
> >>+    if (offset != 0) {
> >>+        end_offset = LE_16(&seq->current_frame_data[12]);
> >>+        if (end_offset == 0)
> >>+            end_offset = LE_16(&seq->current_frame_data[14]);
> >>+        seq_fill_buffer(seq, seq->current_frame_data[6],
> >>+          seq->current_frame_data + offset,
> >>+          end_offset - offset);
> >>+    }
> >>+    offset = LE_16(&seq->current_frame_data[12]);
> >>+    if (offset != 0) {
> >>+        end_offset = LE_16(&seq->current_frame_data[14]);
> >>+        seq_fill_buffer(seq, seq->current_frame_data[7],
> >>+          seq->current_frame_data + offset,
> >>+          end_offset - offset);
> >>+    }
> >
> >for(i=0; i<3; i++)
> >    buf_num[i]= get_byte();
> >for(i=0; i<4; i++)
> >    offset[i]= get_le16();
> >end_offset[3]= 0;
> >for(i=2; i>=0; i--)
> >    end_offset[i]= offset[i+1] ? offset[i+1] : end_offset[i+1]
> >
> >for(i=0; i<3; i++){
> >    if(offset[i])
> >        seq_fill_buffer(seq, buf_num[i],
> >            seq->current_frame_data + offset[i],
> >            end_offset[i] - offset[i]);
> >}
> >
> >dont you think that something like the above code is more readable? also 
> >there are no url_fseek() needed ...
> 
> Indeed, the for loop makes the code more friendly.
> 
> But still, I don't see how you can use the get_* functions on
> ByteIOContext *and not* using url_fseek later on... Looking at your
> pseudo code, seq_fill_buffer takes an offset as argument. So if you want
> to copy data from a given offset, without having read the whole 6kb
> buffer before, I will have to "seek". Same goes for the sound and
> palette data.
> 
> Am I missing something ?

the few url_fseek() will likely be faster then the memcpy(), benchmarks
are of course welcome ...

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is