[FFmpeg-devel] [PATCH] DES en-/decryption

Michael Niedermayer michaelni
Sun Oct 14 23:40:43 CEST 2007


Hi

On Sun, Oct 14, 2007 at 02:05:00PM +0200, Reimar Döffinger wrote:
> Hello,
> On Sat, Oct 13, 2007 at 10:15:42PM +0200, Reimar Döffinger wrote:
> > some DES implementation is needed for asf decryption support.
> > This version, compared to the one contained in the previous asf patch,
> > should be quite a bit faster (not benchmarked though).
> > If you think this is not optimized well enough to belong in libavutil I
> > tend to suggest to leave it in libavformat unless someone else is
> > interested in improving it.
> > I am interested in hearing your comments either way.
> [...]
> 
> Newer version. I ignored your optimization suggestions for the round key
> stuff - I extracted it from the main loop, so if performance ever
> matters it will be trivial to put it into its own function or some
> similar optimization and it should be nearly irrelevant for performance
> then.
> I will add license header and multiple inclusion guards to the .h file
> if you insist though I find it really stupid for a single line (and I

I don't care about the license header or inclusion guards; Diego will add
them anyway, so there is no need to do that yourself :)

> have serious doubts any of this code will ever again be touched by
> someone).

well who knows ...


[...]
> +static uint64_t shuffle(uint64_t in, const uint8_t *shuffle, int shuffle_len) {
> +    int i;
> +    uint64_t res = 0;
> +    for (i = 0; i < shuffle_len; i++)
> +        res = (res << 1) | ((in >> *shuffle++) & 1);

res += res + ((in >> *shuffle++) & 1);


> +    return res;
> +}
> +
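
A small sketch of why the add form is a drop-in replacement: res + res
doubles res, which leaves bit 0 clear, so adding the extracted bit can never
carry and the result equals (res << 1) | bit. The names shuffle_or,
shuffle_add and demo_tab are made up for this example; demo_tab is not one of
the DES tables, it just reverses the low byte.

```c
#include <stdint.h>

/* shuffle() as quoted in the patch: shift-and-or form. */
static uint64_t shuffle_or(uint64_t in, const uint8_t *tab, int len) {
    uint64_t res = 0;
    for (int i = 0; i < len; i++)
        res = (res << 1) | ((in >> *tab++) & 1);
    return res;
}

/* Same loop with the add form: res + res clears bit 0, so adding the
 * extracted bit cannot carry into the higher bits. */
static uint64_t shuffle_add(uint64_t in, const uint8_t *tab, int len) {
    uint64_t res = 0;
    for (int i = 0; i < len; i++)
        res += res + ((in >> *tab++) & 1);
    return res;
}

/* Hypothetical 8-entry table (assumption, not a DES table):
 * output bit 7-i is input bit i, i.e. the low byte is bit-reversed. */
static const uint8_t demo_tab[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
```

Both shuffle_or(0xB4, demo_tab, 8) and shuffle_add(0xB4, demo_tab, 8) give
0x2D. Whether the add form is actually faster depends on the compiler.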

> +static uint64_t shuffle_inv(uint64_t in, const uint8_t *shuffle, int shuffle_len) {
> +    int i;
> +    uint64_t res = 0;
> +    shuffle += shuffle_len - 1;
> +    for (i = 0; i < shuffle_len; i++) {
> +        res |= (in & 1) << *shuffle--;
> +        in >>= 1;
> +    }
> +    return res;
> +}

Is this function bigger than the 64 bytes needed for an inverse IP table?
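
For reference, a sketch of how such an inverse table could be derived so that
plain shuffle() replaces shuffle_inv(). The permutation below is a made-up
stand-in, not the real IP_shuffle, and make_demo_perm, make_inverse and
inverse_table_matches are names invented for this example.

```c
#include <stdint.h>

/* shuffle() and shuffle_inv() as quoted in the patch. */
static uint64_t shuffle(uint64_t in, const uint8_t *tab, int len) {
    uint64_t res = 0;
    for (int i = 0; i < len; i++)
        res = (res << 1) | ((in >> *tab++) & 1);
    return res;
}

static uint64_t shuffle_inv(uint64_t in, const uint8_t *tab, int len) {
    uint64_t res = 0;
    tab += len - 1;
    for (int i = 0; i < len; i++) {
        res |= (in & 1) << *tab--;
        in >>= 1;
    }
    return res;
}

/* Hypothetical permutation of 0..63 (assumption, NOT the DES IP table):
 * 5 is coprime to 64, so k -> (5k + 3) mod 64 hits every index once. */
static void make_demo_perm(uint8_t tab[64]) {
    for (int k = 0; k < 64; k++)
        tab[k] = (uint8_t)((5 * k + 3) & 63);
}

/* shuffle_inv() moves input bit 63-k to output bit tab[k]; shuffle() puts
 * input bit inv[i] at output bit 63-i. Matching the two gives: */
static void make_inverse(const uint8_t tab[64], uint8_t inv[64]) {
    for (int k = 0; k < 64; k++)
        inv[63 - tab[k]] = (uint8_t)(63 - k);
}

/* Returns 1 if shuffle() with the inverse table agrees with shuffle_inv()
 * and undoes shuffle() on a run of pseudo-random inputs. */
static int inverse_table_matches(void) {
    uint8_t tab[64], inv[64];
    uint64_t x = 0x0123456789abcdefULL;
    make_demo_perm(tab);
    make_inverse(tab, inv);
    for (int n = 0; n < 100; n++) {
        if (shuffle(x, inv, 64) != shuffle_inv(x, tab, 64)) return 0;
        if (shuffle(shuffle(x, tab, 64), inv, 64) != x) return 0;
        x = x * 6364136223846793005ULL + 1442695040888963407ULL; /* LCG step */
    }
    return 1;
}
```

The trade-off is 64 bytes of table against the code size of shuffle_inv(),
which has to be measured on the target.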


> +
> +static uint32_t f_func(uint32_t r, uint64_t k) {
> +    int i;
> +    uint32_t out = 0;
> +    // rotate to get first part of E-shuffle in the lowest 6 bits
> +    r = (r << 1) | (r >> 31);
> +    // apply S-boxes, those compress the data again from 8 * 6 to 8 * 4 bits

> +    for (i = 7; i >= 0; i--) {
> +        uint8_t v;
> +        uint8_t tmp = (r ^ k) & 0x3f;
> +#ifdef CONFIG_SMALL
> +        v = S_boxes[i][tmp >> 1];
> +        if (tmp & 1) v >>= 4;
> +        else v &= 0x0f;
> +        out = (out >> 4) | (v << 28);

The v &= 0x0f is useless; the upper nibble is shifted out by the << 28 anyway.
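
A quick exhaustive check of that claim, as a sketch. The function names are
made up, and the (uint32_t) cast is an addition of this example so that the
shift stays well defined in C; the patch shifts the promoted int directly.

```c
#include <stdint.h>

/* One CONFIG_SMALL-style step with the mask kept, as in the quoted patch. */
static uint32_t step_masked(uint32_t out, uint8_t packed, int odd) {
    uint8_t v = packed;
    if (odd) v >>= 4;
    else     v &= 0x0f;
    return (out >> 4) | ((uint32_t)v << 28);
}

/* Same step without the mask: bits 4..7 of v land at bits 32..35 and
 * fall off the top of the 32-bit result. */
static uint32_t step_unmasked(uint32_t out, uint8_t packed, int odd) {
    uint8_t v = packed;
    if (odd) v >>= 4;
    return (out >> 4) | ((uint32_t)v << 28);
}

/* Returns 1 if both variants agree for every packed byte and both parities. */
static int mask_is_redundant(void) {
    static const uint32_t outs[3] = { 0u, 0xffffffffu, 0x12345678u };
    for (int o = 0; o < 3; o++)
        for (int packed = 0; packed < 256; packed++)
            for (int odd = 0; odd < 2; odd++)
                if (step_masked(outs[o], (uint8_t)packed, odd) !=
                    step_unmasked(outs[o], (uint8_t)packed, odd))
                    return 0;
    return 1;
}
```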


> +#else
> +        out |= S_boxes_P_shuffle[i][tmp];

Maybe
out |= P_shuffle2[ S_boxes2[i][tmp] ];

would be worth a try?
It would cut the needed table size by a factor of 2 at the expense of one
more table lookup.
(Yes, S_boxes2 would also have to contain i so that P_shuffle2 could
reshuffle the nibble to the correct position.)
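
A sketch of the two-level layout, to make the size arithmetic concrete. The
S-box values and the bit permutation below are toy stand-ins (assumptions,
not the DES tables), and placing the nibble of box i at bits 4*i..4*i+3 is
also an assumption about the combined table.

```c
#include <stdint.h>

/* Toy 4-bit S-box output (assumption, NOT a DES S-box). */
static uint8_t toy_sbox(int i, int tmp) {
    return (uint8_t)((tmp * 7 + i) & 15);
}

/* Toy 32-bit bit permutation (assumption, NOT the DES P permutation):
 * input bit b goes to output bit (5b + 1) mod 32. */
static uint32_t toy_pshuffle(uint32_t x) {
    uint32_t res = 0;
    for (int b = 0; b < 32; b++)
        res |= ((x >> b) & 1) << ((5 * b + 1) & 31);
    return res;
}

static uint32_t S_boxes_P_shuffle[8][64]; /* 8 * 64 * 4 = 2048 bytes */
static uint8_t  S_boxes2[8][64];          /* 8 * 64 * 1 =  512 bytes */
static uint32_t P_shuffle2[128];          /* 128 * 4    =  512 bytes */

static void build_tables(void) {
    for (int i = 0; i < 8; i++)
        for (int tmp = 0; tmp < 64; tmp++) {
            uint32_t nib = toy_sbox(i, tmp);
            S_boxes_P_shuffle[i][tmp] = toy_pshuffle(nib << (4 * i));
            /* box index in bits 4..6, S-box output in bits 0..3 */
            S_boxes2[i][tmp] = (uint8_t)((i << 4) | nib);
        }
    for (int idx = 0; idx < 128; idx++)
        P_shuffle2[idx] =
            toy_pshuffle((uint32_t)(idx & 15) << (4 * (idx >> 4)));
}

/* Returns 1 if the two-level lookup reproduces the combined table. */
static int two_level_matches(void) {
    build_tables();
    for (int i = 0; i < 8; i++)
        for (int tmp = 0; tmp < 64; tmp++)
            if (P_shuffle2[S_boxes2[i][tmp]] != S_boxes_P_shuffle[i][tmp])
                return 0;
    return 1;
}
```

With these element types the combined table takes 2048 bytes and the two
smaller tables 1024 bytes together; the extra dependent load may or may not
cost more than the cache footprint saves.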


> +#endif
> +        // get next 6 bits of E-shuffle and round key k into the lowest bits
> +        r = (r >> 4) | (r << 28);
> +        k >>= 6;
> +    }

The uint8_t v; declaration can be moved inside the CONFIG_SMALL branch.


[...]

> +    uint64_t CDn = shuffle(key, PC1_shuffle, sizeof(PC1_shuffle));
> +    // generate round keys
> +    for (i = 0; i < 16; i++) {
> +        CDn = key_shift_left(CDn);
> +        if (i > 1 && i != 8 && i != 15)

if((i&7) && i != 15)
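
The shortened condition is worth checking against the DES key schedule
(single shift in rounds 1, 2, 9 and 16, double shift otherwise) before
adopting it. A sketch of such a check follows; the function names are made
up, and the 0x7EFC bitmap variant is only a hypothetical alternative, not
something from the patch.

```c
#include <stdint.h>

/* DES left-shift count per round (0-based index i = round - 1). */
static const uint8_t des_shifts[16] =
    { 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1 };

/* Condition for the second shift as written in the patch. */
static int second_shift_patch(int i)  { return i > 1 && i != 8 && i != 15; }

/* Shortened condition using i & 7. */
static int second_shift_masked(int i) { return (i & 7) && i != 15; }

/* Hypothetical alternative: bit i of 0x7EFC is set for i = 2..7 and 9..14. */
static int second_shift_bitmap(int i) { return (0x7EFC >> i) & 1; }

/* Returns a bitmap with bit i set where cond disagrees with the schedule. */
static unsigned mismatches(int (*cond)(int)) {
    unsigned bad = 0;
    for (int i = 0; i < 16; i++)
        if (1 + cond(i) != des_shifts[i])
            bad |= 1u << i;
    return bad;
}
```

mismatches(second_shift_patch) and mismatches(second_shift_bitmap) return 0,
while mismatches(second_shift_masked) returns 2: (i & 7) is nonzero for
i == 1, so that round would get a double shift instead of a single one.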


> +            CDn = key_shift_left(CDn);
> +        K[i] = shuffle(CDn, PC2_shuffle, sizeof(PC2_shuffle));
> +    }
> +    // shuffle irrelevant to security but to ease hardware implementations
> +    in = shuffle(in, IP_shuffle, sizeof(IP_shuffle));
> +    for (i = 0; i < 16; i++) {
> +        uint32_t f_res;

> +        f_res = f_func(in, K[decrypt ? 15 - i : i]);

K[i ^ X]
with X = decrypt ? 15 : 0; computed outside the loop
should be faster and might even be smaller, or not
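
This works because 15 is all ones in four bits, so i ^ 15 == 15 - i for every
i in 0..15. A minimal check, with made-up function names:

```c
/* Key index as written in the patch. */
static int key_index_branch(int i, int decrypt) {
    return decrypt ? 15 - i : i;
}

/* Key index with the XOR form; x is computed once outside the round loop. */
static int key_index_xor(int i, int x) {
    return i ^ x;
}

/* Returns 1 if both forms select the same round key in all 16 rounds,
 * for encryption and decryption. */
static int key_indices_match(void) {
    for (int decrypt = 0; decrypt <= 1; decrypt++) {
        int x = decrypt ? 15 : 0;
        for (int i = 0; i < 16; i++)
            if (key_index_branch(i, decrypt) != key_index_xor(i, x))
                return 0;
    }
    return 1;
}
```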


> +        in = (in << 32) | (in >> 32);
> +        in ^= f_res;
> +    }

maybe

for (i = 0; i < 8; i++) {
    in ^= (uint64_t)f_func(in, K[decrypt ? 15 - 2*i : 2*i]) << 32;
    in ^= f_func(in >> 32, K[decrypt ? 14 - 2*i : 2*i + 1]);
}

would be faster/smaller? maybe not ...
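
A sketch comparing the swap-per-round loop with a two-rounds-per-pass loop.
Each pass covers two rounds, so it needs two consecutive round keys, and the
32-bit f_func result has to be widened to 64 bits before the << 32. toy_f is
a made-up stand-in for f_func (not the DES round function), and the other
names are invented for this example as well.

```c
#include <stdint.h>

/* Toy round function (assumption, NOT the DES f): like f_func in the patch
 * it takes the 32-bit half and a 64-bit round key. */
static uint32_t toy_f(uint32_t r, uint64_t k) {
    uint32_t x = r ^ (uint32_t)k ^ (uint32_t)(k >> 32);
    return x * 2654435761u + 0x9e3779b9u;
}

/* Round loop as in the patch: swap the halves every round. */
static uint64_t rounds_swap(uint64_t in, const uint64_t K[16], int decrypt) {
    for (int i = 0; i < 16; i++) {
        uint32_t f_res = toy_f((uint32_t)in, K[decrypt ? 15 - i : i]);
        in = (in << 32) | (in >> 32);
        in ^= f_res;
    }
    return in;
}

/* Two rounds per pass, no swaps: the halves alternate roles instead. */
static uint64_t rounds_unrolled(uint64_t in, const uint64_t K[16], int decrypt) {
    int x = decrypt ? 15 : 0;
    for (int i = 0; i < 8; i++) {
        in ^= (uint64_t)toy_f((uint32_t)in, K[(2 * i) ^ x]) << 32;
        in ^= toy_f((uint32_t)(in >> 32), K[(2 * i + 1) ^ x]);
    }
    return in;
}

/* Returns 1 if both loops agree for pseudo-random keys and inputs. */
static int rounds_match(void) {
    uint64_t K[16], s = 0x0123456789abcdefULL;
    for (int i = 0; i < 16; i++) {
        s = s * 6364136223846793005ULL + 1442695040888963407ULL; /* LCG step */
        K[i] = s;
    }
    for (int n = 0; n < 100; n++) {
        s = s * 6364136223846793005ULL + 1442695040888963407ULL;
        if (rounds_swap(s, K, 0) != rounds_unrolled(s, K, 0)) return 0;
        if (rounds_swap(s, K, 1) != rounds_unrolled(s, K, 1)) return 0;
    }
    return 1;
}
```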


> +    in = (in << 32) | (in >> 32);
> +    // reverse shuffle used to ease hardware implementations
> +    in = shuffle_inv(in, IP_shuffle, sizeof(IP_shuffle));

If the inverse table is split off and shuffle() is used, then the 32-bit swap
can be merged into the table.
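
A sketch of what merging the swap into the table could look like: swapping
the 32-bit halves moves bit b to bit b ^ 32, so each entry of the inverse
table just gets XORed with 32. The permutation is a made-up stand-in for
IP_shuffle, and the helper names are invented for this example.

```c
#include <stdint.h>

/* shuffle() and shuffle_inv() as quoted in the patch. */
static uint64_t shuffle(uint64_t in, const uint8_t *tab, int len) {
    uint64_t res = 0;
    for (int i = 0; i < len; i++)
        res = (res << 1) | ((in >> *tab++) & 1);
    return res;
}

static uint64_t shuffle_inv(uint64_t in, const uint8_t *tab, int len) {
    uint64_t res = 0;
    tab += len - 1;
    for (int i = 0; i < len; i++) {
        res |= (in & 1) << *tab--;
        in >>= 1;
    }
    return res;
}

/* Build t so that shuffle(x, t, 64) equals the swap followed by
 * shuffle_inv(..., tab, 64): the plain inverse entry is 63 - k, and
 * the ^ 32 folds the half swap into it. */
static void make_inverse_with_swap(const uint8_t tab[64], uint8_t t[64]) {
    for (int k = 0; k < 64; k++)
        t[63 - tab[k]] = (uint8_t)((63 - k) ^ 32);
}

/* Returns 1 if the merged table matches swap + shuffle_inv on a run of
 * pseudo-random inputs. */
static int merged_table_matches(void) {
    uint8_t tab[64], t[64];
    uint64_t x = 0xfedcba9876543210ULL;
    /* Hypothetical permutation of 0..63 (assumption, NOT the DES IP table). */
    for (int k = 0; k < 64; k++)
        tab[k] = (uint8_t)((5 * k + 3) & 63);
    make_inverse_with_swap(tab, t);
    for (int n = 0; n < 100; n++) {
        uint64_t swapped = (x << 32) | (x >> 32);
        if (shuffle(x, t, 64) != shuffle_inv(swapped, tab, 64)) return 0;
        x = x * 6364136223846793005ULL + 1442695040888963407ULL; /* LCG step */
    }
    return 1;
}
```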


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I hate to see young programmers poisoned by the kind of thinking
Ulrich Drepper puts forward since it is simply too narrow -- Roman Shaposhnik