[FFmpeg-devel] [PATCH] mxf umid generation

Baptiste Coudurier baptiste.coudurier
Wed Mar 11 07:15:49 CET 2009


Michael Niedermayer wrote:
> On Sun, Mar 08, 2009 at 03:00:18PM -0700, Baptiste Coudurier wrote:
>> On 3/8/2009 2:08 PM, Michael Niedermayer wrote:
>>> On Sun, Mar 08, 2009 at 01:05:55PM -0700, Baptiste Coudurier wrote:
>>>> On 3/8/2009 7:10 AM, Michael Niedermayer wrote:
>>>>> On Sat, Mar 07, 2009 at 08:53:03PM -0800, Baptiste Coudurier wrote:
>>>>>> On 3/7/2009 8:40 PM, Michael Niedermayer wrote:
>>>>>>> On Sat, Mar 07, 2009 at 08:05:41PM -0800, Baptiste Coudurier wrote:
>>>>>>>> On 3/7/2009 7:52 PM, Michael Niedermayer wrote:
>>>>>>>>> On Sat, Mar 07, 2009 at 07:25:54PM -0800, Baptiste Coudurier wrote:
>>>>>>>>>> On 3/7/2009 7:16 PM, Michael Niedermayer wrote:
>>>>>>>>>>> On Sat, Mar 07, 2009 at 06:31:53PM -0800, Baptiste Coudurier wrote:
>>>>>>>>>>>> On 3/7/2009 5:23 PM, Michael Niedermayer wrote:
>>>>>>>>>>>>> On Sat, Mar 07, 2009 at 04:14:19PM -0800, Baptiste Coudurier wrote:
>>>>>>>>>>>>>> On 3/7/2009 3:36 PM, Michael Niedermayer wrote:
>>>>>>>>>>>>>>> On Sat, Mar 07, 2009 at 02:48:49PM -0800, Baptiste Coudurier wrote:
>>>>>>>>>>>>>>>> On 3/6/2009 7:44 PM, Michael Niedermayer wrote:
>>>>>>>>>>>>>>>>> On Fri, Mar 06, 2009 at 07:28:55PM -0800, Baptiste Coudurier wrote:
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>> Property changes on: libavutil\random_seed.c
>>>>>>>>>>>>>>>> ___________________________________________________________________
>>>>>>>>>>>>>>>> Added: svn:eol-style
>>>>>>>>>>>>>>>>    + LF
>>>>>>>>>>>>>>> intended?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> except these ok
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure. I'm on Windows because the products I test with only work
>>>>>>>>>>>>>> on Windows. I set the line-ending style to Unix, but it keeps adding this...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It should be ok I think.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Btw, is lfg ok for this purpose or should I use something else ?
>>>>>>>>>>>>> to generate these id numbers out of the seed?
>>>>>>>>>>>>> I'd rather use
>>>>>>>>>>>>> seed += 1LL<<32 :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> lfg seems like pointless complexity for this ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> and patch ok
>>>>>>>>>>>>>
>>>>>>>>>>>> Well I'd like to use the defined methods for umid generation, see below:
>>>>>>>>>>>>
>>>>>>>>>>>> "A.3.2 Alternative masking methods
>>>>>>>>>>>> The masked material number is an unpredictable number uniformly
>>>>>>>>>>>> distributed over the range 0 thru 2^128-1. Its
>>>>>>>>>>>> effectiveness as a unique identifier relies on this uniform random
>>>>>>>>>>>> distribution, and the exact method of its generation is not
>>>>>>>>>>>> important. Therefore, the use of the reference masking method is not
>>>>>>>>>>>> normative, and any method providing an equivalent
>>>>>>>>>>>> level of unpredictability and uniformity of distribution may be used
>>>>>>>>>>>> with the 'masked method' value in the 'number
>>>>>>>>>>>> generation method' field of the UMID universal label (reference table 1
>>>>>>>>>>>> in 5.1.1)."
>>>>>>>>>>>>
>>>>>>>>>>>> And instance generation:
>>>>>>>>>>>>
>>>>>>>>>>>> "B.2 24-bit PRS generator ('2h')
>>>>>>>>>>>> Any suitable pseudo-random sequence (PRS) generator polynomial may be
>>>>>>>>>>>> used provided it has a maximal length of
>>>>>>>>>>>> 16,777,215 clock cycles. At the point of creating a new instance of the
>>>>>>>>>>>> material, the 24-bits from the PRS generator are
>>>>>>>>>>>> sampled to gain a new instance value.
>>>>>>>>>>>> PRS generators shall not allow a zero value.
>>>>>>>>>>> am i right in assuming that this "definition" is a 24bit LFSR?
>>>>>>>>>>> if so, this is neither uniform over 2^128 nor unpredictable.
>>>>>>>>>>> actually, it's trivial to generate all future and past values
>>>>>>>>>>> from just 2 24bit values even if the used polynomial is not known.
>>>>>>>>>>>
>>>>>>>>>>> also if my interpretation of this "definition" is correct you can
>>>>>>>>>>> expect 1 collision in ~4000 ids
>>>>>>>>>> Well, "instance number" is 3 bytes and umid is 16 bytes, these are
>>>>>>>>>> different numbers, this is what the code is trying to achieve, see the
>>>>>>>>>> patch.
>>>>>>>>>>
>>>>>>>>>>>> NOTES
>>>>>>>>>>>> 1 Any suitable seed may be used to start the pseudo-random sequence
>>>>>>>>>>>> (PRS) 24-bit generator.
>>>>>>>>>>>> 2 The PRS generator should use a free-running clock having no time
>>>>>>>>>>>> relationship with the clock used to generate the sampling strobe.
>>>>>>>>>>>> 3 The PRS generator clock frequency should be greater than 10 kHz.
>>>>>>>>>>>> 4 The number of feedback taps resulting from the PRS generator
>>>>>>>>>>>> polynomial should be between 8 and 16 to ensure the random nature
>>>>>>>>>>>> of the sequence."
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think ?
>>>>>>>>>>> sounds like the spec is written by some really incompetent people.
>>>>>>>>>> Is it still true now you know that these numbers are different ?
>>>>>>>>> a design that cannot be implemented in ANSI C or, for that matter,
>>>>>>>>> in any deterministic language is broken
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Is the method ok at least for the "instance number" ?
>>>>>>>>>>
>>>>>>>>>>> [...]
>>>>>>>>>>>
>>>>>>>>>>> also you should set any bits left after the seed+counter to a random constant
>>>>>>>>>>>
>>>>>>>>>>> and if you have a 32bit seed you have 32 bits of randomness, and a PRNG making
>>>>>>>>>>> 128 bits out of that still has just 32 bits of randomness; you could set 96 bits to
>>>>>>>>>>> your pet's name, it won't make a difference.
>>>>>>>>>>>
>>>>>>>>>> So there is no way we could be able to generate the 128 bits umid
>>>>>>>>>> according to the method ? Can I use the md5 of the 4 bytes of the seed ?
>>>>>>>>> you can, but it will have a collision in ~64k umids; that's the same as if you
>>>>>>>>> just take the 32bit seed + some constant like your name.
>>>>>>>>> also it violates the spec because it is neither unpredictable nor uniform.
>>>>>>>>>
>>>>>>>>> If you want to follow the spec you need 128 strong random bits per umid.
>>>>>>>>> a md5 of 32 LSB from the timer does not qualify ...
>>>>>>>>>
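The "~4000 ids" and "~64k umids" figures quoted above match the usual birthday-bound rule of thumb: with n uniformly random bits, a collision becomes likely after roughly 2^(n/2) draws. A minimal sketch of that arithmetic follows; the function name is hypothetical and it returns only the order-of-magnitude estimate, not an exact probability.

```c
#include <assert.h>
#include <stdint.h>

/* Order-of-magnitude birthday bound: about 2^(bits/2) uniformly random
 * values of the given width can be drawn before a collision is likely.
 * Hypothetical helper for illustration; restricted to even widths that
 * keep the result inside 64 bits. */
uint64_t approx_ids_before_collision(unsigned bits)
{
    assert(bits % 2 == 0 && bits <= 126);
    return UINT64_C(1) << (bits / 2);
}
```

For a 24-bit instance number this gives 4096, and for a 32-bit seed it gives 65536, regardless of how the remaining bits of the 128-bit value are derived from that seed.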
>>>>>>>> All right, what do you think about this ?
>>>>>>> if /dev/random is available, better; if just a timer is available,
>>>>>>> you leak the cpu type & compiler version used.
>>>>>> Well, I added ff_random_get_seed according to your indication :/
>>>>> I didn't suggest using more than 1 value from it
>>>>>
>>>> You mean using "seed" more than once ?
>>> yes, using the lsb of the timer twice stores the number of clock ticks
>>> between the 2 calls; that is strongly dependent on cpu & compiler and thus
>>> leaks this information while not doing much else
>>>
>>>
>>>> Well I'm following your suggestion of using only seed.
>>>>
>>>> Can I use LFG like I did in the first place ? I'd like to have this
>>>> problem fixed.
>>> you are the maintainer of the code, you can use what you prefer
>>>
>>> If you want to conform to the spec you need 16(+3)bytes
>>> from /dev/random or if that is unavailable gather bits by user interaction
>>> and its timing like some other tools do.
>>> And you have to redo this for each mxf file muxed.
>>>
>>> if you don't care about the spec and just want to minimize collisions,
>>> the 32bit seed alone will be as good as filling all 16 bytes by lfg
>> I'll use the "undefined method" value I think.
>>
>> Ok, what if I use seed only once like this ?
>> I prefer having your opinion on randomness and security matter.
> 
> i think it's fine, though the +(1<<32) is useless; i misunderstood
> the original use.
> That is, i thought that you needed several unique 128bit values and
> these could have been generated by
> for()
>     seed += 1LL<<32;
> with just a single value the +1<<32 makes no difference
> 
> [...]
>> +static void mxf_gen_umid(AVFormatContext *s)
>> +{
>> +    MXFContext *mxf = s->priv_data;
>> +    uint32_t seed = ff_random_get_seed();
>> +    uint64_t umid = seed + (1LL<<32) + 0x152947134;
>> +
>> +    AV_WB64(mxf->umid  , umid);
>> +    AV_WB64(mxf->umid+8, umid>>8);
>> +
>> +    mxf->instance_number = seed;
>> +}
> 
> also it should be  + 0x15294713400000000LL
> 
> The reason is best explained by thinking about playing a game like lotto.
> In theory, assuming the numbers are randomly drawn, any set/list of numbers
> chosen will give you the same chance to get the prize.
> But in reality, picking "1,2,3,4,5,6", while it gives you the same chance
> to win, means you would have a very high chance of having to
> share with many other people who chose the same common sequence.
> 
> In our case here while leaving 96 of 128 bits 0 is as good as any other
> fixed value or one that just depends on the first 32bit (like LFG)
> it is much more likely that some other program would make a similar choice
> for the high 96 bits. Setting them to a ffmpeg specific constant would
> make collisions between ffmpeg generated files and files by other muxers
> VERY unlikely.
> 
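The idea in the paragraph above can be sketched as follows: keep the 32 random bits, and fill the remaining 96 bits with a fixed application-specific tag rather than zeros. This is only an illustration under stated assumptions; the function name, the byte layout and the tag value are hypothetical and are not the constant or layout used by the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 12-byte application tag (96 bits). Placeholder value for
 * illustration only; NOT the constant discussed in the patch. */
static const uint8_t APP_TAG[12] = {
    'e', 'x', 'a', 'm', 'p', 'l', 'e', '-', 't', 'a', 'g', '0'
};

/* Fill a 16-byte identifier: the fixed tag occupies the high 12 bytes,
 * and the 32-bit seed is stored big-endian in the low 4 bytes. */
void fill_id(uint8_t out[16], uint32_t seed)
{
    assert(out != NULL);
    memcpy(out, APP_TAG, sizeof(APP_TAG));
    out[12] = (uint8_t)(seed >> 24);
    out[13] = (uint8_t)(seed >> 16);
    out[14] = (uint8_t)(seed >> 8);
    out[15] = (uint8_t)seed;
}
```

Two identifiers built this way still collide whenever their seeds collide, so the tag adds no randomness; it only makes it unlikely that an identifier from another program, which may have left those bits at zero, matches one generated here.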

All right, applied. Thanks a lot for the advice and the review.

-- 
Baptiste COUDURIER                              GnuPG Key Id: 0x5C1ABAAA
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
checking for life_signs in -lkenny... no
FFmpeg maintainer                                  http://www.ffmpeg.org
