[FFmpeg-devel] GSoC with FFMpeg what a combination!

Michael Niedermayer michaelni
Sat Mar 22 13:29:40 CET 2008

On Sat, Mar 22, 2008 at 12:37:39AM -0400, Ronald S. Bultje wrote:
> Hi,
> On Sat, Mar 22, 2008 at 12:19 AM, Michael Niedermayer <michaelni at gmx.at>
> wrote:
> > On Sat, Mar 22, 2008 at 05:49:08AM +0200, Jason (spot) Brower wrote:
> > > internationalization of the program.
> >
> > Why do you think there would be a problem here?
> Will you accept patches that add internationalization-support to
> ffmpeg/lav*?
> It's high on my fairy-list.

Sure, just keep in mind that you will be flamed if this is done with gettext :)
The reasons being:
* gettext duplicates English strings all over the place
* gettext uses strings as keys (very inefficient, requiring O(log n) lookups)

What is needed instead:
* getting a string must be O(1)
* no duplicate strings in the final binary files
* uncoordinated addition of new keys
* uncoordinated addition of new translated strings
* easy to use for both devels and translators

The requirement of no duplicate strings leads to integers as keys.
The requirements of O(1) lookup and uncoordinated addition of new keys lead
to the use of a hash table, with 64-bit hash values of the
default (English, in-source) strings as keys.
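
A minimal sketch of deriving such a key from the in-source string. The name
tr_hash64 is hypothetical, and FNV-1a is used only because it is short: it is
not a strong hash, and no particular function is settled on here.

```c
#include <stdint.h>

/* 64-bit FNV-1a over a NUL-terminated string.
 * Assumption: a stand-in for whatever strong 64-bit hash is finally chosen. */
static uint64_t tr_hash64(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV-1a 64-bit offset basis */

    while (*s) {
        h ^= (unsigned char)*s++;            /* mix in one byte */
        h *= 0x100000001b3ULL;               /* FNV 64-bit prime */
    }
    return h;
}
```

Because the key depends only on the string itself, two developers adding new
strings independently never need to agree on a numbering. For an ideal 64-bit
hash the chance of any collision among n strings is roughly n^2 / 2^65, which
for 10000 strings is on the order of 10^-12.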

Thus binary files with translated strings would be a hash table of some sort
with pointers to the strings stored after it. This file could be a custom
format or a normal ELF object created from a C file.

Now, how to handle this conveniently without reinventing the whole wheel?
The trick is to use gettext as much as possible :)
The code in the program should look like
av_log(..., _("english string %d blah\n"), ...);
so that the gettext tools can be used to extract these strings and build .po
files out of them, which translators can translate (this part is identical to
gettext ...)

The difference happens afterwards, or more specifically:
* A script would go over the source and replace every _("blah") with
  _(0x1275384ULL), where 0x1275384ULL is a strong 64-bit hash of "blah"
* Another little program would parse the .po files and convert them to
  a hash table of translated strings with hash(english string) as keys,
  and store these in the translation files, one for each language.
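
The source-rewriting step could be sketched roughly as follows. Everything
named here (tr_rewrite_line, tr_hash64_n) is hypothetical, FNV-1a again only
stands in for a strong hash, and the sketch assumes each _("...") literal sits
on one line with no string concatenation. It hashes the literal exactly as
written between the quotes, escapes undecoded; .po files store msgid in the
same escaped form, so the converter could hash the same bytes.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Placeholder 64-bit hash (FNV-1a) over n bytes. */
static uint64_t tr_hash64_n(const char *s, size_t n)
{
    uint64_t h = 0xcbf29ce484222325ULL;

    for (size_t i = 0; i < n; i++) {
        h ^= (unsigned char)s[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

static int is_ident_char(char c)
{
    return c == '_' || isalnum((unsigned char)c);
}

/* Copy src to dst, replacing each _("...") with _(0x<hash>ULL).
 * Returns 0 on success, -1 if dst is too small. Anything that does not
 * match the simple _("...") shape is copied through unchanged. */
static int tr_rewrite_line(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    const char *p = src;

    while (*p) {
        /* Only match a bare "_", not the tail of a longer identifier. */
        int at_macro = p[0] == '_' && p[1] == '(' && p[2] == '"' &&
                       (p == src || !is_ident_char(p[-1]));
        if (at_macro) {
            const char *start = p + 3, *end = start;

            while (*end && *end != '"') {
                if (*end == '\\' && end[1])
                    end++;                 /* step over the escaped character */
                end++;
            }
            if (*end == '"' && end[1] == ')') {
                int n = snprintf(dst + out, dst_size - out,
                                 "_(0x%" PRIx64 "ULL)",
                                 tr_hash64_n(start, (size_t)(end - start)));
                if (n < 0 || (size_t)n >= dst_size - out)
                    return -1;
                out += (size_t)n;
                p = end + 2;               /* continue after the closing ")" */
                continue;
            }
        }
        if (out + 1 >= dst_size)
            return -1;
        dst[out++] = *p++;
    }
    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

Running this as a build step keeps the readable English strings in the
checked-in source, while the compiled code only ever sees the integer keys.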

At runtime we just need to load the correct file, look in the hash table, and
return the string.
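
A rough sketch of the lookup half, assuming the file has already been read
into memory. The layout (tr_slot, tr_table, offset 0 meaning an empty slot)
and the choice of open addressing with linear probing are illustrative
assumptions, not a fixed format.

```c
#include <stddef.h>
#include <stdint.h>

/* One hash table slot. */
struct tr_slot {
    uint64_t key;      /* 64-bit hash of the English string */
    uint32_t offset;   /* byte offset of the translation in the pool; 0 = empty */
};

/* A loaded translation file: slot array plus the string pool stored after it. */
struct tr_table {
    const struct tr_slot *slots;
    uint32_t mask;     /* slot count - 1; the slot count is a power of two */
    const char *pool;  /* pool[0] is a reserved NUL so offset 0 can mean "empty" */
};

/* Return the translated string for key, or NULL if the table has none.
 * Linear probing gives expected O(1) lookups while the table stays sparse. */
static const char *tr_lookup(const struct tr_table *t, uint64_t key)
{
    uint32_t i = (uint32_t)key & t->mask;

    for (uint32_t probes = 0; probes <= t->mask; probes++) {
        const struct tr_slot *s = &t->slots[i];

        if (s->offset == 0)
            return NULL;                 /* reached an empty slot: key absent */
        if (s->key == key)
            return t->pool + s->offset;
        i = (i + 1) & t->mask;           /* collision: try the next slot */
    }
    return NULL;                         /* every slot probed, key absent */
}
```

What to do on a NULL result (for example, falling back to an English table
built the same way) is left open here.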

Comments welcome. An alternative to the above might be catgets; I am not sure, though:
man catgets is a little terse, and Google says catgets isn't thread safe ...

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato