[FFmpeg-devel] file protocol with Unicode support

Kirill Gavrilov gavr.mail at gmail.com
Wed Apr 13 15:01:33 CEST 2011


2011/4/13 Nicolas George <nicolas.george at normalesup.org>

> > But parsing input file name as UTF-8 only for Windows we make confusion in
> > API.
>
> The confusion in API is already there: Unix filenames are sequences of
> bytes, windows filenames are sequences of 16-bits words.
>
No, the problem is that Linux (and probably Unix, I do not have enough
experience with the others) and Windows took different paths in providing
Unicode support.
Linux migrated to UTF-8, which is compatible with the original APIs designed
for ASCII or custom character sets.
Windows left the custom character sets as they are and defined new functions
that operate ONLY on Unicode strings (wchar_t).
So while Linux threw KOI8-R/CP1251 and the like overboard and uses UTF-8
instead, Windows still uses these ugly code pages, but the unambiguous
Unicode (wchar_t) WinAPI functions are strongly recommended for development.
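
To make the difference concrete, here is a rough sketch of what such a
wrapper could look like. The signature is the one from the patch; the body is
only my assumption of a reasonable implementation (UTF-8 converted with
MultiByteToWideChar() and passed to _wopen() on Windows, plain open()
elsewhere), not the attached code itself.

```c
#include <fcntl.h>
#include <stdlib.h>

#ifdef _WIN32
#include <io.h>
#include <windows.h>
#endif

/* Sketch only: the filename is expected in UTF-8 on every platform.
 * The body is an assumption, not the code from the attached patch. */
int ff_open_file(const char *filename, int oflag, int pmode)
{
#ifdef _WIN32
    /* First call: ask for the needed length in wchar_t units,
     * including the terminating NUL (because of the -1 input length). */
    int len = MultiByteToWideChar(CP_UTF8, 0, filename, -1, NULL, 0);
    wchar_t *wname;
    int fd;

    if (len <= 0)
        return -1;
    wname = malloc(len * sizeof(*wname));
    if (!wname)
        return -1;
    /* Second call: do the actual UTF-8 -> UTF-16 conversion. */
    MultiByteToWideChar(CP_UTF8, 0, filename, -1, wname, len);
    fd = _wopen(wname, oflag, pmode);
    free(wname);
    return fd;
#else
    /* Filenames are plain byte strings here, so UTF-8 passes through. */
    return open(filename, oflag, pmode);
#endif
}
```

On Linux this is just open(); only the Windows branch has to leave the
code-page based functions behind.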

> > However some people that currently can enter filenames in system-defined
> > code page (and contains the symbols outside the ASCII) will got open file
> > error.
>
> You mean that the current, unchanged API, already allows to open i18ned
> filenames?
>
> That not what I understood from the first mail.
>
> What _is_ the problem with the current situation?
>
The problem is with symbols outside your code page.
You simply cannot map special French symbols to an 8-bit Russian code page,
so this conversion is lossy and the standard non-wide functions open/fopen
cannot be used to open such files.
That's the problem.
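
A small illustration of why the conversion is lossy. The helper below is
hypothetical and covers only a simplified part of CP1251 (ASCII, the basic
Russian alphabet and Ё/ё); the real table has more entries, but as far as I
know none of them is the French é (U+00E9) either.

```c
/* Hypothetical helper: maps one Unicode code point to its CP1251 byte.
 * Returns -1 when this simplified table has no slot for the character. */
int cp1251_from_codepoint(unsigned cp)
{
    if (cp < 0x80)
        return (int)cp;                     /* ASCII is shared */
    if (cp >= 0x0410 && cp <= 0x044F)
        return (int)(cp - 0x0410 + 0xC0);   /* U+0410..U+044F -> 0xC0..0xFF */
    if (cp == 0x0401)
        return 0xA8;                        /* Cyrillic capital IO */
    if (cp == 0x0451)
        return 0xB8;                        /* Cyrillic small io */
    return -1;                              /* no byte for it, e.g. U+00E9 */
}
```

Once a single character of the name has no byte in the active code page, the
8-bit name no longer identifies the file, while the wchar_t name still does.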

2011/4/13 Nicolas George <nicolas.george at normalesup.org>

> On quartidi 24 Germinal, year CCXIX, Kirill Gavrilov wrote:
> > Something like this (in attach)?
>
> This looks right; but do not take it as a definite answer.
>
> > +/* system dependent open file function (redirect to _wopen() on Windows) */
> > +int ff_open_file(const char *filename, int oflag, int pmode);
>
> Nitpick: doxygen-style comment.
>
> Regards,
>
> --
>   Nicolas George
>
-----------------------------------------------
Kirill Gavrilov,
Software designer.

