[FFmpeg-devel] [PATCH] Support for UTF8 filenames on Windows
Fri Jun 26 18:10:26 CEST 2009
Ramiro Polla wrote:
>>>> MAX_PATH is defined to 260 in WinDef.h, and that is actually the maximum
>>>> allowed path length in the Win32 API unless you want to jump through some
>>>> hoops. Paths of up to 32,767 characters (approximately) are allowed, but
>>>> only if they are absolute and start with the magical \\?\ prefix. I guess I
>>>> could do some detection of relative paths and add said magical prefix
>>>> manually if so desired, but the static allocation seems safe enough, and the
>>>> 260 character limit is indeed what a vast majority of Windows programs
>>> Indeed, FFmpeg fails with long names. But if you truncate the long
>>> name, it might turn into a valid name (like Mans said).
>> Right, so if strlen(filename) > MAX_PATH, the function should fail? Or
>> should I try the long paths workaround? (It will be a minor pain to
>> implement, because detecting relative paths on Windows is pretty annoying.)
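To make the "annoying" part concrete, here is a rough sketch of the kind of check that would be needed before prepending \\?\ to a path. The function name win_path_is_fully_qualified is made up for illustration and is not part of the patch. It assumes only the common forms matter: drive-letter paths and paths that begin with two separators (UNC or already prefixed). Drive-relative ("C:file") and root-relative ("\dir") paths count as not fully qualified.

```c
#include <ctype.h>

/* Hypothetical helper, not part of the patch. */
static int is_sep(char c)
{
    return c == '\\' || c == '/';
}

/* Returns 1 if `path` looks fully qualified, 0 otherwise.
 * This is pure string inspection; it never touches the file system. */
int win_path_is_fully_qualified(const char *path)
{
    /* Two leading separators: \\server\share\... or \\?\C:\... */
    if (is_sep(path[0]) && is_sep(path[1]))
        return 1;

    /* Drive letter, colon, separator: C:\dir\file */
    if (isalpha((unsigned char)path[0]) && path[1] == ':' && is_sep(path[2]))
        return 1;

    /* Everything else depends on the current drive or directory:
     * "file", "dir\file", "\dir\file", "C:file" */
    return 0;
}
```

Even with such a check, a relative path would still have to be resolved against the current directory before the prefix could be added, and a \\?\ path is passed through without the usual normalization, so forward slashes would have to be converted to backslashes first.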
> IMHO we shouldn't try fiddling around with the paths much, but just
> pass them on to _open or _wopen. Nor should we check against MAX_PATH.
Well, sure, I could do dynamic allocation instead, but I don't know what happens
when you pass strings longer than MAX_PATH to _wopen; MSDN doesn't say. I don't
really see the point though because whatever happens, it won't be what the user
>>>> Updated patch with fewer tabs (and a rather embarrassing typo fix)
>>>> Karl Blomster
>>>> Index: libavformat/os_support.c
>>>> --- libavformat/os_support.c (revision 19266)
>>>> +++ libavformat/os_support.c (working copy)
>>>> @@ -30,6 +30,23 @@
>>>> #include <sys/time.h>
>>>> #include "os_support.h"
>>>> +#ifdef HAVE_WIN_UTF8_PATHS
>>>> +#define WIN32_LEAN_AND_MEAN
>>>> +#include <windows.h>
>>>> +#ifdef HAVE_WIN_UTF8_PATHS
>>>> +int winutf8_open(const char *filename, int oflag, int pmode)
>>>> + wchar_t wfilename[MAX_PATH * 2];
>>> Isn't sizeof(wchar_t) == 2?
>> Yes (at least on Win32), but characters outside the Basic Multilingual Plane
>> require two UTF-16 code units to express. Of course this is a bit esoteric
>> because the likelihood of such characters being used in filenames is very
>> low, but in theory it could happen and it's not like allocating 520 extra
>> bytes in a temporary buffer is going to kill anyone, so...
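The surrogate-pair point can be illustrated with a small, portable sketch that counts how many UTF-16 code units a UTF-8 string needs. The function name utf16_units_for_utf8 is hypothetical, and the code assumes the input is valid UTF-8 (it does no validation). Only a 4-byte UTF-8 sequence, i.e. a character outside the BMP, costs two code units.

```c
#include <stddef.h>

/* Hypothetical helper: number of UTF-16 code units needed to hold the
 * NUL-terminated UTF-8 string `s`, excluding the terminator.
 * Assumes valid UTF-8; malformed input is not detected. */
size_t utf16_units_for_utf8(const char *s)
{
    const unsigned char *p;
    size_t units = 0;

    for (p = (const unsigned char *)s; *p; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                  /* continuation byte, already counted */
        units += (*p >= 0xF0) ? 2 : 1; /* 4-byte lead byte => surrogate pair */
    }
    return units;
}
```

Since the two-unit case always consumes four input bytes, the result never exceeds the byte length of the UTF-8 input.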
>>> I think you could also use wchar_t wfilename[strlen(filename) + 1]
>>> instead of malloc if we are going to try and pass paths larger than
>> The "proper" way would, I think, be to use
>> MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, filename, -1, NULL, 0)
>> first, because that returns the exact number of wide characters required to
>> store the string.
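For reference, the sizing call described above could be combined with a heap-allocated buffer roughly as follows. This is only a sketch, not the code from the patch: the name utf8_open_sketch is made up, error handling is minimal, and the non-Windows branch exists only so the example compiles elsewhere. MultiByteToWideChar and _wopen are the real Win32/CRT calls; with a length argument of -1 the returned count includes the terminating NUL.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <io.h>
#else
#include <unistd.h>
#endif

/* Hypothetical name. Opens a file whose name is given in UTF-8.
 * Returns a file descriptor, or -1 on failure. */
int utf8_open_sketch(const char *filename, int oflag, int pmode)
{
    assert(filename != NULL);
#ifdef _WIN32
    wchar_t *wfilename;
    int fd, len;

    /* First call: ask how many wide characters are needed (incl. NUL). */
    len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                              filename, -1, NULL, 0);
    if (len <= 0)
        return -1; /* invalid UTF-8 or other conversion failure */

    wfilename = malloc(len * sizeof(*wfilename));
    if (!wfilename)
        return -1;

    /* Second call: perform the actual conversion into the buffer. */
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            filename, -1, wfilename, len) <= 0) {
        free(wfilename);
        return -1;
    }

    fd = _wopen(wfilename, oflag, pmode);
    free(wfilename);
    return fd;
#else
    /* Elsewhere file names are plain byte strings; pass them through. */
    return open(filename, oflag, pmode);
#endif
}
```

Note that this does nothing about MAX_PATH; whatever _wopen does with an over-long name is simply passed back to the caller.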
> I'd say use this approach to malloc the string then.
> But I'm still not really happy about having to choose at compile-time.
> Is there no way the user could specify it at run-time?
You could add an enable_win_utf8 parameter to av_open_input_file, I guess, but
that would be a really ugly thing to have in the API and I doubt it'd be OK'd.
This patch only changes the API, not the command-line interfaces and whatnot, so
the only users of it would be people who use the ffmpeg API, and those people
presumably compile ffmpeg themselves anyway and would know if they want UTF-8
support or not.