[Ffmpeg-devel] retrieving asf textual info in other languages
Hauke Duden
H.NS.Duden
Tue May 17 11:06:04 CEST 2005
Måns Rullgård wrote:
>"??" <c_liao at openfind.com.tw> writes:
>
>
>
>>>note, the thing must be converted to utf8
>>>
>>>
>>>Is it acceptable to use iconv() for the conversion?
>>>
>>>
>>isn't libiconv GPL'ed ? so.. the simplest solution is just to treat
>>it as a buffer and let user handle it, however I haven't dived into
>>other parts of the code so not sure whether will cause any
>>problems. or use iconv as default routine for conversion and have a
>>callback interface for user to define their own conversion routine
>>if they don't wish to use GPL'ed iconv.
>>
>>
>
>iconv is part of glibc, and specified by SUSv3, so there should be no
>legal implications from using it, even if some particular
>implementation is under the GPL.
>
>My main concern was portability to non-SUS platforms.
>
>
Sorry to intrude here, but UTF-8 is very simple. Why not simply convert
it yourself? Below is a simple, straightforward routine that encodes a
Unicode character as UTF-8, if you need one. Since UCS-2 is a subset of
Unicode, this should work for it as well. Use it as you like.
// Returns new offset. Return value=destOffset means either the buffer
// is too small or the character is not Unicode.
int utf8EncodeChar(int chr, unsigned char* pDest, int destBytes, int destOffset)
{
    int destFree=destBytes-destOffset;

    if(chr<0)
    {
        //negative values are not Unicode characters
        return destOffset;
    }
    else if(chr<=0x7f)
    {
        //one byte
        if(destFree<1)
            return destOffset;
        pDest[destOffset]=chr;
        return destOffset+1;
    }
    else if(chr<=0x7ff)
    {
        //two bytes
        if(destFree<2)
            return destOffset;
        pDest[destOffset+1]=(chr & 0x3f) | 0x80;
        pDest[destOffset]=(chr>>6) | 0xc0;
        return destOffset+2;
    }
    else if(chr<=0xffff)
    {
        //three bytes
        if(destFree<3)
            return destOffset;
        pDest[destOffset+2]=(chr & 0x3f) | 0x80;
        pDest[destOffset+1]=((chr>>6) & 0x3f) | 0x80;
        pDest[destOffset]=(chr>>12) | 0xe0;
        return destOffset+3;
    }
    else if(chr<=0x10ffff)
    {
        //four bytes. The biggest value a 4-byte UTF-8 sequence can hold
        //is actually 0x1fffff, but that is not Unicode anymore.
        if(destFree<4)
            return destOffset;
        pDest[destOffset+3]=(chr & 0x3f) | 0x80;
        pDest[destOffset+2]=((chr>>6) & 0x3f) | 0x80;
        pDest[destOffset+1]=((chr>>12) & 0x3f) | 0x80;
        pDest[destOffset]=(chr>>18) | 0xf0;
        return destOffset+4;
    }
    else
    {
        //not a Unicode character.
        return destOffset;
    }
}
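
// Returns the number of bytes needed to encode chr as UTF-8, or 0 if
// chr is not a Unicode character.
int utf8GetEncodedCharBytes(int chr)
{
    if(chr<0)
        return 0;
    else if(chr<=0x7f)
        return 1;
    else if(chr<=0x7ff)
        return 2;
    else if(chr<=0xffff)
        return 3;
    else if(chr<=0x10ffff)
        //the biggest value a 4-byte UTF-8 sequence can hold is actually
        //0x1fffff, but that is not Unicode anymore
        return 4;
    else
        return 0;
}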
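
A caller converting a whole string needs a small loop around the
encoder. The sketch below assumes the source text is a buffer of
UTF-16LE code units; it is an illustration, not code from FFmpeg.
utf16leToUtf8 is a hypothetical helper name, and utf8EncodeChar is
restated in compact form only so that the block compiles on its own.
The loop also combines surrogate pairs, which a per-character encoder
does not do by itself.

```c
/* Compact restatement of utf8EncodeChar (same contract: returns the new
   offset, or destOffset unchanged on error). */
int utf8EncodeChar(int chr, unsigned char *pDest, int destBytes, int destOffset)
{
    static const unsigned char lead[5] = { 0, 0, 0xc0, 0xe0, 0xf0 };
    int destFree = destBytes - destOffset;
    int n = chr < 0 ? 0 : chr <= 0x7f ? 1 : chr <= 0x7ff ? 2
          : chr <= 0xffff ? 3 : chr <= 0x10ffff ? 4 : 0;
    int i;

    if (n == 0 || destFree < n)
        return destOffset;
    if (n == 1) {
        pDest[destOffset] = (unsigned char)chr;
        return destOffset + 1;
    }
    /* Fill the continuation bytes from last to first, 6 bits each. */
    for (i = n - 1; i > 0; i--) {
        pDest[destOffset + i] = (unsigned char)((chr & 0x3f) | 0x80);
        chr >>= 6;
    }
    /* The remaining high bits go into the lead byte. */
    pDest[destOffset] = (unsigned char)(chr | lead[n]);
    return destOffset + n;
}

/* Hypothetical helper (an assumption, not an existing API): converts
   `units` UTF-16LE code units stored in src (2 bytes each) to UTF-8.
   Returns the number of bytes written, or -1 if the input is malformed
   or dest is too small. */
int utf16leToUtf8(const unsigned char *src, int units,
                  unsigned char *dest, int destBytes)
{
    int i = 0, off = 0;

    while (i < units) {
        int chr = src[2 * i] | (src[2 * i + 1] << 8);
        int next;
        i++;

        if (chr >= 0xd800 && chr <= 0xdbff) {
            /* High surrogate: a low surrogate must follow. */
            int lo;
            if (i >= units)
                return -1;
            lo = src[2 * i] | (src[2 * i + 1] << 8);
            if (lo < 0xdc00 || lo > 0xdfff)
                return -1;
            i++;
            chr = 0x10000 + ((chr - 0xd800) << 10) + (lo - 0xdc00);
        } else if (chr >= 0xdc00 && chr <= 0xdfff) {
            return -1; /* stray low surrogate */
        }

        next = utf8EncodeChar(chr, dest, destBytes, off);
        if (next == off)
            return -1; /* buffer too small */
        off = next;
    }
    return off;
}
```

Since a UTF-16 code unit never needs more than 3 bytes of UTF-8 (a
surrogate pair is 2 units and 4 bytes), a destination buffer of
3*units bytes is always large enough for this loop.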