[Ffmpeg-devel] retrieving asf textual info in other languages
Tue May 17 18:21:59 CEST 2005
On Tue, May 17, 2005 at 10:12:26AM +0200, Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
> > Hi
> >> Hi guys:
> >> What I am trying to do is extract textual information from asf type
> >> containers, namely the fields: title, author, copyright and comments. I'm
> >> using libavformat to accomplish this. The problem I encountered is that the
> >> extracted asf information within the AVFormatContext is corrupted. As far
> >> as I know, the textual information stored within asf containers should be
> >> in unicode (UCS2), so I compiled a debug version of libavformat and dug
> >> deeper. This is what I found:
> >> What I can see here is a 16 bit int (c) being copied to an 8 bit char (*q).
> >> If I'm not mistaken, this cuts off the higher 8 bits. That is fine for
> >> ASCII characters, which just leave the lower ASCII byte, but it corrupts
> >> any other language that also uses the higher byte. To confirm this I also
> >> tried some ffmpeg based players such as VideoLAN, and the result is the
> >> same: no encoding other than ASCII is shown in the text fields. So this
> >> problem definitely affects all Asian languages.
> >> So here are my questions: is this cutting of the higher byte a deliberate
> >> act to avoid the EOS '\0' character, or is this a bug? And will you guys
> >> consider a patch or fix for this soon?
> > send a patch, if it's clean and working it will be considered
> > note, the thing must be converted to utf8
> Is it acceptable to use iconv() for the conversion?
Umm, why? The original data is ucs2 or utf16. Conversion is totally
straightforward; it's just a matter of interpreting one unicode
encoding and writing another. No actual character remapping.
BTW I'm against iconv dependency. There are all kinds of compatibility
issues between different versions, and it's huge bloat/overkill.
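
For illustration, a minimal sketch of the kind of direct conversion
described above, with no iconv involved. It assumes the ASF strings are
little-endian UTF-16 and that a NUL code unit ends the string. The name
utf16le_to_utf8 is hypothetical and is not an existing libavformat
function; substituting U+FFFD for unpaired surrogates is likewise just
one possible choice.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libavformat): convert a little-endian
 * UTF-16 buffer of in_len bytes to a NUL-terminated UTF-8 string.
 * Returns the number of bytes written, not counting the terminator.
 * Output is truncated at a character boundary if out is too small.
 */
static size_t utf16le_to_utf8(const uint8_t *in, size_t in_len,
                              char *out, size_t out_size)
{
    size_t i = 0, o = 0;

    if (out_size == 0)
        return 0;

    while (i + 1 < in_len) {
        uint32_t c = in[i] | ((uint32_t)in[i + 1] << 8);
        size_t n;

        i += 2;
        if (c == 0)                 /* assumed end-of-string marker */
            break;

        if (c >= 0xD800 && c <= 0xDBFF) {
            /* High surrogate: combine with the following low surrogate. */
            uint32_t lo = 0;
            if (i + 1 < in_len)
                lo = in[i] | ((uint32_t)in[i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                c = 0xFFFD;         /* unpaired high surrogate */
            }
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            c = 0xFFFD;             /* unpaired low surrogate */
        }

        /* Number of UTF-8 bytes needed for this code point. */
        n = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
        if (o + n >= out_size)      /* keep room for the terminating NUL */
            break;

        if (n == 1) {
            out[o++] = (char)c;
        } else if (n == 2) {
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else if (n == 3) {
            out[o++] = (char)(0xE0 | (c >> 12));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (c >> 18));
            out[o++] = (char)(0x80 | ((c >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }

    out[o] = '\0';
    return o;
}
```

Since UTF-8 never produces a zero byte for a non-zero code point, the
result can be kept in an ordinary char buffer without the high byte
being cut off.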