[FFmpeg-devel] [BUG] UTF-8 decoder vulnerable to character spoofing attacks

Mon Oct 22 19:15:35 CEST 2007

> > i would first like to understand under what circumstances the current code
> > is causing a real problem (security or normal bug)
> 
> It could actually cause a crash, e.g. if a string is right at the top of
> the heap and contains nothing but 0xff, then the UTF-8 decoder will
> read 8 bytes and crash.

No, the comment says quite clearly that GET_UTF8 may read up to 7
bytes. It is a usage error to use it when it is not possible to read at least
an additional 7 bytes.
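
For a caller that cannot guarantee those extra readable bytes, one option
is a decoder that is told the buffer length. The following is only a
minimal sketch under that assumption: utf8_decode_bounded is a hypothetical
helper, not GET_UTF8 and not FFmpeg code, and it handles only 1- to 4-byte
sequences. It does not reject overlong forms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of FFmpeg: decode one code point from
 * buf[0..len) without ever reading at or past buf + len.
 * Returns the number of bytes consumed, or -1 on truncated or
 * malformed input. */
static int utf8_decode_bounded(const uint8_t *buf, size_t len, uint32_t *out)
{
    uint32_t val;
    int extra, i;

    if (len == 0)
        return -1;
    val = buf[0];
    if (val < 0x80) {
        extra = 0;
    } else if ((val & 0xE0) == 0xC0) {
        extra = 1; val &= 0x1F;
    } else if ((val & 0xF0) == 0xE0) {
        extra = 2; val &= 0x0F;
    } else if ((val & 0xF8) == 0xF0) {
        extra = 3; val &= 0x07;
    } else {
        return -1;  /* stray continuation byte or invalid lead such as 0xFF */
    }
    if ((size_t)extra >= len)
        return -1;  /* sequence would run past the end of the buffer */
    for (i = 1; i <= extra; i++) {
        if ((buf[i] & 0xC0) != 0x80)
            return -1;  /* not a continuation byte */
        val = (val << 6) | (buf[i] & 0x3F);
    }
    *out = val;
    return extra + 1;
}
```

With this shape a lone 0xff at the very end of a buffer is rejected after
reading a single byte, at the cost of a length check per character.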

> As I said, it will also incorrectly decode illegal aliases for
> characters rather than signalling error. I don't think this will lead
> to vulns in the current code (since UTF-8 decoder is hardly used
> anyway), but it's bad to have buggy code that could cause problems
> when someone uses it in the future. Even if not, it teaches people who
> read it extremely bad practices.
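
The "illegal aliases" above are overlong encodings, for example the two
bytes 0xC0 0xAF, which a lenient decoder turns into '/' (0x2F). A decoder
can signal them by comparing the decoded value against the smallest value
that needs that many bytes. A minimal sketch, assuming a hypothetical
helper utf8_is_overlong that is not taken from FFmpeg:

```c
#include <stdint.h>

/* Hypothetical check, not FFmpeg code: given a decoded value and the
 * number of bytes its encoding used (1 to 4), report whether the
 * encoding was longer than necessary, i.e. an overlong alias. */
static int utf8_is_overlong(uint32_t cp, int nbytes)
{
    /* Smallest code point that legitimately needs 1, 2, 3, 4 bytes. */
    static const uint32_t min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };

    if (nbytes < 1 || nbytes > 4)
        return 1;  /* lengths outside 1..4 are treated as invalid here */
    return cp < min_cp[nbytes - 1];
}
```

For instance, utf8_is_overlong(0x2F, 2) is nonzero, while
utf8_is_overlong(0x2F, 1) is zero.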

Well, I always thought those bugs are due to extremely bad practices in
checking data. At least I always considered UTF-8 as a method of
compressing 32-bit data. We would rightly call anyone who does security
checking of zlib compressed data via strchr or so an idiot, and I am
slightly inclined to do the same with everyone who does this with UTF-8
data (except for speed reasons, but then if you do speed optimizations
on security critical code you should take appropriate care).
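
To illustrate the difference between checking the encoded bytes and
checking the decoded data, here is a rough sketch. All names in it
(decode_lenient, contains_code_point, raw_bytes_contain) are hypothetical
and written for this example only. The decoder deliberately accepts
overlong forms and assumes a NUL-terminated string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lenient decoder for illustration: accepts overlong forms,
 * handles 1- to 4-byte sequences, assumes s is NUL-terminated.
 * Returns bytes consumed, or 0 on malformed input. */
static size_t decode_lenient(const unsigned char *s, uint32_t *cp)
{
    size_t n, i;
    uint32_t v = s[0];

    if (v < 0x80)                { n = 1; }
    else if ((v & 0xE0) == 0xC0) { n = 2; v &= 0x1F; }
    else if ((v & 0xF0) == 0xE0) { n = 3; v &= 0x0F; }
    else if ((v & 0xF8) == 0xF0) { n = 4; v &= 0x07; }
    else return 0;
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)  /* also stops at the terminating NUL */
            return 0;
        v = (v << 6) | (s[i] & 0x3F);
    }
    *cp = v;
    return n;
}

/* The naive check: scan the encoded bytes directly. */
static int raw_bytes_contain(const char *str, char c)
{
    return strchr(str, c) != NULL;
}

/* Check the decoded code points, not the encoded bytes.
 * Returns 1 if target occurs, 0 if not, -1 on malformed input. */
static int contains_code_point(const char *str, uint32_t target)
{
    const unsigned char *s = (const unsigned char *)str;

    while (*s) {
        uint32_t cp;
        size_t n = decode_lenient(s, &cp);
        if (n == 0)
            return -1;
        if (cp == target)
            return 1;
        s += n;
    }
    return 0;
}
```

For the string "a" 0xC0 0xAF "b", raw_bytes_contain(str, '/') finds
nothing, while contains_code_point(str, '/') reports a match, because the
check runs on what the decoder actually produces.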

Greetings,
Reimar Döffinger