[FFmpeg-devel] [PATCH] lavu/avstring: add av_get_utf8() function

Mon Nov 11 11:37:06 CET 2013

On date Monday 2013-11-11 00:43:17 +0100, Michael Niedermayer encoded:
> On Sun, Nov 10, 2013 at 01:16:44PM +0100, Stefano Sabatini wrote:
[...]
> >  /**
> > + * Read and decode a single UTF-8 character sequence from buffer in
> > + * *buf, and update *buf to point to the next byte after the parsed
> > + * sequence.
> > + *
> > + * In case of invalid sequence, the pointer will be updated to the
> > + * next byte after the invalid sequence.
> > + *
> > + * @param code pointer whose pointed value is updated to keep the
> > + * parsed code in case of success
> > + * @param left bytes left to read in the buffer. By default it won't
> > + * read more than 6 characters (maximum number of bytes in a UTF-8
> > + * sequence).
> > + * @return >= 0 in case a sequence was successfully read, a negative
> > + * value in case of invalid sequence
> > + */
> > +int av_utf8_decode(int32_t *code, const uint8_t **buf, size_t left);
> 

> what is the relation of this to GET_UTF8()
> how should a developer choose which of the 2 to use ?

The GET_UTF8 macro should be faster, but it is harder to use and must
be handled with care: in particular, its error argument has to be a
return or goto statement, which makes it pretty awkward for general
use (indeed, we have several misuses in the lavfi/drawtext code).

Also, unlike the function, the macro doesn't automatically handle
error recovery or prevent buffer overreads.
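
To make the difference concrete, here is a rough sketch of what a
function-style decoder buys the caller. This is NOT the code from the
patch: the names sketch_utf8_decode() and sketch_count_valid() are
made up for illustration, the sketch only handles sequences of up to
4 bytes, and it does not reject overlong encodings or surrogates. It
only mirrors the proposed signature to show the bounded read and the
"always advance past the bad sequence" behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the patch code: decode one UTF-8 sequence
 * from *buf, read at most `left` bytes, and always advance *buf past
 * whatever was consumed, even on error. */
static int sketch_utf8_decode(int32_t *code, const uint8_t **buf, size_t left)
{
    const uint8_t *p = *buf;
    uint32_t c;
    int extra, i;

    if (left == 0)
        return -1;                  /* nothing to read */

    c = *p++;
    if (c < 0x80) {
        extra = 0;                  /* single byte (ASCII) */
    } else if ((c & 0xE0) == 0xC0) {
        extra = 1; c &= 0x1F;
    } else if ((c & 0xF0) == 0xE0) {
        extra = 2; c &= 0x0F;
    } else if ((c & 0xF8) == 0xF0) {
        extra = 3; c &= 0x07;
    } else {
        *buf = p;                   /* skip the invalid lead byte */
        return -1;
    }

    for (i = 0; i < extra; i++) {
        /* stop at the end of the buffer or at a non-continuation byte */
        if ((size_t)(p - *buf) >= left || (*p & 0xC0) != 0x80) {
            *buf = p;
            return -1;
        }
        c = (c << 6) | (*p++ & 0x3F);
    }

    *buf = p;
    *code = (int32_t)c;
    return 0;
}

/* Caller side: errors are an ordinary return value, so the loop needs
 * no goto or return to recover, and it cannot run past the buffer. */
static size_t sketch_count_valid(const uint8_t *s, size_t len, size_t *invalid)
{
    const uint8_t *p = s, *end = s + len;
    size_t valid = 0;

    *invalid = 0;
    while (p < end) {
        int32_t code;
        if (sketch_utf8_decode(&code, &p, (size_t)(end - p)) < 0)
            (*invalid)++;
        else
            valid++;
    }
    return valid;
}
```

Since every call with left > 0 consumes at least one byte, the loop
always terminates, and a bad byte costs one failed call rather than
aborting the whole scan.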

> 
> also is there a performance difference?
> a benchmark might be interesting

I could do it, but given that the two have different uses I would
rather spare myself the effort.
-- 
FFmpeg = Forgiving and Funny Mystic Practical Extended Gangster