bug-gnulib

Re: guarantees of u8_mbtouc/u8_strmbtouc

From:	Bruno Haible
Subject:	Re: guarantees of u8_mbtouc/u8_strmbtouc
Date:	Sat, 31 Jul 2010 23:01:56 +0200
User-agent:	KMail/1.9.9

Paolo Bonzini wrote:
> "u8_mbtouc will never access more than N bytes.  However, as an 
> additional guarantee, u8_mbtouc only accesses as many bytes as necessary 
> to decode the first Unicode character, or to ascertain that S does not 
> begin with a valid UTF-8 sequence."

This is complicated to understand, because it requires the programmer to
understand how a Unicode character is parsed.

> > The code may be changed in the future. If a guarantee is not documented AND
> > checked by the test suite, you cannot rely on it.
> 
> Of course, that's why I'm suggesting a modification to the specification.

What's the use case which would profit from such a guarantee?
libunistring supports two string data types: one where the length of the
string (number of units) is known, and one which is U+0000 terminated.
Are you suggesting that these two data types are not sufficient to cover
the users' needs?
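To make the two data types concrete, here is a minimal sketch in C. It does not use libunistring: sketch_mbtouc, count_chars_n and count_chars_z are hypothetical names invented for this illustration, and the decoder is a simplified stand-in, not the library's implementation. The point is only that the caller either passes the number of units explicitly or relies on the terminating zero unit to obtain it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a UTF-8 decoder (NOT libunistring code).
   Decodes one character from s, reading at most n bytes (n > 0).
   Stores the code point in *puc and returns the bytes consumed.
   On an invalid or truncated sequence, stores U+FFFD and returns 1. */
static int sketch_mbtouc(uint32_t *puc, const uint8_t *s, size_t n)
{
    assert(n > 0);
    uint8_t c = s[0];
    int len;
    uint32_t uc;

    if (c < 0x80) { *puc = c; return 1; }
    else if (c >= 0xC2 && c <= 0xDF) { len = 2; uc = c & 0x1F; }
    else if (c >= 0xE0 && c <= 0xEF) { len = 3; uc = c & 0x0F; }
    else if (c >= 0xF0 && c <= 0xF4) { len = 4; uc = c & 0x07; }
    else { *puc = 0xFFFD; return 1; }

    for (int i = 1; i < len; i++) {
        /* The bound is checked before s[i] is read. */
        if ((size_t)i >= n || (s[i] & 0xC0) != 0x80) {
            *puc = 0xFFFD;
            return 1;
        }
        uc = (uc << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if ((len == 3 && uc < 0x800) || (len == 4 && uc < 0x10000)
        || uc > 0x10FFFF || (uc >= 0xD800 && uc <= 0xDFFF)) {
        *puc = 0xFFFD;
        return 1;
    }
    *puc = uc;
    return len;
}

/* Data type 1: the number of units is known. */
static size_t count_chars_n(const uint8_t *s, size_t n)
{
    size_t count = 0;
    while (n > 0) {
        uint32_t uc;
        int k = sketch_mbtouc(&uc, s, n);
        s += k;
        n -= (size_t)k;
        count++;
    }
    return count;
}

/* Data type 2: the string is terminated by a zero unit, so the
   length is recovered first and the bounded loop is reused. */
static size_t count_chars_z(const uint8_t *s)
{
    return count_chars_n(s, strlen((const char *)s));
}
```

In this sketch the bounded variant never reads past s + n, and the terminated variant simply derives n from the terminator before delegating to it.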

If your only point is to save a couple of instructions, then that is too
small a benefit, in my opinion.

Bruno
