bug-libunistring
Re: [bug-libunistring] feature request: unicode version


From: Roger Crew
Subject: Re: [bug-libunistring] feature request: unicode version
Date: Mon, 21 Aug 2023 19:14:03 -0700

I wrote:
 > > it would be nice if there were a way to determine the Unicode 
 > > version supported, say, if <unistring/version.h> also had
 > > 
 > >   extern const int _libunistring_unicode_version;

Bruno Haible replies:
 > How would this be useful?
 > 
 > I mean, Unicode is an evolving standard, and libunistring
 > (installed as a shared library) can be upgraded to the next
 > version any time. That is, the behaviour of specific functions
 > may slightly change as a new Unicode version is enabled.

That's the point.  One never knows quite what one is linking to 
and this would be an easy way to find out.

E.g., a client may care about support for some particular block of
obscure traditional Chinese characters, or some set of emojis/whatever,
that didn't go in until Unicode 13.  Or I may depend on some character
functionality that wasn't fully settled until Unicode 12, while my
stuff still needs to work for people whose hosting providers are stuck
on Debian 11 (= libunistring 0.9.10 = Unicode 9).  Then I need to know
whether to invoke my workaround, or at least be able to warn them that
they should upgrade.

And while this can be inferred (sort of) from the libunistring
version, you don't actually publish a correspondence. (*)

I.e., yes, you *have* mentioned at least *some* of the Unicode database
updates in the ChangeLog (thanks), and I can hope that you'll continue
to do so, but there isn't really a contract there, the way there
would be with an actual API variable or function.
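
For illustration, a minimal sketch of how a client might consume such a
variable.  Everything specific here is an assumption: the variable is
only a proposal at this point, the (major << 8) | minor encoding is a
guess at what it could hold, and the names sketch_unicode_version and
unicode_at_least are hypothetical.  The sketch defines its own stand-in
value so that it compiles without the library.

```c
#include <stdio.h>

/* HYPOTHETICAL stand-in for the requested variable.  A real build would
   get the declaration from a libunistring header instead.  The encoding
   (major << 8) | minor is an assumption made for this sketch. */
static const int sketch_unicode_version = (15 << 8) | 0;

/* Return nonzero if an encoded version is at least major.minor. */
static int unicode_at_least(int version, int major, int minor)
{
    return version >= ((major << 8) | minor);
}

/* Warn when the Unicode data is older than the client needs.
   Returns 1 if a workaround should be used, 0 otherwise. */
static int needs_workaround(int version, int major, int minor)
{
    if (unicode_at_least(version, major, minor))
        return 0;
    fprintf(stderr, "warning: Unicode %d.%d found, %d.%d wanted\n",
            version >> 8, version & 0xff, major, minor);
    return 1;
}
```

A caller would then write something like
needs_workaround(sketch_unicode_version, 12, 0) once at startup and
branch on the result.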

With libicu, I can just do

    /* u_getUnicodeVersion() is declared in <unicode/uchar.h> */
    UVersionInfo uver;
    u_getUnicodeVersion(uver);
    if (uver[0] < 12) {   /* uver[0] is the major version */
       /* work around whatever was broken with
          HANGUL CHOSEONG CEONGCHIEUMSSANGCIEUC (**)
          in previous versions of Unicode ...*/
    }

(and yes, ICU is insane bloatware that makes things far more
complicated than they need to be, and I get that you want to be
careful about not expanding libunistring's footprint too much, 
but I'm only asking for 4 bytes here ... :-)

(*) I suppose that would be the other way of dealing with this, e.g.,
have some official table on a gnu.org webpage, but it seems to me
keeping something like that up to date would be far more annoying 
than just doing it in the code.

(**) Yes, I'm making this up.  My actual situation is that I need 
to provide Unicode support in a programming language, and I really 
have no idea what my users will be doing with it, but "How up 
to date is your Unicode implementation?  Do you support version N?"
are fairly obvious questions for them to ask.

Under the hood, the installation will use whichever of libunistring 
or libicu is available, but they shouldn't need to know or care 
about that.  (*I* will, of course, prefer libunistring because 
it's like 10 times faster, but anyway...)

 > Apple made the big mistake to pick a specific Unicode version
 > (3.2 IIRC) in their HFS+ file system. And they could not upgrade
 > it afterwards, because of "backward compatibility". Big mistake!
 > HFS+ was stuck with Unicode 3.2 even 15 or 20 years later.

I wasn't planning to fixate on a particular version.

 > > I've inferred the following from the ChangeLog history:
 > > 
 > > lib        unicode
 > > 1.1.0      15.0.0
 > > 1.0.0      14.0.0
 > > 0.9.8       9.0.0
 > 
 > Yes, from memory I think this correspondence is correct.
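
Until something like that exists, the fallback is to hard-code the
inferred table.  A rough sketch, with assumptions flagged: the version
encoding (major << 16) + (minor << 8) + subminor is what I understand
<unistring/version.h> to use for _libunistring_version; the names
LIBUNISTRING_VER, inferred_map and inferred_unicode_major are made up
for this example; and releases between the listed ones are assumed to
carry the Unicode version of the nearest listed release below them.

```c
#include <stddef.h>

/* Encode a libunistring release as (major << 16) + (minor << 8) + subminor,
   assumed to match the encoding of _libunistring_version. */
#define LIBUNISTRING_VER(maj, min, sub) (((maj) << 16) + ((min) << 8) + (sub))

struct version_map {
    int lib;            /* encoded libunistring version */
    int unicode_major;  /* inferred Unicode major version */
};

/* Inferred from the ChangeLog, newest first.  Not an official contract. */
static const struct version_map inferred_map[] = {
    { LIBUNISTRING_VER(1, 1, 0), 15 },
    { LIBUNISTRING_VER(1, 0, 0), 14 },
    { LIBUNISTRING_VER(0, 9, 8),  9 },
};

/* Return the inferred Unicode major version for an encoded libunistring
   version, or 0 if the version is older than every table entry. */
static int inferred_unicode_major(int lib_version)
{
    size_t n = sizeof inferred_map / sizeof inferred_map[0];
    for (size_t i = 0; i < n; i++)
        if (lib_version >= inferred_map[i].lib)
            return inferred_map[i].unicode_major;
    return 0;
}
```

With the library linked, the call would be
inferred_unicode_major(_libunistring_version).  The table silently goes
stale with every new release, which is the maintenance burden a real
API variable would avoid.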

Thanks.

-- 
Roger Crew
wrog@wrog.net, crew@cs.stanford.edu
