Re: [bug-libunistring] feature request: unicode version
From: Roger Crew
Subject: Re: [bug-libunistring] feature request: unicode version
Date: Mon, 21 Aug 2023 19:14:03 -0700
I wrote:
> > it would be nice if there were a way to determine the Unicode
> > version supported, say, if <unistring/version.h> also had
> >
> > extern const int _libunistring_unicode_version;
Bruno Haible replies:
> How would this be useful?
>
> I mean, Unicode is an evolving standard, and libunistring
> (installed as a shared library) can be upgraded to the next
> version any time. That is, the behaviour of specific functions
> may slightly change as a new Unicode version is enabled.
That's the point. One never knows quite what one is linking to
and this would be an easy way to find out.
E.g., a client may care about there being support for some particular
block of obscure traditional Chinese characters or some set of
emojis/whatever that didn't go in until Unicode 13. Or if I'm
depending on some character functionality that wasn't fully settled
until Unicode 12 and my stuff needs to work for people whose hosting
providers are still stuck on Debian 11 (= libunistring 0.9.10 =
Unicode 9) and so I need to know whether to invoke my workaround, or
at least be able to warn them they should upgrade.
And while this can be inferred (sort of) from the libunistring
version, you don't actually publish a correspondence (*)
... i.e., yes, you *have* mentioned at least *some* of the Unicode db
updates in the ChangeLog (thanks), and I can hope that you'll continue
to do so, but there isn't really a contract there, i.e., the way there
would be with an actual API var/function.
With libicu, I can just do
    UVersionInfo uver;
    u_getUnicodeVersion(uver);
    if (uver[0] < 12) {
        /* work around whatever was broken with
           HANGUL CHOSEONG CEONGCHIEUMSSANGCIEUC (**)
           in previous versions of unicode ...*/
    }
(and yes, ICU is insane bloatware that makes things far more
complicated than they need to be, and I get that you want to be
careful about not expanding libunistring's footprint too much,
but I'm only asking for 4 bytes here ... :-)
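For comparison, a minimal sketch of what the same check could look like against the requested variable. Nothing in this thread settles how the version would be encoded, so the `(major << 8) + minor` packing, the stand-in constant `proposed_unicode_version`, and the helper names are all assumptions made for illustration, not an existing libunistring API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the requested
       extern const int _libunistring_unicode_version;
   The (major << 8) + minor packing is an assumption of this sketch. */
static const int proposed_unicode_version = (15 << 8) + 0;

/* Unpack the assumed encoding. */
static int unicode_major(int packed) { return packed >> 8; }
static int unicode_minor(int packed) { return packed & 0xff; }

/* True when the linked library predates the Unicode major version
   that the caller depends on, i.e. when a workaround is needed. */
static bool need_workaround(int packed, int min_major)
{
    return unicode_major(packed) < min_major;
}
```

A client would then write something like `if (need_workaround(proposed_unicode_version, 12)) { ... }`, which is the same shape as the ICU check and costs one int of storage.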
(*) I suppose that would be the other way of dealing with this, e.g.,
have some official table on a gnu.org webpage, but it seems to me
keeping something like that up to date would be far more annoying
than just doing it in the code.
(**) yes, I'm making this up. My actual situation is that I need
to provide unicode support in a programming language, and I really
have no idea what my users will be doing with it, but, "How up
to date is your Unicode implementation? Do you support version N?"
are fairly obvious questions for them to ask.
Under the hood, the installation will use whichever of libunistring
or libicu is available, but they shouldn't need to know or care
about that. (*I* will, of course, prefer libunistring because
it's like 10 times faster, but anyway...)
> Apple made the big mistake to pick a specific Unicode version
> (3.2 IIRC) in their HFS+ file system. And they could not upgrade
> it afterwards, because of "backward compatibility". Big mistake!
> HFS+ was stuck with Unicode 3.2 even 15 or 20 years later.
I wasn't planning to fixate on a particular version.
> > I've inferred the following from the ChangeLog history:
> >
> >   lib      unicode
> >   1.1.0    15.0.0
> >   1.0.0    14.0.0
> >   0.9.8    9.0.0
>
> Yes, from memory I think this correspondence is correct.
Thanks.
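Until there is an API for this, the table above can at least be turned into a fallback lookup. The following is a rough sketch only: the rows are the three inferred from the ChangeLog, the packing mirrors the `(major << 16) + (minor << 8) + subminor` convention that `_libunistring_version` uses, and `LIBVER`, `ver_table` and `guess_unicode_major` are hypothetical names. The result is a lower bound, since a release falling between two rows may already have moved to a newer Unicode.

```c
#include <stddef.h>

/* Pack a libunistring version as (major << 16) + (minor << 8) + subminor. */
#define LIBVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

struct ver_map {
    int lib;            /* packed libunistring version */
    int unicode_major;  /* Unicode major version it shipped with */
};

/* Rows inferred from the ChangeLog, newest first. */
static const struct ver_map ver_table[] = {
    { LIBVER(1, 1, 0), 15 },
    { LIBVER(1, 0, 0), 14 },
    { LIBVER(0, 9, 8),  9 },
};

/* Return the Unicode major version of the newest table row that is not
   newer than lib_version, or -1 if lib_version predates every row. */
static int guess_unicode_major(int lib_version)
{
    for (size_t i = 0; i < sizeof ver_table / sizeof ver_table[0]; i++)
        if (lib_version >= ver_table[i].lib)
            return ver_table[i].unicode_major;
    return -1;
}
```

Called as `guess_unicode_major(_libunistring_version)`, this gives 9 for the 0.9.10 that Debian 11 ships, but it has to be maintained by hand with every release, which is the annoyance the API variable would avoid.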
--
Roger Crew
wrog@wrog.net, crew@cs.stanford.edu