bug#35939: version sort is incorrect with hyphen-minus
From: Ian Jackson
Subject: bug#35939: version sort is incorrect with hyphen-minus
Date: Thu, 27 Jun 2019 11:25:01 +0100
Vincent Lefevre writes ("Re: bug#35939: version sort is incorrect with
hyphen-minus"):
> On 2019-06-26 18:40:50 -0700, Paul Eggert wrote:
> > Perhaps the coreutils manual could be improved to make this all clearer, and
> > perhaps it should refer to the Debian manual if it doesn't already.
>
> In this case, there should be a new ordering option to provide
> true numeric sort with strings mixing non-negative integers and
> characters.
I think the Debian algorithm is such an algorithm, but it has a
wrinkle which you are not expecting. Here is the specification:
https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
Note in particular
| The lexical comparison is a comparison of ASCII values modified so
| that all the letters sort earlier than all the non-letters and so
| that a tilde sorts before anything, even the end of a part
So in the Debian algorithm, `-' sorts after `a'. I specified this
rule. I did it mainly because of versions like `1.0beta3', which
is probably a prerelease of `1.0' and therefore earlier than `1.0.3'.
So `b' has to sort before `.' and my rule seemed the simplest one to
achieve that. (The version comparison algorithm is a tradeoff between
complexity and breadth of support for people's then-existing
practices.) Nowadays Debian invariably writes `1.0~beta3' but when I
invented this scheme I did not include the (invaluable) `~' feature.
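To make the ordering rule concrete, here is a minimal Python sketch of the
comparison as the Debian policy text describes it, applied to the upstream
part only (no epoch, no Debian revision). The names `compare_upstream`,
`_order` and `_split` are made up for this illustration; this is not dpkg's
code, and it treats only ASCII letters as letters.

```python
import re
import string
from functools import cmp_to_key
from itertools import zip_longest


def _order(ch):
    """Rank of one character of a non-digit part ('' = end of the part)."""
    if ch == "~":
        return -1                 # tilde sorts before anything, even the end
    if ch == "":
        return 0                  # end of the part
    if ch in string.ascii_letters:
        return ord(ch)            # letters sort before all non-letters
    return ord(ch) + 256          # everything else, in ASCII order


def _split(version):
    """Split into alternating (non-digit part, digit part) pairs."""
    return re.findall(r"([^0-9]*)([0-9]*)", version)


def compare_upstream(a, b):
    """Return -1, 0 or 1 as a sorts before, equal to, or after b."""
    pairs = zip_longest(_split(a), _split(b), fillvalue=("", ""))
    for (text_a, num_a), (text_b, num_b) in pairs:
        # Non-digit parts: modified lexical comparison, char by char.
        for ca, cb in zip_longest(text_a, text_b, fillvalue=""):
            diff = _order(ca) - _order(cb)
            if diff:
                return -1 if diff < 0 else 1
        # Digit parts: numeric comparison, empty string counts as zero.
        va, vb = int(num_a or 0), int(num_b or 0)
        if va != vb:
            return -1 if va < vb else 1
    return 0


versions = ["1.0.3", "1.0", "1.0beta3", "1.0~beta3"]
print(sorted(versions, key=cmp_to_key(compare_upstream)))
```

With this sketch the list sorts as `1.0~beta3`, `1.0`, `1.0beta3`, `1.0.3`:
`b' sorts before `.', so `1.0beta3' lands before `1.0.3', but without the
tilde it still sorts after plain `1.0'.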
When this is extended to UTF-8, presumably the ordering should be an
ordering of Unicode scalar values, with the rule about letters
interpreted as referring to anything which Unicode considers a letter.
If you want to test the Debian algorithm and have access to a copy of
dpkg, you can append -1 to both strings to be the "Debian revision",
and prepend "1:" to be the "epoch", and then the middle part should be
compared the same way as sort -V etc.
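The wrapping described above could be scripted roughly as follows. This is
only a sketch: `dpkg_compare_args` and `dpkg_less_than` are hypothetical
helper names, and the second one assumes a `dpkg` binary is on the PATH.
`dpkg --compare-versions` exits with status 0 when the stated relation
holds.

```python
import subprocess


def dpkg_compare_args(a, b, op="lt"):
    """Build a dpkg command line comparing a and b as upstream versions.

    "1:" is prepended as the epoch and "-1" appended as the Debian
    revision, so that the strings themselves form the middle part.
    """
    return ["dpkg", "--compare-versions", f"1:{a}-1", op, f"1:{b}-1"]


def dpkg_less_than(a, b):
    """True if dpkg considers a earlier than b (requires dpkg installed)."""
    return subprocess.run(dpkg_compare_args(a, b, "lt")).returncode == 0


print(dpkg_compare_args("1.0beta3", "1.0.3"))
```

Calling `dpkg_less_than("1.0beta3", "1.0.3")` on a system with dpkg would
then run the real comparison.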
Vincent, what is your use case for a comparison algorithm which is
like the Debian one but which sorts letters after punctuation ?
Ian.
--
Ian Jackson <address@hidden> These opinions are my own.
If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.