bug-coreutils

Re: join bug?


From: german rigau
Subject: Re: join bug?
Date: Mon, 6 Feb 2006 17:58:53 +0100

On 2/6/06, Paul Eggert <address@hidden> wrote:
>
> german rigau <address@hidden> writes:
>
> > Obviously, the problem is in the sort command. With the C locale it
> > runs perfectly. However, I use LANG=en_US.UTF-8 ...
> > And then it seems that the "sort" command has different behaviour ...
>
> I don't see any bug in the examples that you gave.


Sorry for insisting.

If you look carefully at the last example I sent, we obtain two different
orderings with locale en_US.UTF-8: with "sort kk2" we obtain "icecream"
before "ice_cream", and with "sort -k 1,2 kk2" we obtain "ice_cream" before
"icecream"!

However, with locale C, "sort kk2" and "sort -k 1,2 kk2" give the same
ordering. ??
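
The C-locale half of this observation can be checked with a small sketch.
The file contents below are an assumption (kk2 itself is not reproduced in
this message), so this only illustrates the comparison, not the original data:

```shell
# Hypothetical stand-in for kk2; the real file is not shown in this message.
kk2=$(mktemp)
printf 'icecream\nice_cream\n' > "$kk2"

# In the C locale both invocations compare raw bytes,
# and '_' (0x5F) sorts before 'c' (0x63).
whole=$(LC_ALL=C sort "$kk2")
keyed=$(LC_ALL=C sort -k 1,2 "$kk2")

printf '%s\n' "$whole"
[ "$whole" = "$keyed" ] && echo "same order in the C locale"

rm -f "$kk2"
```

To compare against another locale, the same two commands can be rerun with
LC_ALL set to a locale that is actually installed (see "locale -a"); the
result there depends on that locale's collation rules.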

> Getting back to the original question, "join" must use the same
> collating convention that "sort" does.  If you "sort" in the
> en_US.UTF-8 locale, you must "join" in the same locale.  Otherwise, as
> you discovered, things won't work in general.


No. I use the same collation for sorting and joining. That is how I
detected the anomaly: join failed to match the same elements in files
ordered by the default sort ...
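
As a minimal sketch of the "same collation for sort and join" setup, the
following pins every step to one locale. The file names and contents are
hypothetical, and the C locale is used only because it is always available:

```shell
# Hypothetical input files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'icecream 1\nice_cream 2\n' > "$tmp/a"
printf 'ice_cream x\nicecream y\n' > "$tmp/b"

# Sort both inputs and join them under the same collation (here: C).
LC_ALL=C sort "$tmp/a" > "$tmp/a.sorted"
LC_ALL=C sort "$tmp/b" > "$tmp/b.sorted"
joined=$(LC_ALL=C join "$tmp/a.sorted" "$tmp/b.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

With these inputs both keys are matched. Replacing LC_ALL=C consistently in
all three commands is the way to try the same pipeline under another locale.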

> Also, my advice is to stick with the C locale unless you know what
> you're doing.


I think I know perfectly well what I am doing ... ;-)

> For example, if you're not sure what you want to do in
> the case of encoding error (or, if you don't know what an encoding
> error is :-), then you should stick with the C locale.

But the C locale is only useful for encoding the English language ... and
there are many more languages around.

Best,

German

