Re: sort - specifying sort fields/keys.
From: Bob Proulx
Date: Tue, 8 Apr 2008 17:42:21 -0600
User-agent: Mutt/1.5.13 (2006-08-11)
cga2000 wrote:
> Bob Proulx wrote:
> > You shouldn't ever need --color=never there unless you have aliased ls
> > with ls --color=always. You don't want ls --color=always (really you
> > don't) and therefore you won't need to undo it with --color=never.
>
> You're absolutely right.
>
> In fact I didn't want to have to worry about the effects of color escape
> codes on sort field counting at that point and that's why I initially
> specified --color=never.
It was pointed out to me in an offlist comment that sometimes people
*do want* ls --color=always, for example to force color when piping
into less with 'ls --color=always | less -R'. (Oh, I guess, yes. :-)
> On my system "ls" is aliased to "ls --color=auto".
>
> Regrettably the man page provided on my system lists the three options
> (never, always, auto) but does not give any explanation as to what they
> actually do.
The man pages are great for quick reference of major features. But
the primary documentation for most GNU software is in the info pages.
info coreutils 'ls invocation'
`--color [=WHEN]'
Specify whether to use color for distinguishing file types. WHEN
may be omitted, or one of:
* none - Do not use color at all. This is the default.
* auto - Only use color if standard output is a terminal.
* always - Always use color.
Specifying `--color' and no WHEN is equivalent to `--color=always'.
Piping a colorized listing through a pager like `more' or `less'
usually produces unreadable results. However, using `more -f'
does seem to work.
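To see the difference between auto and always for yourself, here is a small sketch. It assumes GNU ls. The scratch directory, the name "subdir", and the explicit LS_COLORS value are my own choices for illustration; setting LS_COLORS is only there so the result does not depend on TERM.

```shell
# Scratch directory with one subdirectory, so there is something to color.
tmp=$(mktemp -d)
mkdir "$tmp/subdir"

# Assumption: an explicit LS_COLORS so the result does not depend on TERM.
LS_COLORS='di=01;34'
export LS_COLORS

# Command substitution is not a terminal, so "auto" emits no escape codes.
plain=$(ls --color=auto "$tmp")

# "always" emits the escape codes even though nothing here is a terminal.
forced=$(ls --color=always "$tmp")

rm -r "$tmp"

# Dump both results byte by byte; escape codes show up as 033 in od -c.
printf '%s\n' "$plain" | od -c
printf '%s\n' "$forced" | od -c
```

Those escape bytes are what would throw off sort field and character counting if a colorized listing were piped into sort.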
> I did a
>
> $ ls -al | sort -k1.1,1.1r -k8f
>
> and the output is identical (properly sorted with dots ignored)
Based upon locale setting, right?
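For reference, here is a small sketch of what those two keys do, pinned to LC_ALL=C so the locale cannot interfere. The four listing lines are made up for illustration and assume an ls time style where the file name is field 8, as the -k8 above implies.

```shell
# Made-up "ls -al"-style lines; the name is the 8th whitespace-separated field.
listing='-rw-r--r-- 1 u g 10 2008-04-08 17:42 beta
drwxr-xr-x 2 u g 40 2008-04-08 17:42 Alpha
-rw-r--r-- 1 u g 10 2008-04-08 17:42 Delta
drwxr-xr-x 2 u g 40 2008-04-08 17:42 charlie'

# -k1.1,1.1r : first character of field 1 only, reversed ("d" before "-")
# -k8f       : then field 8 to end of line, with case folded
sorted=$(printf '%s\n' "$listing" | LC_ALL=C sort -k1.1,1.1r -k8f)

# Keep just the names to make the resulting order easy to read.
names=$(printf '%s\n' "$sorted" | awk '{ print $8 }')
printf '%s\n' "$names"
```

In the C locale this lists the directories first and then orders the names within each group without regard to case. Any dot-ignoring seen in a real listing would come from the locale's collation, not from these keys.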
> >
> > http://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021
>
> Thanks. Good doc.
The descriptions usually talk about LC_ALL because the full story gets
complicated. Really the intent is to set LANG. But LANG is
overridden by LC_COLLATE, so setting LANG may have no effect, and
LC_COLLATE is in turn overridden by LC_ALL. Saying all of that in the
quick docs gets complicated and still doesn't really describe things
like how it interacts with LC_CTYPE. I have no idea what (possibly
bad) effects setting an incompatible combination of
LANG, LC_CTYPE and LC_COLLATE will have on some languages. So it is
simpler just to describe LC_ALL=C as the biggest possible lever. But
normally one would set only the lower-priority locale variables, such as
LANG and possibly LC_COLLATE, as I have done.
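A minimal sketch of that precedence using sort. The four sample names are made up, and en_US.UTF-8 is only an example locale that may not be installed everywhere; both cases below are written so that the winning variable is C either way.

```shell
# Made-up sample data: dot-files mixed with plain names.
names='.foo
bar
.baz
baz'

# LC_ALL wins over everything else: byte order, whatever LANG says.
by_lc_all=$(printf '%s\n' "$names" | LC_ALL=C LANG=en_US.UTF-8 sort)

# With LC_ALL unset, LC_COLLATE wins over LANG for collation.
by_lc_collate=$(printf '%s\n' "$names" |
    ( unset LC_ALL; LANG=en_US.UTF-8 LC_COLLATE=C sort ))

printf '%s\n' "$by_lc_all"
printf '%s\n' "$by_lc_collate"
```

Both runs should put the dot-files first, since "." sorts before letters in byte order. Setting only LANG=C while LC_COLLATE or LC_ALL still names an en_* locale would not have that effect.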
> > Apparently the people who defined the collating sequence for the en_*
> > locales confused working with data on a computer with working with
> > text on a computer. The locale collating sequence for en_* ignores
> > punctuation and folds case by default!
>
> Given the symptoms and the nature of sort I would probably never have
> figured that out myself. That there may be circumstances where this
> comes in handy I do not doubt .. But as to making it the default for one
> of (if not the) most widely-used locales?
It certainly annoys me. But they didn't consult me when the collating
sequence was chosen.
> I gave up on UTF-8 because I use mostly ELinks for browsing and afaik
> it's not UTF-8 ready.
How does ELinks compare to Links, Lynx, or w3m for UTF-8 support? I
only use them for basic plain text us-ascii pages and so can't judge.
> I tested with LC_ALL=POSIX (as recommended in your document) and the
> "." was still being ignored.
Hmm... Works for me. Please double check everything.
$ touch .baz .foo bar baz foa foo foz
$ LC_ALL=en_US.UTF-8 ls -A1
bar
baz
.baz
foa
foo
.foo
foz
$ LC_ALL=C ls -A1
.baz
.foo
bar
baz
foa
foo
foz
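If you want to double check the POSIX case specifically, here is a sketch that repeats the test in a scratch directory and compares LC_ALL=C against LC_ALL=POSIX. The variable names are my own; the file names are the ones used above.

```shell
# Recreate the test files in a scratch directory.
tmp=$(mktemp -d)
( cd "$tmp" && touch .baz .foo bar baz foa foo foz )

# List them once under each spelling of the portable locale.
c_order=$(LC_ALL=C ls -A1 "$tmp")
posix_order=$(LC_ALL=POSIX ls -A1 "$tmp")

rm -r "$tmp"

# The two should match, with the dot-files listed first.
if [ "$c_order" = "$posix_order" ]; then
    echo "C and POSIX agree"
else
    echo "C and POSIX differ"
fi
printf '%s\n' "$posix_order"
```

If the dots are still being ignored with LC_ALL=POSIX, it is worth checking that the variable actually reaches the command, for example that it was exported or given on the same command line.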
> So I issued the above export commands and (magically) data was sorted
> as data ..
Oh good.
> Thank you very much for your clarification.
Glad to help,
Bob