RE: $(sort) - what is "lexical order"? (was RE: Follow-up)

From: Martin Dorey
Subject: RE: $(sort) - what is "lexical order"? (was RE: Follow-up)
Date: Mon, 18 Jul 2011 16:56:53 -0700

If I had check-in privs, I'd at least make this issue explicit in the documentation.  Even that, though, will require others on the list to be persuaded.


> Maybe $(alphabetize <list>)?


Creative, but the name doesn't really work for writing systems, like Mandarin written in hanzi, without an alphabet.


Did you realize that your makefile can delegate the sorting to sort(1)?  You wouldn't want Little Bobby Tables using this but who uses $(sort) on lists containing shell meta-characters like quotes?


address@hidden:~$ { echo 'strcoll = $(shell echo "$(1)" | fmt -w1 | sort)'; echo 'L:=$(call strcoll,B a)'; } | make -f - -p 2>&1 | grep '^L '

L := a B



From: Rob Holbert [mailto:address@hidden]
Sent: Sunday, July 17, 2011 10:13
To: Martin Dorey
Subject: Re: $(sort) - what is "lexical order"? (was RE: Follow-up)


I contend that the only useful purpose for the sort function is to alphabetize a list of items correctly. I realize that the out-of-the-box C strcmp function doesn't give us exactly what we want. The simple and obvious way to sort the alphabet correctly in the ASCII world would be to put all strings in the same case prior to the comparison. Maybe $(alphabetize <list>)? I just don't see any real use for the quasi-sort that presently exists. Why would you want to almost alphabetize a list of files or words? It's like a tease. lol.



On Tue, Jul 12, 2011 at 5:17 PM, Martin Dorey <address@hidden> wrote:

OP has something of a point: contrast the locale-dependent behavior of sort(1) with make's $(sort):


$ echo 'L:=$(sort B a)' | make -f - -p 2>&1 | grep '^L '

L := B a

$ { echo B; echo a; } | sort



$ { echo B; echo a; } | LC_ALL=C sort





I present this more to provoke "we can't change that!" and clarified documentation than as a serious suggestion:


Index: configure.in

RCS file: /sources/make/make/configure.in,v
retrieving revision 1.157
diff -u -r1.157 configure.in
--- configure.in    29 Aug 2010 23:05:27 -0000       1.157
+++ configure.in 12 Jul 2011 21:11:28 -0000
@@ -166,6 +167,7 @@
 AC_CHECK_FUNCS(strcasecmp strncasecmp strcmpi strncmpi stricmp strnicmp)

 # strcoll() is used by the GNU glob library
+# and by $(sort)



Index: misc.c

RCS file: /sources/make/make/misc.c,v
retrieving revision 1.84
diff -u -r1.84 misc.c
--- misc.c          6 Nov 2010 21:56:24 -0000          1.84
+++ misc.c       12 Jul 2011 21:11:28 -0000
@@ -51,6 +51,10 @@
 # define VA_END(args)


+#if !defined(HAVE_STRCOLL)
+# define strcoll strcmp



 /* Compare strings *S1 and *S2.
    Return negative if the first is less, positive if it is greater,
@@ -62,9 +66,7 @@
   const char *s1 = *((char **)v1);
   const char *s2 = *((char **)v2);

-  if (*s1 != *s2)
-    return *s1 - *s2;
-  return strcmp (s1, s2);
+  return strcoll (s1, s2);


 /* Discard each backslash-newline combination from LINE.
cvs diff: Diffing config
cvs diff: Diffing doc

Index: doc/make.texi

RCS file: /sources/make/make/doc/make.texi,v
retrieving revision 1.72
diff -u -r1.72 make.texi
--- doc/make.texi            2 May 2011 15:11:23 -0000         1.72
+++ doc/make.texi         12 Jul 2011 21:11:28 -0000
@@ -6846,6 +6846,8 @@


 returns the value @samp{bar foo lose}.
+In a change from previous versions, make now sorts in locale-dependent order.
+Run with LC_ALL=C in the environment to select the previous behavior.

 @cindex removing duplicate words
 @cindex duplicate words, removing
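
For readers who want to try the idea outside of make, the misc.c hunk above boils down to handing strcoll() to qsort(). The following is only a standalone sketch of that technique, not make's code: collate_compare and sort_words_in_locale are hypothetical names, and the locale is chosen by an explicit setlocale() call, which is an assumption of this example rather than something the patch shows.

```c
#include <locale.h>   /* setlocale, LC_COLLATE */
#include <stdlib.h>   /* qsort */
#include <string.h>   /* strcoll */

/* Hypothetical comparator (not make's own): order two words by the
   collation rules of the current LC_COLLATE locale.  */
static int
collate_compare (const void *v1, const void *v2)
{
  const char *s1 = *(const char *const *) v1;
  const char *s2 = *(const char *const *) v2;

  return strcoll (s1, s2);
}

/* Hypothetical helper: sort N words in place using LOCALE_NAME for
   collation.  "" means "take the locale from the environment" and "C"
   means plain byte order.  Returns 0 on success, or -1 if the locale
   is not available.  Note that setlocale changes process-wide state.  */
int
sort_words_in_locale (const char **words, size_t n, const char *locale_name)
{
  if (setlocale (LC_COLLATE, locale_name) == NULL)
    return -1;

  qsort (words, n, sizeof words[0], collate_compare);
  return 0;
}
```

Called with "C", this should reproduce the current $(sort) ordering, since strcoll() compares like strcmp() in the C locale. Called with "", the result depends on whatever LC_ALL, LC_COLLATE or LANG select, just as it does for sort(1).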


From: bug-make-bounces+mdorey=bluearc.com@gnu.org [mailto:bug-make-bounces+mdorey=bluearc.com@gnu.org] On Behalf Of Rob Holbert
Sent: Monday, July 11, 2011 12:24
To: address@hidden
Subject: Follow-up


Wanted to follow up on my earlier email. Attached is the smallest makefile I could create to demonstrate the issue.




# does not sort lexically as expected
LIST = $(sort widget.c main.c ad.c Buzzer.c)

all: list
	@echo $(LIST)

.PHONY: all list


Previous email:



I ran across what is perhaps a bug, or at least a need for another feature. If a list contains words beginning with both upper- and lower-case letters, $(sort $(LIST)) puts all the capitalized words before the lower-case ones. In this case, Zebra.c would appear before apple.c. This is dictated by the ASCII chart, of course. However, it is not the lexical order that the manual says the function produces. Lexical would be apple.c Zebra.c.


This is solved easily by making the sort comparison convert all alphas to lower case before comparing, leaving the original string case unchanged. 
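
As a rough illustration of that proposal (not a patch against make), a comparator along these lines could be passed to qsort(). The names casefold_compare and sort_words_ignoring_case are hypothetical, the use of POSIX strcasecmp() is an assumption about the platform, and the strcmp() tie-break for words that differ only in case is an addition of this sketch that the proposal does not mention.

```c
#include <stdlib.h>   /* qsort */
#include <string.h>   /* strcmp */
#include <strings.h>  /* strcasecmp (POSIX, not ISO C) */

/* Hypothetical comparator (not make's own): compare two words while
   ignoring case.  Words that differ only in case fall back to strcmp
   so that the resulting order is still fully determined.  */
static int
casefold_compare (const void *v1, const void *v2)
{
  const char *s1 = *(const char *const *) v1;
  const char *s2 = *(const char *const *) v2;
  int r = strcasecmp (s1, s2);

  return r != 0 ? r : strcmp (s1, s2);
}

/* Hypothetical helper: sort N words in place.  Only the order of the
   pointers changes; the strings keep their original case.  */
void
sort_words_ignoring_case (const char **words, size_t n)
{
  qsort (words, n, sizeof words[0], casefold_compare);
}
```

Applied to the four file names from the makefile above, this should give ad.c Buzzer.c main.c widget.c. It only folds case the way strcasecmp() does, so it addresses the ASCII situation described here and not locale-specific collation in general.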


I like to use sort to put my sources in order. This way it is easier to see if an object file is missing for instance.


Best Regards,



