bug-coreutils

Re: code issues in sort.c


From: Paul Eggert
Subject: Re: code issues in sort.c
Date: Sun, 17 Jul 2005 01:05:03 -0700
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

address@hidden (David Feuer) writes:

> Even the most basic question, "why does sort use a merge sort", is not
> answered.

Well, we had to use _something_.  :-)

Knuth vol. 3 is a good source of material for sorting algorithms, and
explains some of the pros and cons of merge sorting.  See:
<http://www-cs-faculty.stanford.edu/~knuth/taocp.html>

> My understanding is limited, but it looks like numbers may be
> recomputed for -g every time two lines are compared.

Yes, that's correct.

> Once these numbers are calculated, they should really be stored.

That would be a win in some cases but not others.  Nobody has had the
time to try it and measure the results for typical cases.
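
For illustration only (this is not code from sort.c), here is a minimal
sketch of the two approaches being discussed.  The names keyed_line,
parse_key, compare_reparse, compare_cached and sort_cached are all
hypothetical, strtod stands in for whatever conversion -g really does,
and NaN ordering is ignored.  The cached variant converts each line
once but pays for an extra double per line, which is one plausible
reason it might win in some cases and lose in others.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: a line plus its numeric key, parsed once.  */
struct keyed_line
{
  const char *text;   /* the original line */
  double key;         /* cached result of parse_key (text) */
};

/* Number of conversions done so far; kept only so the difference
   between the two strategies can be observed.  */
static unsigned long parse_count;

static double
parse_key (const char *text)
{
  parse_count++;
  return strtod (text, NULL);
}

/* Strategy 1: convert both lines again on every comparison.
   A and B point to elements of an array of const char *.  */
static int
compare_reparse (const void *a, const void *b)
{
  double x = parse_key (*(const char *const *) a);
  double y = parse_key (*(const char *const *) b);
  return (x > y) - (x < y);
}

/* Strategy 2: compare keys that were stored ahead of time.  */
static int
compare_cached (const void *a, const void *b)
{
  const struct keyed_line *x = a;
  const struct keyed_line *y = b;
  return (x->key > y->key) - (x->key < y->key);
}

/* Fill in each key exactly once, then sort on the stored keys.  */
static void
sort_cached (struct keyed_line *lines, size_t n)
{
  assert (lines != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    lines[i].key = parse_key (lines[i].text);
  qsort (lines, n, sizeof *lines, compare_cached);
}
```

Sorting N lines with sort_cached performs N conversions, while
qsort with compare_reparse performs two per comparison.  Whether that
saving outweighs the extra memory would still have to be measured.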

> default_sort_size does not give any meaningful explanation of the
> hard-coded constants it contains, or of much else for that matter.

Which hard-coded constants?  Can you suggest improved comments?

> sort_buffer_size has a comment saying, among other things, "Do not
> exceed a bound on the size: if the bound is not specified by the user,
> use a default."  This is less than illuminating.

How about this instead?  "Do not exceed the size bound specified by
the user (or a default size bound, if the user does not specify one)."

> Temporary files are not deleted immediately upon creation: is there a
> reason for this?

It's generally bad practice to have files that are not directory
entries, because it makes it harder for users to see where all the
space in their file system is going.  This is particularly important
when one's disk is full.
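
A small POSIX sketch of the effect described above, not taken from
sort.c: once a file is unlinked while still open, it keeps consuming
space but has no directory entry for ls or du to find.  The function
name unlinked_temp_demo and the /tmp template are assumptions made
for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file, unlink it while it is still open, write
   LEN bytes to it, and report its link count and size.
   Return 0 on success, -1 on failure.  */
static int
unlinked_temp_demo (size_t len, long *links, long *size)
{
  char name[] = "/tmp/unlinkdemoXXXXXX";   /* hypothetical template */
  char byte = 'x';
  struct stat st;
  int ok;
  int fd = mkstemp (name);
  if (fd < 0)
    return -1;

  /* After this call the file has no name, but FD still refers to it.  */
  ok = unlink (name) == 0;

  for (size_t i = 0; ok && i < len; i++)
    ok = write (fd, &byte, 1) == 1;

  ok = ok && fstat (fd, &st) == 0;
  if (ok)
    {
      *links = (long) st.st_nlink;   /* directory entries left */
      *size = (long) st.st_size;     /* bytes still occupied */
    }

  close (fd);   /* only now is the space released */
  return ok ? 0 : -1;
}
```

On a typical local file system the reported link count is zero while
the size is LEN, so the space is in use yet invisible to anyone
listing the directory.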

> I don't see documentation on what exactly the temporary files are _for_.

(That's Knuth vol. 3 again :-)
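
As a rough sketch of the general technique (an external merge sort,
not the actual sort.c code): when the input does not fit in the
buffer, each buffer-load is sorted and spilled to a temporary file as
a "run", and the runs are then merged.  Everything here is
hypothetical: the names external_sort, struct run and MAX_RUNS, the
/tmp template, and the use of ints rather than lines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

enum { MAX_RUNS = 16 };   /* arbitrary limit for the sketch */

struct run
{
  char name[32];   /* temp file name; stays visible until removed */
  FILE *fp;
  int head;        /* next unread value of this run */
  int live;        /* nonzero while HEAD is valid */
};

static int
compare_int (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Read the next value of run R into R->head.  */
static void
advance (struct run *r)
{
  r->live = fread (&r->head, sizeof r->head, 1, r->fp) == 1;
}

/* Sort IN[0..N-1] into OUT while holding at most BUF_ITEMS values in
   memory at once.  Return 0 on success, -1 on failure.  */
static int
external_sort (const int *in, size_t n, size_t buf_items, int *out)
{
  struct run runs[MAX_RUNS];
  size_t nruns = 0;
  int status = -1;
  int *buf = malloc (buf_items * sizeof *buf);
  assert (in != NULL && out != NULL);
  if (buf_items == 0 || !buf)
    goto done;

  /* Phase 1: sort one buffer-load at a time and spill it to a file.  */
  for (size_t i = 0; i < n; i += buf_items)
    {
      size_t len = n - i < buf_items ? n - i : buf_items;
      struct run *r = &runs[nruns];
      int fd;
      if (nruns == MAX_RUNS)
        goto done;
      strcpy (r->name, "/tmp/sortsketchXXXXXX");
      fd = mkstemp (r->name);
      if (fd < 0)
        goto done;
      r->fp = fdopen (fd, "w+b");
      if (!r->fp)
        {
          close (fd);
          remove (r->name);
          goto done;
        }
      nruns++;
      memcpy (buf, in + i, len * sizeof *buf);
      qsort (buf, len, sizeof *buf, compare_int);
      if (fwrite (buf, sizeof *buf, len, r->fp) != len)
        goto done;
      rewind (r->fp);
      advance (r);
    }

  /* Phase 2: merge by repeatedly taking the smallest run head.  */
  for (size_t k = 0; k < n; k++)
    {
      struct run *best = NULL;
      for (size_t j = 0; j < nruns; j++)
        if (runs[j].live && (!best || runs[j].head < best->head))
          best = &runs[j];
      if (!best)
        goto done;
      out[k] = best->head;
      advance (best);
    }
  status = 0;

 done:
  for (size_t j = 0; j < nruns; j++)
    {
      fclose (runs[j].fp);
      remove (runs[j].name);
    }
  free (buf);
  return status;
}
```

The sketch keeps each run's name in the directory until the merge is
finished and removes it afterwards, in line with the point made above
about not leaving space tied up in files that have no directory entry.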

> It would be far easier to understand what is going on if it were
> broken up into pieces,

Quite possibly, but nobody has had the time to do that.



