bug-coreutils

bug#26422: historical feature or grand daddy bug?


From: Kyle Sallee
Subject: bug#26422: historical feature or grand daddy bug?
Date: Mon, 10 Apr 2017 11:01:40 -0700

On Sun, Apr 9, 2017 at 10:59 PM, L A Walsh <address@hidden> wrote:

>    Anytime you have multiple blank lines in a row,
> you have consecutive line feeds.
>

For typical sort-processed data,
consecutive LFs might be uncommon.
When that case does not occur,
the specialized code
could waste CPU cycles.


> ----
>    I'm sure if you submitted a working patch + documentation
> + rights assigned to GNU, and first born child given to FSF,
> the coreutil maintainers would consider it.
>

Past self-authored cat patches were declined.  :(
For modifications that only I desire,
self-authored software gains immediate approval.  :)

By not reading the sort source code,
accidental source copying is avoided,
and its boons and banes cannot be inherited.
From the differing output the question arose:
"Why LF before TAB?"

Program sort's performance is fine.
A complaint did not exist.

In all executed software
the same performance bane exists.

Either a fork + execve overhead or
a posix_spawnp (plus related functions) overhead exists.
Or, more succinctly put,
program start requires CPU cycles.

The overhead might seem insignificant,
but in a script for each program launch
the program reliance overhead accumulates.
Performance is lost.
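
A rough way to see the per-launch cost is to time it.
The sketch below is only an illustration, not a benchmark.
It assumes a POSIX system with a `true` program on PATH;
the helper name average_launch_seconds is made up here.

```python
import os
import time

def average_launch_seconds(argv, runs=20):
    """Return the mean wall-clock time to spawn argv and wait for its exit."""
    start = time.perf_counter()
    for _ in range(runs):
        # posix_spawnp searches PATH for argv[0], like execvp would.
        pid = os.posix_spawnp(argv[0], argv, os.environ)
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / runs

if __name__ == "__main__":
    # Assumption: a program named "true" exists on PATH and does nothing.
    per_launch = average_launch_seconds(["true"])
    print(f"about {per_launch * 1e6:.0f} microseconds per launch")
```

The figure varies by machine and load.
In a script that launches programs in a loop it is paid every iteration.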

In contrast to program-provided code,
library-provided code can be
loaded once and used frequently.
Compared to a program invocation,
a library load also takes less time.

To sort a small number of lines
by invoking program sort,
the program launch overhead
becomes a considerable share of the duration.
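
The comparison can be sketched as follows.
No coreutils library interface exists,
so Python's built-in sorted() stands in here for "library provided" sorting.
The function names are hypothetical,
and LC_ALL=C is forced only so that both paths order ASCII lines identically.

```python
import os
import subprocess
import time

def sort_in_process(lines):
    """Sort lines without leaving the current process."""
    return sorted(lines)

def sort_by_program(lines):
    """Sort lines by launching the external sort program once."""
    # Assumption: a sort program is on PATH. LC_ALL=C selects byte order,
    # which matches Python's ordering for ASCII input.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort"], input="".join(lines), capture_output=True,
        text=True, check=True, env=env,
    )
    return result.stdout.splitlines(keepends=True)

def seconds_for(func, lines, runs=20):
    """Return the mean wall-clock time of func(lines) over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        func(lines)
    return (time.perf_counter() - start) / runs

if __name__ == "__main__":
    few_lines = ["pear\n", "apple\n", "fig\n"]
    print("in process:", seconds_for(sort_in_process, few_lines))
    print("by program:", seconds_for(sort_by_program, few_lines))
```

For three lines the launch cost should dominate the second figure.
For large inputs the sorting itself dominates and the difference matters less.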

From a library perspective the following
potentially complex tasks seem attractive:
cp; mv; sort; tsort; wc.
cp and mv implementations can be surprisingly complex.
Self authored implementations already exist.

For the coreutils-provided cp and mv,
options that purge the used file data
from the kernel cache could be useful.
The content of files that were copied or moved
is probably not immediately useful again,
yet it lingers in the kernel cache.
When the kernel cache uses
almost all available RAM,
file copy performance tanks.
With a large and irrelevantly stocked kernel cache,
search performance also tanks.

A less-than-ideally configured kernel in use seems plausible.
For this task perhaps .../vfs_cache_pressure and
.../drop_caches might not suffice?

Function posix_fadvise seems useful.
But posix_fadvise must be invoked on open descriptors.
For directory cache data, however,
posix_fadvise does not seem useful.
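
What such an option might do can be sketched as below.
This is not how cp or mv behave; no such option exists in coreutils.
The function name copy_then_drop_cache is hypothetical.
POSIX_FADV_DONTNEED is only advice, so the kernel may ignore it.

```python
import os
import tempfile

def copy_then_drop_cache(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path, then advise that the cached pages are unneeded."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
        dst.flush()
        # Dirty pages are not discarded by DONTNEED, so write them out first.
        os.fsync(dst.fileno())
        for f in (src, dst):
            # Offset 0 with length 0 covers the whole file.
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        src_path = os.path.join(work_dir, "src.bin")
        dst_path = os.path.join(work_dir, "dst.bin")
        with open(src_path, "wb") as f:
            f.write(b"example data\n" * 1000)
        copy_then_drop_cache(src_path, dst_path)
        print(os.path.getsize(dst_path), "bytes copied")
```

The advice is given while both descriptors are still open,
which is the restriction noted above.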

Thanks again for maintaining and sharing coreutils.
If a library interface for coreutils existed,
it would gain use.

P.S.
If the sendfile function provided by the Linux kernel
became POSIX approved,
then
http://lists.gnu.org/archive/html/bug-fileutils/2003-03/msg00030.html
might merit reconsideration.
For systems where mass storage device data
throughput is not the bottleneck,
or where a cached conclusion suffices,
omitting the user space buffer
can conserve significant CPU cycles.
When data must be transferred between descriptors,
the sendfile function is useful.
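
A descriptor-to-descriptor copy with sendfile can be sketched as below.
This assumes Linux, where sendfile accepts a regular file as the output
descriptor; other systems may require a socket there.
The function name copy_with_sendfile is hypothetical.

```python
import os
import tempfile

def copy_with_sendfile(src_path, dst_path):
    """Copy src_path to dst_path in the kernel, with no user space buffer."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        while remaining > 0:
            # One call may transfer fewer bytes than requested, so loop.
            sent = os.sendfile(dst.fileno(), src.fileno(), offset, remaining)
            if sent == 0:
                break  # the source ended earlier than its reported size
            offset += sent
            remaining -= sent
    return offset

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        src_path = os.path.join(work_dir, "src.bin")
        dst_path = os.path.join(work_dir, "dst.bin")
        with open(src_path, "wb") as f:
            f.write(b"example data\n" * 1000)
        print(copy_with_sendfile(src_path, dst_path), "bytes copied")
```

No read() or write() buffer appears in the loop;
the data never enters the calling process.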

