
Re: coreutils & building w/C++


From: L A Walsh
Subject: Re: coreutils & building w/C++
Date: Sat, 04 Feb 2017 14:29:40 -0800
User-agent: Thunderbird

Eric Blake wrote:
On 02/03/2017 07:31 PM, L A Walsh wrote:
I was wondering if there has ever been any consideration given to migrating the coreutils to C++, or at least making it such that it would build cleanly under either?
No, and there probably is no interest in it either. Finding a standards-compliant C++ compiler on as many platforms as we already have a working C compiler is unlikely to happen, but would be a necessary prerequisite.
-----

Hmmm... Seems like the GNU C++ compiler shares a lot of C's infrastructure -- even the same man pages. It certainly seems likely that if gcc runs on a given platform, g++ would as well.
Sorry, but I don't see any reason to rewrite in a different language. Most of the core contributors are more familiar with C than with C++,
-----

That's good, since C++ is mostly a superset of C, so no rewriting should be necessary. Usually, I find most standards-compliant C programs will build with few changes. C++ does disallow some dubious C constructs, but C++'s runtime includes all of C's library functions. Given that C++ was designed to be C-compatible, I certainly don't see any _need_ for rewriting unless one wants to refactor or improve the code.
and so even if C++ were used, it would look a lot more like weird C than it would like proper C++.
-----

Well, "proper" is usually an academic matter (unless it becomes "vehemently proper", in which case religion may be involved). ;-) Many of the C++ programs I've seen or worked with look like standard C programs whose authors wanted specific C++ features, like its standard library, or used namespaces solely to keep library or module calls in their own space without impinging on the global namespace.
I also don't buy the argument that being more object-oriented will help things; coreutils is not a large corpus of multiple inter-related pieces, but a bunch of individual utilities that do one thing.
----

Really? Looking at the source, I see 316 source files (.c or .h) with 69,000 lines in a shared library, and only 30 files and 9,000 lines of separate tool front-ends in src. In fact, it seems all of the core utils can be built as one binary, with all code shared and the individual tools being symlinks to that one binary. There looks to be over 7x as much common code in the library as there is code supporting the different tool interfaces.

Looks like Gcc and binutils are better projects for using C++, and it shows, as both of them have already made progress towards that front.
----

I don't know how long it's been going on, but it looks like there is an ongoing effort to find the commonalities in the core utils and merge them, with the common code going into the library. There may be a score or more of individual tools, but many of them handle the exact same or similar issues -- like file-tree traversal, date+time manipulation and formatting, and file access. With changes being added to coreutils to support various security needs, even more commonality between the tools is growing, as they all get the framework to deal with more specific end cases -- and that's the sort of thing C++ can be very useful in organizing.

I was thinking of how much more flexible coreutils would be if it were organized more like the Linux kernel, in that different, orthogonal features could be developed in loadable modules. They could conceivably be organized along the lines of the kernel's security modules, where only the modules that are wanted/needed/used are loaded at runtime -- OR such modules could be linked in at build time, limiting loadables to lesser-used features, or to none at all.

No, no one's ever been interested enough to even bother with it, because it is probably not worth doing.
----

One could look at history for the correlation between something not having been done and whether it was worth doing. Many people didn't think it was worth trying to circumnavigate the globe, because it wasn't worth building a ship you knew would fall off the edge of a flat world. Many thought flying was impossible or couldn't be done, as was breaking the sound barrier. History is filled with more examples of "undone things" proving useful once completed than not. The benefits of a particular course of action are usually not visible before the action is taken.

FWIW, I gave it a spin, and there seem to be several "__THROW" keywords. I'm not familiar with those in the C-standard.
That's because they're not in the C standard. That particular macro is defined by glibc (misc/sys/cdefs.h):

# define __THROW __attribute__ ((__nothrow__ __LEAF))

as an optimization hint for gcc, and as a no-op for other compilers.
-----

Wow... it was gcc that gave the errors (though with different switches than are normally used in the build).
But this is all open source - if you are questioning what the code means, rather than following the source and finding the answer yourself, then you're already facing an uphill battle at trying to rewrite the source.
-----

Especially in areas with non-standard APIs or formats. In this case, I was looking at the "low-hanging fruit": the keyword gcc claimed it didn't recognize, which was responsible for multiple errors. C++ has standard language features to control error propagation and allow related optimizations. It's a good example of where C++ would be more useful, in that it has standard features for areas where ad-hoc methods must be used in C.

With C++, those features could be looked up in a C++ language or library reference. Perhaps it might, at least, be possible to use the C++ compiler as a type of diagnostic -- one that could help clarify existing C code and make it more robust. For example, in C one can initialize a character array as is done in "base32.c": char b32str[32] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"; However, depending on how it is used, this can cause problems:

 Compiled with:

 > gcc -Wall -o b32 b32.c

-----

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
  /* 32 initializer characters exactly fill the array, so C silently
     drops the terminating NUL.  */
  static const char b32str[32]  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  const int bufsiz              = sizeof(b32str)*3/2;   /* 48 bytes */
  char * buff                   = calloc(bufsiz, 1);
  memset(buff, 'X', bufsiz-1);                 /* 47 X's; buff[47] stays 0 */
  strncpy(buff, b32str, sizeof(b32str) );      /* copies 32 chars, no NUL */
  printf("Len of b32str, '%s', is %d characters.\n",
    buff, (int) strlen(buff));
  free(buff);
  exit (0);
}

-----

 When run, it produces:

Len of b32str, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567XXXXXXXXXXXXXXX', is 47 characters.

 Under C, it produces no warnings, as C allows the NUL terminator to
be dropped from the initialization when the initializer exactly fills the
array -- 'valid', but a potential problem if the array is later treated as
a NUL-terminated string like its initializer.

 C++ generates errors:

> g++ -Wall -o b32 b32.c
b32.c: In function ‘int main(int, char**)’:
b32.c:7:33: error: initializer-string for array of chars is too long [-fpermissive]
   static const char b32str[32] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
                                  ^
b32.c:9:32: error: invalid conversion from ‘void*’ to ‘char*’ [-fpermissive]
   char * buff = calloc(bufsiz, 1);
                                ^

 So, it seems it might be of some benefit in promoting safer programming
practices (or not, as people often find ways to work around restrictions!)

;-)


