bug-gnulib
Re: choice of implementation language


From: Micah Cowan
Subject: Re: choice of implementation language
Date: Tue, 06 Jan 2009 20:22:45 -0800
User-agent: Thunderbird 2.0.0.18 (X11/20081125)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bruno Haible wrote:
> If gnulib-tool was to be rewritten in another programming language than
> shell + sed, what would be the good choices?
> 
> The foremost criterion IMO should be maintainability, i.e. the ability for
> us and for new contributors to gnulib to master this programming language.
> To get an estimate of this, there are various sources of information.
> 
> 1) We can look at the number of developers who master one language or the
>    other. This matters because we cannot force or expect gnulib contributors
>    to learn a new programming language, just for gnulib-tool.
> 
>    I compared C, C++, Java, shell-script, Python, Perl in ohloh:
>    <http://www.ohloh.net/languages/compare?commit=Update&l0=c&l1=java&l2=perl&l3=python&l4=shell&l5=cpp&l6=-1&measure=contributors&percent=>
>    The result is the following ordered list:
>      1. C
>      2. Java
>      3. C++
>      4. Python
>      5. perl
>    The comparison by number of projects rather than by number of developers
>    <http://www.ohloh.net/languages/compare?commit=Update&l0=c&l1=java&l2=perl&l3=python&l4=shell&l5=cpp&l6=-1&measure=projects&percent=>
>    yields the same result.

I don't find the number-of-projects metric nearly as useful as
number-of-developers, and I don't think the one is an accurate
stand-in for the other.

I suspect that these results represent the GNU and Free Software
development communities quite poorly. Based on my own experience with
the Free Software developers I've met, I would definitely place Java
and C++ much lower down the list.

I suspect Perl and Python have similarly-sized communities; Python seems
to be growing steadily, but Perl seems to have more history with GNU
Software in particular.

> 2) We can also look at the level of familiarity of the current gnulib-tool
>    maintainers with these languages. Among us recent contributors to
>    gnulib-tool (Eric, Jim, Ralf, Simon, and me) two of us have made public
>    their skills:
>    <http://savannah.gnu.org/people/resume.php?user_id=1389>
>    <http://savannah.gnu.org/people/resume.php?user_id=1871>
>    making up for:
>      C      - 2 x master/expert
>      Java   - 2 x master/expert
>      C++    - 1 x master/expert, 1 x good knowledge
>      Python - 1 x base knowledge
>      perl   - 1 x base knowledge
> 
>    Also, I know that a few years ago Paul did not know C++ and was not
>    inclined to learn it.
> 
>    So according to this criterion, only C and Java remain possibilities.
>    Python and perl have to be excluded because too few of us are skilled
>    in these languages.

That follows only if one accepts that mastery of the language, as
opposed to base knowledge, is necessary. That is absolutely true for C
and C++; but in the case of Python, there isn't really a whole lot to
know IME.

> 3) Long-term maintainability requires some degree of standardization, so
>    that the amount of expected future changes in the language and its runtime
>    library is small. This speaks in favour of C, Java, C++, and against
>    Python and perl.

I disagree wrt Python. Python has fairly thorough language and module
specifications, and has multiple implementations in use. In addition, a
good level of future-"resistance" and backwards compatibility is
maintained, through things such as the "__future__" module.

And standardization seems to be a poor indicator of portability or of
how much a language will change. For instance, shell code is of course
standardized, and yet the existing standards often poorly represent
real-world implementations. Likewise, the latest C standard (C99) has
been around for about a decade now, and yet there are only one or two
implementations that conform well to it (gcc not being one of them,
though it implements most of the features I tend to care about).
Conversely, Perl doesn't really have a "standard", and yet old code has
continued to be portable to newer language versions (this won't be true
in Perl 6, but somehow I don't think that'll matter much).

> 4) For comparing simply the syntactic complexity of the languages (yes this
>    is only a small facet of maintainability, but nevertheless), one can take
>    the amount of code needed for writing a superficial parser. Such parsers
>    are implemented in gettext/gettext-tools/src/x-*.c, and x-perl.c is more
>    than twice as large as the other parsers. This indicates that also for a
>    human developer, perl syntax is harder to grok than the syntax of other
>    programming and scripting languages.

Again, I think that's a poor measurement. What is easy for humans to
write is often complex to parse. The ideal way to write software would
be to provide a set of text instructions, in English. That would
obviously require a very large amount of parser logic.

Also, I'm guessing those gettext extractors are lexers, and not grammar
parsers, as AFAICT gettext shouldn't need grammars, so their size is not
the whole story.
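
To illustrate the distinction, here is a minimal sketch of a lexer-only
extractor for C-like input: it splits the text into tokens and looks for
the token sequence keyword, "(", string literal, without building any
grammar. This is my own toy, not how gettext's x-*.c files are written;
the names TOKEN and extract_messages are invented for the example.

```python
import re

# Token classes for a tiny C-like lexer. Order matters: comments and
# strings are tried first so that their contents are never tokenized.
TOKEN = re.compile(r'''
    (?P<comment>/\*.*?\*/)            # C block comment
  | (?P<string>"(?:\\.|[^"\\])*")     # double-quoted string literal
  | (?P<ident>[A-Za-z_]\w*)           # identifier or keyword
  | (?P<punct>[^\s\w])                # any other single character
''', re.VERBOSE | re.DOTALL)


def extract_messages(source, keyword="_"):
    """Return string literals that directly follow `keyword(`.

    Escape sequences inside the literal are left undecoded.
    """
    tokens = [(m.lastgroup, m.group()) for m in TOKEN.finditer(source)
              if m.lastgroup != "comment"]
    messages = []
    for i in range(len(tokens) - 2):
        if (tokens[i] == ("ident", keyword)
                and tokens[i + 1] == ("punct", "(")
                and tokens[i + 2][0] == "string"):
            messages.append(tokens[i + 2][1][1:-1])  # strip the quotes
    return messages


SOURCE = 'printf(_("hello"), x); /* _("skipped") */ puts("raw");'
print(extract_messages(SOURCE))  # prints ['hello']
```

Nothing here knows about expressions or statements, which is why the
size of such a scanner says little about how hard the full language is
for a person to read.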

In this case, I'd agree that Perl's is the most complex syntax; however,
syntax is a very small part of a language's complexity. Perl and Python
both offer quite a few high-level features that make the most common
programming tasks much, much easier. By contrast, C's standard libraries
are extremely low-level, and must be supplemented a great deal. Of
course, we have a lot of handy utilities in gnulib itself to mitigate
that a good deal...

Still, I'd say Python and Perl come to the fore in terms of
ease-of-implementation, followed by Java, and then C and C++ distantly.
In terms of how easy it would be for future maintainers to maintain
someone else's code, Python would probably come first, followed by Java,
C and C++. Perl can be written so that it's easier to maintain than
easily-maintained C; it can also be written so that it's more difficult.
Unless care is taken, it's probably more difficult in general.

I'm somewhat surprised that "suitability to the task" was overlooked. I
definitely agree with those who have suggested that a scripting
language makes a great deal of sense, particularly a scripting language
whose implementation is likely to already be present on the machine. I
suspect Perl may have a somewhat larger install base on Unixen than
Python, though AFAICT that gap is closing in newer installs.

If, as Mike suggests, string processing is a significant part of
gnulib-tool's task, then Perl again seems the winner here: despite the
fact that Java and Python have strong string-parsing _libraries_, Perl's
built-into-the-syntax regex and string-manipulation operators are a big win.
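
For comparison, here is a small sketch of the library style in Python,
using the standard "re" module to pull "Key: value" pairs out of some
text. The sample input and the names SAMPLE, FIELD and parse_fields are
made up for illustration and are not gnulib's actual file format. In
Perl the same match would be written inline with the =~ operator rather
than through a compiled pattern object.

```python
import re

# Hypothetical input, invented for this example only.
SAMPLE = """\
Name: example
Depends: alpha beta
# a comment line
License: LGPL
"""

# One word, a colon, optional whitespace, then the rest of the line.
FIELD = re.compile(r"^(\w+):\s*(.*)$")


def parse_fields(text):
    """Collect 'Key: value' lines into a dict, ignoring everything else."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            key, value = match.groups()
            fields[key] = value
    return fields


print(parse_fields(SAMPLE))
```

It works fine, but the pattern object, the explicit match call and the
group unpacking are all ceremony that Perl folds into the syntax.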

Obviously, the very biggest concern is what makes the most sense to
gnulib's current implementors/maintainers, followed by what will work
best for future ones. Ease of use for the user should be way up there,
too. I consider a compiled language, or even worse, a language with a
relatively poor install base on Unixen [Java], to have a pretty severe
impact on that (provided the alternatives are likely to already be
available on the system). But obviously no one can tell you what you're
most comfortable with.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklkLhQACgkQ7M8hyUobTrE6ygCZAd1M67rv31y+n7FCGWa7Y5ZH
w18AniTj5yvvDu/VNdIY0jKDq9imP9RB
=vYVO
-----END PGP SIGNATURE-----



