Re: [Nmh-workers] outstanding patches

From: Robert Elz
Subject: Re: [Nmh-workers] outstanding patches
Date: Wed, 25 May 2005 12:52:44 +0700

    Date:        Tue, 24 May 2005 10:52:17 +0200
    From:        Oliver Kiddle <address@hidden>
    Message-ID:  <address@hidden>

  | The UNIX programming FAQ suggests that it is probably unwise to use
  | vfork() at all 

That's certainly true, at least unless you know what you're doing, and
the implementation is sane.

On the other hand the "small performance gain" from vfork() is
way understating its effects.  Even with COW (which the vfork()
detractors suggest "solves" the problem), vfork() done correctly
is (and must be) way more efficient than any variant of fork() can
possibly be.

Whether that's needed for any particular application is always
questionable, but pretending that a COW fork solves the problem that
vfork() was invented to handle just shows ignorance.   A spawn() call,
as Mike suggested, would handle most of it - except that it isn't easy
to come up with a good design for spawn() that is both rationally
possible to use, and really handles all that various applications need
(that is, if you can't use it in /bin/sh it isn't good enough, if you
can use it in /bin/sh it is probably too complex for almost anywhere else).
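
For reference, POSIX does define a spawn-style interface, posix_spawn()
and posix_spawnp().  The following is only a minimal sketch of the simple
case (run a program, wait for it), assuming a system that provides
<spawn.h>; spawn_and_wait() is a hypothetical helper name, and the sketch
says nothing about whether the interface is rich enough for /bin/sh.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;   /* the caller's environment, passed through unchanged */

/*
 * spawn_and_wait() is a hypothetical helper name, not anything in nmh.
 * Runs argv[0] (looked up via PATH) and returns its exit status,
 * or -1 if it could not be started or did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* No file actions and no attributes: the child inherits everything. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything done between fork() and exec() in the classic model (redirecting
descriptors, changing signal state) has to be described in advance through
the two NULL arguments here, which is where the design difficulty shows up.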

There were reasons for the original fork/exec split in unix, and they
haven't gone away over time (like most of the original unix design, it is all
very well thought out).   vfork() was clearly a hack, but as we're still
using it 25 years later, it has clearly been demonstrated to be a
useful, and perhaps even necessary, hack - and one that also had
quite a bit of thought behind its design.

I don't know linux at all, but from what I have seen about the problem
here, it looks as if linux doesn't really have vfork() at all, but
some abomination that they happen to call vfork() (of course, for linux,
that sounds pretty typical).

One of the *requirements* for vfork() - the thing that actually makes
the big difference - is that the parent process cannot continue until after
the child process has called exec() (one of the exec family) or exit().
By definition, that means that "errno smashing" cannot happen, as the
child half of the pair must be gone before the parent half returns
from the vfork() sys call - sys calls are always allowed to alter errno,
so the parent cannot be depending upon its value being unchanged around
a vfork() call (only that it will have a meaningful value when vfork()
returns -1).

Similarly, vfork() implemented correctly cannot possibly be used to imitate
threads, for the same reason - there's only ever one process runnable
with vfork(), never both parent and child (until the child has completely
replaced its image).

It sounds to me as if vfork() on linux has suffered the "we can do better
than that" mentality of much of linux, where "better" is based upon a
comparison by people who have no idea what they're doing better than.
To me, it sounds as if both processes continue running with shared data
space.   No wonder code has bugs, if it was written for real vfork() and
ends up on an abomination like that (also no wonder that people don't
see much of an improvement over fork() - a vfork() with two running
processes has to do almost all the same work that a COW fork() has to
do, unlike "real" vfork() that simply loans all the kernel data structs
of the parent process to the child, then takes them back again later).

  | So how about we change all the instances to fork()? If it causes
  | problems for some system then presumably we'll find out.

It should not.   vfork() is supposed to be fork() with many restrictions
on what happens in the child process.   If changing back to fork() hurts
in any way other than making the processes slower (even to the point of
unusability) then the application is broken and needs to be fixed.
The shared memory semantic of vfork() is a limitation to be aware of,
not a feature to be exploited - even in the early days, there was no
guarantee that vfork() would actually share memory - I used systems that
had a vfork() call, but which internally simply implemented it as fork().
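
A minimal sketch of what portable code can and cannot rely on, using plain
fork(): the child's write to a variable never reaches the parent, so the
result has to come back some supported way (here, the exit status).
fork_isolation_demo() and the values 1, 2 and 42 are made up for the
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * fork_isolation_demo() is a hypothetical name used only for this sketch.
 * Returns the parent's copy of `value` after the child has modified its
 * own copy, and stores the child's exit status in *child_status
 * (-1 if the fork or wait failed, or the child did not exit normally).
 */
int fork_isolation_demo(int *child_status)
{
    int value = 1;
    int status;
    pid_t pid = fork();

    *child_status = -1;
    if (pid < 0)
        return value;

    if (pid == 0) {
        value = 2;      /* changes the child's copy only */
        _exit(42);      /* the supported way to report back */
    }

    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        *child_status = WEXITSTATUS(status);

    return value;       /* still 1 in the parent after fork() */
}
```

With vfork() in place of fork() the same assignment might or might not be
visible in the parent, depending on the system, which is exactly why code
that depends on either outcome is broken.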

None of the (n)mh programs should be big enough, or fork often enough,
that the extra cost of fork() should matter enough for anyone to care
(not on any modern systems anyway), so switching to fork() everywhere
seems reasonable to me - at the very least, all of the calls
to vfork() should be audited to make sure that they are correct - a

        if (vfork() == 0)
                exec() , exit();

type sequence should be OK on any system.
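
Spelled out in full, that sequence might look like the sketch below.
run_program() is a hypothetical helper, not an nmh function, and two
details are assumptions beyond the outline above: the child uses _exit()
rather than exit(), so that it does not flush stdio buffers or run
atexit handlers belonging to the parent, and 127 is used as the
"exec failed" status by shell convention.

```c
#define _DEFAULT_SOURCE 1   /* assumption: glibc may hide vfork() in strict ISO C modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_program() is a hypothetical helper, not an nmh function.
 * Returns the program's exit status, 127 if exec failed in the child,
 * or -1 if vfork()/waitpid() failed or the child was killed.
 */
int run_program(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* errno is meaningful only here */

    if (pid == 0) {
        /* Child: nothing but exec, then _exit if exec returns. */
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: resumes only once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Replacing vfork() with fork() in this function changes nothing but the
cost, which is the test any audited call site should pass.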

Lastly, while I'm here, I'd like to second Mike's other comment.
Autoconf is an abomination.   There's no rational reason for nmh
to require anything like that, it isn't doing anything nearly that
system dependent, or shouldn't be.   Just write portable code.

