bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-gnulib] safe-read.[ch] (safe_read): what do you think?


From: Paul Eggert
Subject: Re: [Bug-gnulib] safe-read.[ch] (safe_read): what do you think?
Date: Thu, 21 Nov 2002 11:09:12 -0800

> From: Bruno Haible <address@hidden>
> Date: Thu, 21 Nov 2002 16:18:21 +0100 (CET)
> 
> For the read() system call, the code
> 
>      size_t nbytes = ...;
>      if (read (fd, buf, nbytes) != nbytes)
> 
> is indeed wrong, because if nbytes is > 2^31 and the read succeeds,
> read()'s return value will be
> 
>      (ssize_t) nbytes

We can't rely on that, since POSIX says that if nbytes is > SSIZE_MAX
then the result is implementation-defined.

I agree with Jim's change to safe-read.  I have been using a similar
function in diffutils called block_read, which was written in the
1980s (before safe_read, I think).  block_read also uses size_t rather
than ssize_t, and I've also found that this makes the code simpler and
cleaner.

Perhaps it would make sense to unify some of block_read and safe_read?
I just looked at both sets of code, and found the following differences:

 1. safe_read mishandles large reads on some hosts.  It attempts to
    read more than SSIZE_MAX bytes, which is a no-no as described
    above.  Also, it runs into a bug in Tru64 5.1, which can't read
    more than INT_MAX bytes at a time (and where INT_MAX < SSIZE_MAX).

 2. safe_read assumes that the invoker either has not trapped signals,
    or has arranged for them to be restarted if it is on a
    sufficiently-modern host.  Hence it need not test errno==EINTR
    when it is on a sufficiently-modern host.

 3. block_read is guaranteed to always return nbytes unless end-of-file
    or error; it retries if it gets a short read.  In contrast,
    safe_read returns whenever it successfully reads any bytes.

(1) is a no-brainer; it should be merged into safe-read.c.

(2) is merely an optimization.  Perhaps it could be conditionalized
(i.e., done only if config.h says to do it).  Or it could be left out
if it's too much of a pain.  I'm a big fan of SA_RESTART, and I think
all programs that trap signals should use SA_RESTART; but it's not
universal yet.

I found that (3) simplified the code in diffutils, so I don't want to
give up (3), and I'm willing to maintain a separate function to have (3).

I just looked at coreutils, and I think that most (but not all)
instances of safe_read could be replaced by block_read.  In some cases
(e.g. cat) this will improve performance a tad; in others (e.g. split)
it will simplify code (e.g. split's stdread is roughly equivalent to
block_read).

Perhaps we should add block_read to gnulib?  It could call safe-read
to do most of its work.  We would keep safe-read for the few places
where the code really needs partial reads.

The simplest would be to add new files block-read.h and block-read.c.
But perhaps we should establish a new .h file (readwrite.h?) to
contain the union of block-read.h, safe-read.h, and full-write.h?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]