pspp-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Bug#760841: pspp: random testsuite failures


From: Ben Pfaff
Subject: Re: Bug#760841: pspp: random testsuite failures
Date: Tue, 9 Sep 2014 08:53:53 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Sep 08, 2014 at 02:49:46PM +0100, Steven Chamberlain wrote:
> retitle 760841 pspp: random testsuite failures
> thanks
> 
> Affects GNU/Linux too!  It's odd the issue has only been seen yet on
> kfreebsd rather than linux buildds;  I reproduced it on linux-amd64, and
> with similar ~2% probability:
> 
> > ~/pspp-0.8.3$ for i in $(seq 1 1000) ; do printf 
> > 'e\0n\0t\0r\0\351\0e\0\n\0' | tests/libpspp/u8-istream-test read - Auto | 
> > xxd ; done | sort | uniq -c | sort -bn
> >      21 0000000: 6500 6e00 7400 7200 c3a9 0065 000a 00    e.n.t.r....e...
> >     979 0000000: 656e 7472 c3a9 650a                      entr..e.
> 
> For sanity, I've checked that printf is working correctly:
> 
> > ~/pspp-0.8.3$ for i in $(seq 1 1000) ; do printf 
> > 'e\0n\0t\0r\0\351\0e\0\n\0' | xxd ; done | sort | uniq -c | sort -bn
> >    1000 0000000: 6500 6e00 7400 7200 e900 6500 0a00       e.n.t.r...e...
> 
> So it may be related to how the tests/libpspp/u8-istream-test program
> does its reads from stdin?  I'm also curious why the read is truncated
> only sometimes - possibly a glibc issue there?

Thanks for the analysis.

This is a PSPP bug in the library that u8-istream-test uses.  I pushed
the following fix to the PSPP upstream repository.  Friedrich, do you
want to apply that to the PSPP packaging?  I'll sponsor the upload for
you.

Thanks,

Ben.

--8<--------------------------cut here-------------------------->8--

From: Ben Pfaff <address@hidden>
Date: Tue, 9 Sep 2014 08:51:44 -0700
Subject: [PATCH] u8-istream: Fix handling of partial reads.

The u8-istream code did not retry upon a partial read, assuming that that
was the end of the file.  When the partial read was shorter than
ENCODING_GUESS_MIN, this could cause the encoding guesser, in turn, to
guess the wrong encoding (especially if the encoding was really UTF-16 and
the partial read was an odd number of bytes).

Reported at https://bugs.debian.org/760841.
Reported by Friedrich Beckmann and Steven Chamberlain.
---
 src/libpspp/u8-istream.c    | 20 ++++++++++++++------
 tests/libpspp/u8-istream.at |  4 +++-
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/src/libpspp/u8-istream.c b/src/libpspp/u8-istream.c
index 2486ca1..6347a67 100644
--- a/src/libpspp/u8-istream.c
+++ b/src/libpspp/u8-istream.c
@@ -1,5 +1,5 @@
 /* PSPP - a program for statistical analysis.
-   Copyright (C) 2010, 2011, 2012, 2013 Free Software Foundation, Inc.
+   Copyright (C) 2010, 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
 
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -203,14 +203,22 @@ fill_buffer (struct u8_istream *is)
   is->head = is->buffer;
 
   /* Read more input. */
+  n = 0;
   do
     {
-      n = read (is->fd, is->buffer + is->length,
-                U8_ISTREAM_BUFFER_SIZE - is->length);
+      ssize_t retval = read (is->fd, is->buffer + is->length,
+                             U8_ISTREAM_BUFFER_SIZE - is->length);
+      if (retval > 0)
+        {
+          n += retval;
+          is->length += retval;
+        }
+      else if (retval == 0)
+        return n;
+      else if (errno != EINTR)
+        return n > 0 ? n : -1;
     }
-  while (n < 0 && errno == EINTR);
-  if (n > 0)
-    is->length += n;
+  while (is->length < U8_ISTREAM_BUFFER_SIZE);
   return n;
 }
 
diff --git a/tests/libpspp/u8-istream.at b/tests/libpspp/u8-istream.at
index 24af08c..5b5b4e0 100644
--- a/tests/libpspp/u8-istream.at
+++ b/tests/libpspp/u8-istream.at
@@ -140,7 +140,9 @@ AT_CLEANUP
 AT_SETUP([read UTF-16 as Auto])
 AT_KEYWORDS([u8_istream])
 AT_CHECK([i18n-test supports_encodings UTF-16 UTF-16BE UTF-16LE])
-AT_CHECK([printf '\0e\0n\0t\0r\0\351\0e\0\n' | u8-istream-test read - Auto],
+dnl The "sleep 1" checks for a bug in which u8-istream did not properly
+dnl handle receiving data in multiple chunks.
+AT_CHECK([{ printf '\0e\0n\0t\0'; sleep 1; printf 'r\0\351\0e\0\n'; } | 
u8-istream-test read - Auto],
   [0], [entr??e
 ])
 AT_CHECK([printf 'e\0n\0t\0r\0\351\0e\0\n\0' | u8-istream-test read - Auto],
-- 
1.9.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]