bug-gnu-pspp

Re: PSPP-BUG: PSPP Version 079 - SPS with German Characters-Possible Bug


From: John Darrington
Subject: Re: PSPP-BUG: PSPP Version 079 - SPS with German Characters-Possible Bug
Date: Mon, 7 May 2012 08:12:40 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

This problem is now fixed in the git "master".

Thank you for reporting it.

J'


On Sun, May 06, 2012 at 12:46:32PM +0200, ajk-eis wrote:
     After further investigation here:
     
     First of all, for clarity: the problem only exists with the 0.7.9.xxx GUI
     (not the console) when loading an sps file that contains national
     characters.
     
     1. Portable files (por) generated from the console can't work, since the
     character set definition inside the file is ASCII, even though the file
     itself is Unicode or UTF-8 and still contains the correct national
     characters in the data.  Instead of the unsupported characters being
     replaced (with "?"), the rest of the value is apparently truncated.  The
     TYPE parameter is unsupported (see below).
     
     2. There is a WORKAROUND for the time being:
     Using the console, the sps file can be loaded without any problems, as
     John has indicated. Using the PSPP console command SAVE, a system file
     (sav) can be generated.  This system file (sav) can be loaded into the
     GUI without any problems, and all characters, values and labels are
     loaded and displayed correctly.
     
     It all seems strange, since the same sps file loads correctly into the
     GUI of 0.7.8.  I would also expect the 0.7.9 GUI to use the same internal
     modules to parse and load a file as the 0.7.9 console does.
     
     PS - According to the documentation for EXPORT (9.2), the /TYPE parameter
     is used to specify the character set of the por file, but this is not
     (yet) implemented. The documentation is a bit misleading on this point:
     the sample syntax shows /TYPE={COMM,TAPE}, which are devices, while the
     description indicates that this parameter could be used for the
     character set.
     
     Alle
     
     > -----Original Message-----
     > From: ajk-eis [mailto:address@hidden]
     > Sent: Sunday, 6 May 2012 00:48
     > To: 'John Darrington'
     > Cc: 'address@hidden'; 'jeff'
     > Subject: Re: PSPP-BUG: PSPP Version 079 - SPS with German Characters -
     > Possible Bug
     > 
     > OK, I tried the console here too.  Using the console the file is read in
     > correctly and can be exported (correctly) into a portable file (por).
     > 
     > I can successfully read the por file into the UI (meaning without errors
     > being written into the output window), HOWEVER all of the strings
     > (variable labels and values) are cut off at the first occurrence of a
     > national character!  So actually I have a lot of import errors that are
     > not being logged or shown in the UI output.
     > 
     > This means "ausgewählt" becomes "ausgew". "ueberflüssig" becomes
     > "ueberfl". And so on.
     > 
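The difference between the truncation reported here and the "?" replacement
one might expect can be illustrated with a minimal Python sketch. The function
names are hypothetical and this is not PSPP's portable-file code; it only
contrasts cutting a string at the first non-ASCII character with substituting
"?" for each such character.

```python
def truncate_at_first_non_ascii(text: str) -> str:
    """Cut the string at the first character outside ASCII (the reported behaviour)."""
    for index, char in enumerate(text):
        if ord(char) > 127:
            return text[:index]
    return text


def replace_non_ascii(text: str) -> str:
    """Substitute '?' for each character outside ASCII (the expected fallback)."""
    return text.encode("ascii", errors="replace").decode("ascii")


for label in ("ausgewählt", "ueberflüssig"):
    print(label, "->", truncate_at_first_non_ascii(label), "|", replace_non_ascii(label))
```

With replacement the label stays recognisable and keeps its length, whereas
truncation silently loses everything after the first national character.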
     > So this really seems to be the UI that can't handle the national
     > characters.  Weird.  I hope and assume that you are seeing the actual
     > characters there.  They are not showing up on the web site but are
     > instead replaced by "?" (ASCII instead of UTF-8).
     > 
     > I've included the por file that I created from the console.  Trouble is,
     > my users can and will not use the console ....  This is not an
     > "emergency".  We will be going live around July.
     > 
     > Alle
     > 
     > > -----Original Message-----
     > > From: John Darrington [mailto:address@hidden]
     > > Sent: Saturday, 5 May 2012 23:41
     > > To: ajk-eis
     > > Cc: 'John Darrington'; address@hidden; 'jeff'
     > > Subject: Re: PSPP-BUG: PSPP Version 079 - SPS with German Characters -
     > > Possible Bug
     > >
     > > It seems that there is indeed a bug.
     > >
     > > We'll look into it.
     > >
     > > J'
     > >
     > > On Sat, May 05, 2012 at 08:59:56PM +0200, ajk-eis wrote:
     > >      Thanks for bearing with us on the weekend!
     > >
     > >      >> that?).  I had to move the lines BEGIN DATA and all following
     > >      >> directly after DATA LIST.  However this shouldn't have caused
     > >      >> problems reading ...
     > >      Moving that section behind DATA LIST did not really make a big
     > >      difference, other than allowing the data to be read in before the
     > >      errors started occurring (see attachments). Also, the "Bad
     > >      character U+FFFD" does not show up at the moment.  However, we
     > >      have no problem reading files in with "our" ordering, as long as
     > >      there are no national characters.
     > >
     > >      >> certainly shouldn't cause a crash.
     > >      The last stable release, 2012-03-15, does not crash.  The RCs
     > >      03-20 and 04-20 did (on running the old file with the data at the
     > >      bottom AND national characters).
     > >
     > >      >> Anyway I suggest you make this correction and see how you go.
     > >      I just changed our generator to move the data up to the top. A new
     > >      sps file is included with a new Output PDF. All I use is the GUI
     > >      in Win 7 (no console). Since the data is at the top, it is read in
     > >      correctly (there are no national characters before it anymore). I
     > >      still get many errors, and in fact the displayed text in the
     > >      Output Window is NOT the same as in the sps file (look at the line
     > >      with the variable label q470c475 - it should read like the line
     > >      above, "Wie finden ....", but it doesn't). This is static text.
     > >      Each of the labels q470c472 through q470c476 begins with the exact
     > >      same string.  The problem occurs due to the word "häßlich" in line
     > >      q470c474, which has two national characters.
     > >
     > >      The problem continues at the line "value labels q470c475", which
     > >      apparently actually reads "val u e l abel s q470c475".  This to me
     > >      is amazing (you can't see it in the PDF, but copy it out into
     > >      something else), but again it is probably due to the national
     > >      characters in the previous lines?  In fact the "Nf" from the
     > >      variable label shows up again ("valNf").
     > >
     > >      Alle
     > >
     > >      > -----Original Message-----
     > >      > From: John Darrington [mailto:address@hidden]
     > >      > Sent: Saturday, 5 May 2012 19:15
     > >      > To: ajk-eis
     > >      > Cc: address@hidden
     > >      > Subject: Re: PSPP-BUG: PSPP Version 079 - SPS with German
     > >      > Characters - Possible Bug
     > >      >
     > >      > Thanks for sending these files, and thanks for the bug report.
     > >      >
     > >      > I'm not seeing the problems that you are seeing, although not
     > >      > everything is quite right.
     > >      >
     > >      > There is a small problem with your syntax:  You have the
     > >      > VARIABLE LABELS, MISSING VALUES and VALUE LABELS commands
     > >      > situated between the DATA LIST and BEGIN DATA commands.  I don't
     > >      > think this is allowed in PSPP (does SPSS allow that?).  I had to
     > >      > move the lines BEGIN DATA and all following directly after
     > >      > DATA LIST.  However this shouldn't have caused problems reading
     > >      > the file, and certainly shouldn't cause a crash.
     > >      >
     > >      > When I make this change, everything works fine using the
     > >      > terminal, although there is a problem with the GUI.
     > >      >
     > >      > Anyway I suggest you make this correction and see how you go.
     > >      >
     > >      > J'
     > >      >
     > >      >
     > >      > On Sat, May 05, 2012 at 05:43:29PM +0200, ajk-eis wrote:
     > >      >      Hello John, thanks for the quick reply and moving this to
     > >      >      bugs.
     > >      >
     > >      >      Attached are 4 files.
     > >      >      1 - umfrage-7-Errors.sps - This file is UTF-8 encoded and
     > >      >      contains German national characters (if this e-mail text is
     > >      >      Unicode - ä, ö, ü, ß) (U+00E4, U+00F6, U+00FC, U+00DF).
     > >      >      These are not all the possibilities but are enough to test
     > >      >      with. This file will not run here and produces consistent
     > >      >      import errors. Not contained, but also very popular in
     > >      >      Europe, is the Euro sign (€ - U+20AC), which will also
     > >      >      create the errors.
     > >      >
     > >      >      2 - PSPPIRE Output Viewer_7_Errors.pdf - This is the output
     > >      >      of PSPP with the file above (with national characters),
     > >      >      which is not imported.
     > >      >
     > >      >      3 - umfrage-7a-OK.sps - This file is exactly the same
     > >      >      generated file, but we have replaced the occurrences of the
     > >      >      4 characters above with the "denglish" equivalents (i.e.
     > >      >      ae, oe, ue, ss). IOW only ASCII characters are contained.
     > >      >      This file will run without errors.
     > >      >
     > >      >      4 - PSPPIRE Output Viewer_7a_OK.pdf - This is the output of
     > >      >      PSPP with the file in 3 above, without errors and
     > >      >      successfully imported.
     > >      >
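The replacement used to produce the ASCII-only file can be sketched in a few
lines of Python. This is an illustration only, not the reporter's actual
generator code; the names DENGLISH and to_ascii_german are hypothetical, and
the table covers just the four characters named in the report.

```python
# Replacement table for the four characters named in the report.
# Upper-case umlauts and the Euro sign are deliberately left out; how they
# should be handled is not covered by the report.
DENGLISH = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def to_ascii_german(text: str) -> str:
    """Return text with ä, ö, ü and ß replaced by their ASCII digraphs."""
    return text.translate(DENGLISH)


print(to_ascii_german("häßlich"))
print(to_ascii_german("ausgewählt"))
```

The substitution is lossy (an original "ae" and a replaced "ä" can no longer
be told apart), so it only serves as a stopgap until the files load unchanged.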
     > >      >      I hope that you can reproduce the problem there with
     > >      >      file 1.
     > >      >
     > >      >      Alle
     > >      >
     > >      >      > -----Original Message-----
     > >      >      > From: John Darrington [mailto:address@hidden]
     > >      >      > Sent: Saturday, 5 May 2012 14:58
     > >      >      > To: ajk-eis
     > >      >      > Cc: address@hidden; 'jeff'
     > >      >      > Subject: Re: PSPP Version 079 - SPS with German
     > >      >      > Characters - Possible Bug
     > >      >      >
     > >      >      > [Moving the thread to address@hidden]
     > >      >      >
     > >      >      > I wasn't aware of this.
     > >      >      >
     > >      >      > Please do send examples.
     > >      >      >
     > >      >      > J'
     > >      >      >
     > >      >      > On Sat, May 05, 2012 at 12:51:11PM +0200, ajk-eis wrote:
     > >      >      >      Installation - PSPP build pspp-079-20120315-32bits
     > >      >      >      OS - Win 7 Pro (up to date) and UBUNTU
     > >      >      >
     > >      >      >      I assume you already know this, but in case you
     > >      >      >      don't: we are not able to import / load any sps
     > >      >      >      file that contains any special German characters in
     > >      >      >      the values. The output is "bad character U+FFFD".
     > >      >      >      The same occurs with the Euro symbol ("€"). This
     > >      >      >      occurs on a Windows 7 installation and a LINUX
     > >      >      >      installation. These are valid UTF-8 files.  This
     > >      >      >      would appear to be a regression, since the import
     > >      >      >      of the same files works with 0.7.8.xxx.
     > >      >      >
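Whether a given file really is valid UTF-8, and which non-ASCII code points
it contains, can be checked independently of PSPP. The following Python
sketch is one way to do that; the function name and the sample syntax line
are made up for illustration and are not taken from the attached files.

```python
from collections import Counter


def non_ascii_code_points(raw: bytes) -> Counter:
    """Decode raw as strict UTF-8 and count each non-ASCII code point.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    text = raw.decode("utf-8", errors="strict")
    return Counter(f"U+{ord(char):04X}" for char in text if ord(char) > 127)


# Hypothetical syntax line standing in for the contents of an sps file.
sample = 'VARIABLE LABELS q1 "häßlich, überflüssig, 5 €".'.encode("utf-8")
for code_point, count in sorted(non_ascii_code_points(sample).items()):
    print(code_point, count)
```

For a real file, the bytes would come from open(path, "rb").read(). A file
saved as CP 1252 instead of UTF-8 makes the strict decode raise an error
rather than pass silently.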
     > >      >      >      We have tried creating the sps file with all
     > >      >      >      imaginable encodings (i.e. UTF-8, UTF-16, CP 1252
     > >      >      >      (which is standard for a local installation), etc.)
     > >      >      >      with no resolution of the problem. At the moment we
     > >      >      >      must use ASCII encoding, which has no national
     > >      >      >      characters, or replace all special German
     > >      >      >      characters prior to the import.
     > >      >      >
     > >      >      >      The RCs 2012-03-20 and 2012-04-11, installed on
     > >      >      >      Windows 7, crash when the import file is run.
     > >      >      >
     > >      >      >      In the PSPP version pspp-master-20111111
     > >      >      >      (0.7.8.xxx) the import works without problems, but
     > >      >      >      this version is not stable on a (German) Windows 7
     > >      >      >      installation.
     > >      >      >
     > >      >      >      I can supply example files if desired.
     > >      >      >
     > >      >      >      Cheers
     > >      >      >      Alle
     > >      >      >
     > >      >      > --
     > >      >      > PGP Public key ID: 1024D/2DE827B3
     > >      >      > fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
     > >      >      > See http://keys.gnupg.net or any PGP keyserver for public key.
     

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://keys.gnupg.net or any PGP keyserver for public key.
