From: Ben Pfaff
Subject: [Pspp-commits] [SCM] GNU PSPP branch, master, updated. v0.6.1-1932-g9ade26c
Date: Sun, 20 Mar 2011 16:56:36 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU PSPP".

The branch, master has been updated
       via  9ade26c8349b4434008c46cf09bc7473ec743972 (commit)
       via  afdf3096926b561f4e6511c10fcf73fc6796b9d2 (commit)
       via  75a467ed2d32e1adb0c24cf89676cfb48845be98 (commit)
       via  d3e294c031bb767336435d2f0048994103fcd47a (commit)
       via  f3668539947d5baed813a4f8436d6cf36abeedd2 (commit)
       via  c69c407c02121e63bdadf6efe55e4211abd03ad2 (commit)
       via  1b3322acf30d531cefe3cdbf7287ec8cde601bcd (commit)
       via  9d1d71e732eeed85ca3002b264e1269cdd005a3f (commit)
       via  f5099c58d17e8f66a74a84918e688ef17936d392 (commit)
       via  6d89701ab597b810da249ff0e4e42423e869df66 (commit)
       via  9bbbfbc94aead4518e17eb6304451f6ad2ca2db2 (commit)
       via  530906aaa19f6c209ca008c8187f7f750a0b1283 (commit)
       via  086322fd8c85a303ba6f552950d6f057f2867add (commit)
       via  687c1acdbeecd7d0d7fdc4143d444e8b1563b532 (commit)
       via  417bac514fb3de900cb12689d8668d4d30a82e3f (commit)
       via  d8fdf0b4fa919e48397b438e9453d6b82215ff51 (commit)
       via  ca0a72e321421d02a1fd6df943425eff4bd1a257 (commit)
       via  510366c9d99de028f0322e3df01bc813ec60099b (commit)
      from  c831ad10d7e9d494e5e22ab30306057e81bc52cd (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 9ade26c8349b4434008c46cf09bc7473ec743972
Author: Ben Pfaff <address@hidden>
Date:   Sat Mar 19 17:05:47 2011 -0700

    lexer: Reimplement for better testability and internationalization.
    
    This commit reimplements PSPP lexical analysis from the ground up.
    From a PSPP user's perspective, this should make PSPP more reliable
    and make it easier to work with syntax files in non-ASCII encodings.
    See the changes to NEWS for more details.
    
    From a developer's perspective, the most visible change may be that
    strings within tokens are now always encoded in UTF-8, regardless of
    the syntax file's encoding.  Many of the changes in this commit are
    due to this, especially those to functions that check for valid
    identifiers: an identifier in UTF-8 is not necessarily the same length
    when encoded in the dictionary's encoding, but limits on identifier
    length must be enforced in the dictionary's encoding (otherwise it
    might not be possible to write out a valid system file, since the
    identifier might not fit in the fixed length fields in such files).
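
The point about identifier length can be illustrated with a short sketch. It is not PSPP code: the helper names and the 64-byte limit are assumptions made for the example. It shows only that the same identifier can occupy a different number of bytes in UTF-8 than in the dictionary's encoding, so the limit has to be checked after recoding.

```python
def encoded_length(identifier: str, dict_encoding: str) -> int:
    """Return the number of bytes `identifier` occupies in `dict_encoding`."""
    return len(identifier.encode(dict_encoding))


def fits_in_dictionary(identifier: str, dict_encoding: str,
                       max_bytes: int = 64) -> bool:
    """Check a length limit in the dictionary's encoding, not in UTF-8.

    max_bytes=64 is an illustrative limit assumed for this sketch.
    """
    try:
        return encoded_length(identifier, dict_encoding) <= max_bytes
    except UnicodeEncodeError:
        # The identifier cannot be represented in that encoding at all.
        return False


name = "größe"
print(encoded_length(name, "utf-8"))    # prints 7
print(encoded_length(name, "latin-1"))  # prints 5
```

With a hypothetical 5-byte limit, "größe" would be accepted for an ISO-8859-1 dictionary but rejected for a UTF-8 one, which is why the check has to use the dictionary's encoding.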
    
    Another important change is that, whereas before some special syntax
    had to be handled by the parser providing feedback to the lexer, now
    increasing the sophistication of the lexer has enabled all PSPP syntax
    to be analyzed into tokens.  This permitted some other improvements:
    
      - An arbitrary number of tokens of lookahead, up to the end of the
        current command, is now supported using lex_next_token() and
        related functions.
    
      - Before, some command implementations had a special attribute that
        meant that the top-level PSPP command parser would not consume the
        final token of the command name (because that token was not
        followed by tokenizable syntax).  This is no longer necessary and
        has been removed.
    
      - Before, each command implementation was responsible for ensuring
        that valid command syntax was not followed by trailing garbage,
        often by calling lex_end_of_command() as the last step of parsing.
        This is no longer necessary; the main command parser will ensure
        this for itself.
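
As a rough illustration of the first item, arbitrary lookahead bounded by the end of the current command can be built on a token buffer that is filled on demand. The class and method names below are hypothetical and do not reflect the signatures of lex_next_token() or the related functions; "." stands in for the end-of-command token.

```python
from collections import deque
from typing import Deque, Iterator

END = "."  # stand-in for the end-of-command token in this sketch


class LookaheadLexer:
    """Toy lexer front end with lookahead limited to the current command."""

    def __init__(self, tokens: Iterator[str]) -> None:
        self._source = tokens
        self._buffer: Deque[str] = deque()

    def _fill(self, n: int) -> None:
        # Buffer tokens until index n exists, but never read past END.
        while len(self._buffer) <= n:
            if self._buffer and self._buffer[-1] == END:
                break
            self._buffer.append(next(self._source, END))

    def peek(self, n: int = 0) -> str:
        """Return the token n positions ahead without consuming anything."""
        self._fill(n)
        # Beyond the end of the command, keep reporting END.
        return self._buffer[n] if n < len(self._buffer) else END

    def get(self) -> str:
        """Consume and return the current token."""
        token = self.peek(0)
        self._buffer.popleft()
        return token
```

Because the buffer stops growing at the end-of-command token, a parser can peek as far ahead as it likes without accidentally reading tokens that belong to the next command.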

commit afdf3096926b561f4e6511c10fcf73fc6796b9d2
Author: Ben Pfaff <address@hidden>
Date:   Sat Mar 19 16:32:16 2011 -0700

    scan: New library for high-level PSPP syntax lexical analysis.
    
    This library converts a stream of segments output by the "segment"
    library into PSPP tokens.

commit 75a467ed2d32e1adb0c24cf89676cfb48845be98
Author: Ben Pfaff <address@hidden>
Date:   Sat Mar 19 16:30:55 2011 -0700

    segment: New library for low-level phase of lexical syntax analysis.
    
    This library provides for a low-level part of lexical analysis for
    PSPP syntax, which I call "segmentation".  Segmentation accepts a
    stream of UTF-8 bytes as input.  It outputs a label (a segment type)
    for each byte or contiguous sequence of bytes in the input.
    
    The following commit will implement the high-level phase of lexical
    analysis, called "scanning", that converts a sequence of segments into
    PSPP tokens.
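
The two-phase split can be sketched as follows. This is a toy model, not the "segment" or "scan" libraries themselves: the segment types, the regular expression, and the token shapes are assumptions made for the example. The first phase labels every character of the input, and the second turns the labeled segments into tokens.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Hypothetical segment types for this sketch; not PSPP's actual set.
SEGMENT_RE = re.compile(
    r"(?P<SPACES>\s+)"
    r"|(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<IDENTIFIER>[^\W\d]\w*)"
    r"|(?P<QUOTED_STRING>'(?:[^']|'')*')"
    r"|(?P<PUNCT>.)"
)


def segment(text: str) -> Iterator[Tuple[str, str]]:
    """Low-level phase: label each contiguous run of input with a type."""
    for match in SEGMENT_RE.finditer(text):
        yield match.lastgroup, match.group()


def scan(segments: Iterable[Tuple[str, str]]) -> List[Tuple[str, object]]:
    """High-level phase: convert segments into tokens, dropping white space."""
    tokens: List[Tuple[str, object]] = []
    for kind, text in segments:
        if kind == "SPACES":
            continue
        if kind == "NUMBER":
            tokens.append(("NUMBER", float(text)))
        elif kind == "QUOTED_STRING":
            # Strip the quotes and undouble embedded quotes.
            tokens.append(("STRING", text[1:-1].replace("''", "'")))
        elif kind == "IDENTIFIER":
            tokens.append(("ID", text))
        else:
            tokens.append(("PUNCT", text))
    return tokens
```

In this sketch the segmenter never discards input, so concatenating the segment texts reproduces the original string. That property makes the low-level phase easy to test on its own.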

commit d3e294c031bb767336435d2f0048994103fcd47a
Author: Ben Pfaff <address@hidden>
Date:   Sat Mar 19 16:34:53 2011 -0700

    u8-istream: New library for reading a text file and recoding to UTF-8.
    
    This new library will be used in an upcoming commit.

commit f3668539947d5baed813a4f8436d6cf36abeedd2
Author: Ben Pfaff <address@hidden>
Date:   Sun Mar 20 09:43:42 2011 -0700

    encoding-guesser: New library to guess the encoding of a text file.
    
    This will be used by other new libraries in upcoming commits.

commit c69c407c02121e63bdadf6efe55e4211abd03ad2
Author: Ben Pfaff <address@hidden>
Date:   Sat Mar 19 16:20:44 2011 -0700

    i18n: New functions and data structure for obtaining encoding info.
    
    For now these functions don't do any caching, but it might make sense
    to add caching later if they are called frequently.

commit 1b3322acf30d531cefe3cdbf7287ec8cde601bcd
Author: Ben Pfaff <address@hidden>
Date:   Sat Mar 19 14:40:11 2011 -0700

    identifier: Rename token_type_to_string() and make a new version.

commit 9d1d71e732eeed85ca3002b264e1269cdd005a3f
Author: Ben Pfaff <address@hidden>
Date:   Sun Feb 13 10:43:57 2011 -0800

    i18n: New functions for truncating strings in an arbitrary encoding.

commit f5099c58d17e8f66a74a84918e688ef17936d392
Author: Ben Pfaff <address@hidden>
Date:   Sat Feb 12 16:37:10 2011 -0800

    i18n: New function recode_string_len().

commit 6d89701ab597b810da249ff0e4e42423e869df66
Author: Ben Pfaff <address@hidden>
Date:   Sat Dec 11 20:58:32 2010 -0800

    i18n: New function uc_name().

commit 9bbbfbc94aead4518e17eb6304451f6ad2ca2db2
Author: Ben Pfaff <address@hidden>
Date:   Mon Dec 6 20:50:04 2010 -0800

    hash-functions: New function hash_case_bytes().
    
    This is useful for hashing an arbitrary byte sequence case-insensitively.
    Obviously most uses would be better off working with Unicode but we aren't
    there yet.
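
A minimal sketch of case-insensitive hashing over raw bytes follows. It is an assumption-laden illustration rather than the hash_case_bytes() implementation: the function name, the choice of 32-bit FNV-1a, and the ASCII-only case folding are all choices made for the example.

```python
def hash_case_bytes_sketch(data: bytes, basis: int = 0x811C9DC5) -> int:
    """Hash `data` so that ASCII letters compare equal regardless of case.

    Uses 32-bit FNV-1a as a stand-in hash; only bytes 'a'..'z' are folded.
    """
    h = basis
    for byte in data:
        if 0x61 <= byte <= 0x7A:      # ASCII 'a'..'z'
            byte -= 0x20              # fold to 'A'..'Z'
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


print(hash_case_bytes_sketch(b"Weight") == hash_case_bytes_sketch(b"WEIGHT"))  # prints True
```

Folding bytes this way knows nothing about non-ASCII letters: in ISO-8859-1, "é" and "É" are different bytes and hash differently here. That limitation is the reason Unicode-aware case folding would usually be preferable.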

commit 530906aaa19f6c209ca008c8187f7f750a0b1283
Author: Ben Pfaff <address@hidden>
Date:   Wed Mar 9 22:21:11 2011 -0800

    str: New functions for checking for and removing string suffixes.

commit 086322fd8c85a303ba6f552950d6f057f2867add
Author: Ben Pfaff <address@hidden>
Date:   Wed Mar 9 22:10:48 2011 -0800

    str: Rename ss_chomp() to ss_chomp_byte(), ds_chomp() to ds_chomp_byte().
    
    This paves the way for new functions that chomp an entire substring.

commit 687c1acdbeecd7d0d7fdc4143d444e8b1563b532
Author: Ben Pfaff <address@hidden>
Date:   Mon Dec 6 20:46:56 2010 -0800

    str: New function ss_realloc().

commit 417bac514fb3de900cb12689d8668d4d30a82e3f
Author: Ben Pfaff <address@hidden>
Date:   Mon Dec 6 20:54:40 2010 -0800

    output: New function text_item_create_nocopy().

commit d8fdf0b4fa919e48397b438e9453d6b82215ff51
Author: Ben Pfaff <address@hidden>
Date:   Sat Feb 5 21:10:10 2011 -0800

    sys-file-reader: Refactor to clean up character encoding support.
    
    The system file format is unusual in that it does not record the encoding
    used by character strings at the beginning or at any fixed place in the
    file.  Instead, it can be recorded practically anywhere in the file.  It
    never precedes all of the actual character strings in the file, which makes
    it impossible to interpret those strings completely and correctly until it
    is encountered.
    
    Until now, the system file reader has dealt with this situation by
    stuffing uninterpreted character strings into data structures until the
    encoding is known, then at that point fetching out the character strings,
    reencoding them, and stuffing them back into the data structures.  This
    does work, but it has the disadvantage that all of the PSPP data
    structures have to tolerate character strings with unknown encoding.  In
    some cases this seems like an ugly situation.  For example, arbitrary
    variable names have to be supported, even though the syntax for variable
    names is circumscribed by the language, because the syntax rules for
    variable names cannot be completely and correctly applied to a string that
    is in an unknown encoding.
    
    This commit fixes that problem by adopting a new way to read system files.
    Each record in the system file dictionary is essentially slurped into
    memory as a chunk, then the character encoding is extracted from it, then
    the rest of the dictionary is interpreted based on that encoding.  The
    actual implementation is a little more intricate because the format of
    system file records is somewhat non-uniform.
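
The new reading strategy can be sketched roughly as below. The record representation, the record kind names, and the default encoding are hypothetical and do not describe the real system file layout; the sketch shows only the order of operations: slurp everything, find the encoding, then interpret.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, bytes]  # (record kind, raw payload); hypothetical shape


def read_dictionary(records: Iterable[Record],
                    default_encoding: str = "ascii"
                    ) -> Tuple[str, List[Tuple[str, str]]]:
    """Decode dictionary records only after the encoding is known."""
    # Step 1: slurp every record into memory without interpreting strings.
    raw = list(records)

    # Step 2: locate the encoding, which may appear anywhere in the file.
    encoding = default_encoding
    for kind, payload in raw:
        if kind == "encoding":
            encoding = payload.decode("ascii")

    # Step 3: interpret the remaining records using that encoding.
    decoded = [(kind, payload.decode(encoding))
               for kind, payload in raw if kind != "encoding"]
    return encoding, decoded
```

Because no string is interpreted before the encoding is known, nothing downstream has to tolerate text in an unknown encoding, and rules such as variable name syntax can be applied to properly decoded strings.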

commit ca0a72e321421d02a1fd6df943425eff4bd1a257
Author: Ben Pfaff <address@hidden>
Date:   Wed Mar 16 21:33:54 2011 -0700

    file-name: Do not make output files line-buffered in fn_open().
    
    I don't see any reason to do this.  I can't see anything in the commit
    log for this file or in OChangeLog that explains why it was done.

commit 510366c9d99de028f0322e3df01bc813ec60099b
Author: Ben Pfaff <address@hidden>
Date:   Mon Mar 14 18:19:23 2011 -0700

    data-reader: Remove unreachable "return" statements.

-----------------------------------------------------------------------

Summary of changes:
 NEWS                                               |   46 +-
 Smake                                              |    8 +-
 doc/dev/concepts.texi                              |   25 +-
 doc/flow-control.texi                              |   33 +-
 doc/invoking.texi                                  |   20 +-
 doc/language.texi                                  |   96 +-
 doc/utilities.texi                                 |   69 +-
 perl-module/PSPP.xs                                |   14 +-
 perl-module/t/Pspp.t                               |    4 +-
 src/data/automake.mk                               |    1 +
 src/data/dictionary.c                              |  174 +-
 src/data/dictionary.h                              |   14 +-
 src/data/file-handle-def.c                         |    8 +-
 src/data/file-name.c                               |   11 +-
 src/data/gnumeric-reader.h                         |    8 +-
 src/data/identifier.c                              |  124 +-
 src/data/identifier.h                              |   12 +-
 src/data/identifier2.c                             |  133 ++
 src/data/mrset.c                                   |   33 +-
 src/data/mrset.h                                   |    7 +-
 src/data/por-file-reader.c                         |   25 +-
 src/data/por-file-writer.c                         |    5 +-
 src/data/procedure.c                               |   19 +
 src/data/procedure.h                               |    4 +-
 src/data/sys-file-reader.c                         | 2084 +++++++++++---------
 src/data/sys-file-reader.h                         |    2 +-
 src/data/sys-file-writer.c                         |   20 +-
 src/data/variable.c                                |  139 +-
 src/data/variable.h                                |    7 +-
 src/data/vector.c                                  |   15 +-
 src/data/vector.h                                  |    4 +-
 src/language/automake.mk                           |    6 -
 src/language/command.c                             |  158 +-
 src/language/command.def                           |   11 +-
 src/language/control/automake.mk                   |    3 +-
 src/language/control/do-if.c                       |   12 +-
 src/language/control/loop.c                        |    4 +-
 src/language/control/repeat.c                      |  714 +++-----
 src/language/control/temporary.c                   |    6 +-
 src/language/data-io/combine-files.c               |   21 +-
 src/language/data-io/data-list.c                   |    5 +-
 src/language/data-io/data-parser.c                 |    9 +-
 src/language/data-io/data-reader.c                 |   58 +-
 src/language/data-io/file-handle.q                 |   29 +-
 src/language/data-io/get-data.c                    |    8 +-
 src/language/data-io/inpt-pgm.c                    |   30 +-
 src/language/data-io/save-translate.c              |    2 +
 src/language/data-io/trim.c                        |    5 +-
 src/language/dictionary/apply-dictionary.c         |   11 +-
 src/language/dictionary/attributes.c               |   69 +-
 src/language/dictionary/missing-values.c           |   44 +-
 src/language/dictionary/modify-variables.c         |    4 +-
 src/language/dictionary/mrsets.c                   |   27 +-
 src/language/dictionary/numeric.c                  |   12 +-
 src/language/dictionary/rename-variables.c         |    3 +-
 src/language/dictionary/split-file.c               |    4 +-
 src/language/dictionary/sys-file-info.c            |   18 +-
 src/language/dictionary/value-labels.c             |   22 +-
 src/language/dictionary/variable-label.c           |   17 +-
 src/language/dictionary/vector.c                   |   13 +-
 src/language/dictionary/weight.c                   |    4 +-
 src/language/expressions/parse.c                   |   34 +-
 src/language/expressions/private.h                 |    5 +-
 src/language/lexer/automake.mk                     |    8 +
 src/language/lexer/include-path.c                  |   89 +
 .../temp-file.h => language/lexer/include-path.h}  |   16 +-
 src/language/lexer/lexer.c                         | 2143 +++++++++++---------
 src/language/lexer/lexer.h                         |  158 ++-
 src/language/lexer/q2c.c                           |    6 +-
 src/language/lexer/scan.c                          |  596 ++++++
 src/language/lexer/scan.h                          |   93 +
 src/language/lexer/segment.c                       | 1631 +++++++++++++++
 src/language/lexer/segment.h                       |  122 ++
 src/language/lexer/token.c                         |  173 ++
 src/language/{stats/friedman.h => lexer/token.h}   |   40 +-
 src/language/lexer/value-parser.c                  |    2 +-
 src/language/lexer/variable-parser.c               |   17 +-
 src/language/lexer/variable-parser.h               |   10 +-
 src/language/prompt.c                              |   75 -
 src/language/stats/aggregate.c                     |   23 +-
 src/language/stats/autorecode.c                    |    3 +-
 src/language/stats/descriptives.c                  |   23 +-
 src/language/stats/flip.c                          |    8 +-
 src/language/stats/frequencies.q                   |   25 +-
 src/language/stats/npar.c                          |   77 +-
 src/language/stats/rank.q                          |   12 +-
 src/language/stats/sort-cases.c                    |    2 +-
 src/language/syntax-file.c                         |  144 --
 src/language/syntax-file.h                         |   25 -
 src/language/syntax-string-source.c                |  151 --
 src/language/syntax-string-source.h                |   33 -
 src/language/tests/format-guesser-test.c           |    2 +-
 src/language/tests/moments-test.c                  |    2 +-
 src/language/tests/paper-size.c                    |    2 +-
 src/language/utilities/cache.c                     |    4 +-
 src/language/utilities/cd.c                        |    8 +-
 src/language/utilities/date.c                      |    4 +-
 src/language/utilities/host.c                      |   14 +-
 src/language/utilities/include.c                   |  198 +-
 src/language/utilities/permissions.c               |   13 +-
 src/language/utilities/set.q                       |   13 +-
 src/language/utilities/title.c                     |   92 +-
 src/language/xforms/compute.c                      |    6 +-
 src/language/xforms/count.c                        |   26 +-
 src/language/xforms/fail.c                         |    8 +-
 src/language/xforms/recode.c                       |   45 +-
 src/language/xforms/sample.c                       |    4 +-
 src/language/xforms/select-if.c                    |    2 +-
 src/libpspp/automake.mk                            |   10 +-
 src/libpspp/encoding-guesser.c                     |  289 +++
 src/libpspp/encoding-guesser.h                     |  126 ++
 src/libpspp/getl.c                                 |  271 ---
 src/libpspp/getl.h                                 |  113 -
 src/libpspp/hash-functions.c                       |   18 +-
 src/libpspp/hash-functions.h                       |    1 +
 src/libpspp/i18n.c                                 |  345 ++++-
 src/libpspp/i18n.h                                 |   87 +-
 src/libpspp/message.c                              |  118 +-
 src/libpspp/message.h                              |   23 +-
 src/libpspp/msg-locator.c                          |   87 -
 src/libpspp/msg-locator.h                          |   34 -
 .../utilities/cache.c => libpspp/prompt.c}         |   32 +-
 src/{language => libpspp}/prompt.h                 |   17 +-
 src/libpspp/str.c                                  |   54 +-
 src/libpspp/str.h                                  |   11 +-
 src/libpspp/u8-istream.c                           |  475 +++++
 src/libpspp/u8-istream.h                           |   45 +
 src/output/driver.c                                |   56 +-
 src/output/text-item.c                             |   12 +-
 src/output/text-item.h                             |    3 +-
 src/ui/gui/automake.mk                             |    2 -
 src/ui/gui/comments-dialog.c                       |   13 +-
 src/ui/gui/executor.c                              |   19 +-
 src/ui/gui/executor.h                              |    4 +-
 src/ui/gui/main.c                                  |   39 +-
 src/ui/gui/psppire-data-window.c                   |   16 +-
 src/ui/gui/psppire-dict.c                          |    9 +-
 src/ui/gui/psppire-syntax-window.c                 |   16 +-
 src/ui/gui/psppire-syntax-window.h                 |    1 -
 src/ui/gui/psppire-var-store.c                     |    7 +-
 src/ui/gui/psppire.c                               |   34 +-
 src/ui/gui/psppire.h                               |    6 +-
 src/ui/gui/syntax-editor-source.c                  |  130 --
 src/ui/gui/syntax-editor-source.h                  |   34 -
 src/ui/gui/text-data-import-dialog.c               |    4 +-
 src/ui/source-init-opts.c                          |   20 +-
 src/ui/source-init-opts.h                          |    4 +-
 src/ui/terminal/automake.mk                        |   13 +-
 src/ui/terminal/main.c                             |  113 +-
 src/ui/terminal/msg-ui.c                           |   41 -
 src/ui/terminal/msg-ui.h                           |   29 -
 src/ui/terminal/read-line.h                        |   31 -
 src/ui/terminal/terminal-opts.c                    |   58 +-
 src/ui/terminal/terminal-opts.h                    |    8 +-
 src/ui/terminal/{read-line.c => terminal-reader.c} |  310 ++--
 .../repeat.h => ui/terminal/terminal-reader.h}     |   11 +-
 tests/automake.mk                                  |   43 +
 tests/data/data-in.at                              |   64 +-
 tests/data/sys-file-reader.at                      |  295 +--
 tests/dissect-sysfile.c                            |    8 +-
 tests/language/control/do-repeat.at                |  101 +-
 tests/language/data-io/data-list.at                |    6 +-
 tests/language/data-io/get.at                      |    2 -
 tests/language/data-io/inpt-pgm.at                 |    6 +-
 tests/language/data-io/print.at                    |    2 +-
 tests/language/dictionary/missing-values.at        |    2 +-
 tests/language/dictionary/sys-file-info.at         |    4 +-
 tests/language/expressions/evaluate.at             |    9 +-
 tests/language/expressions/parse.at                |    2 +-
 tests/language/lexer/lexer.at                      |   43 +
 tests/language/lexer/q2c.at                        |    2 +-
 tests/language/lexer/scan-test.c                   |  217 ++
 tests/language/lexer/scan.at                       |  818 ++++++++
 tests/language/lexer/segment-test.c                |  318 +++
 tests/language/lexer/segment.at                    | 1070 ++++++++++
 tests/language/stats/aggregate.at                  |    4 +-
 tests/language/stats/rank.at                       |    6 +-
 tests/language/utilities/insert.at                 |   29 +-
 tests/libpspp/encoding-guesser-test.c              |  102 +
 tests/libpspp/encoding-guesser.at                  |  143 ++
 tests/libpspp/i18n-test.c                          |   68 +-
 tests/libpspp/i18n.at                              |  125 +-
 tests/libpspp/u8-istream-test.c                    |  126 ++
 tests/libpspp/u8-istream.at                        |  142 ++
 184 files changed, 12208 insertions(+), 5428 deletions(-)
 create mode 100644 src/data/identifier2.c
 create mode 100644 src/language/lexer/include-path.c
 copy src/{libpspp/temp-file.h => language/lexer/include-path.h} (71%)
 create mode 100644 src/language/lexer/scan.c
 create mode 100644 src/language/lexer/scan.h
 create mode 100644 src/language/lexer/segment.c
 create mode 100644 src/language/lexer/segment.h
 create mode 100644 src/language/lexer/token.c
 copy src/language/{stats/friedman.h => lexer/token.h} (51%)
 delete mode 100644 src/language/prompt.c
 delete mode 100644 src/language/syntax-file.c
 delete mode 100644 src/language/syntax-file.h
 delete mode 100644 src/language/syntax-string-source.c
 delete mode 100644 src/language/syntax-string-source.h
 create mode 100644 src/libpspp/encoding-guesser.c
 create mode 100644 src/libpspp/encoding-guesser.h
 delete mode 100644 src/libpspp/getl.c
 delete mode 100644 src/libpspp/getl.h
 delete mode 100644 src/libpspp/msg-locator.c
 delete mode 100644 src/libpspp/msg-locator.h
 copy src/{language/utilities/cache.c => libpspp/prompt.c} (63%)
 rename src/{language => libpspp}/prompt.h (69%)
 create mode 100644 src/libpspp/u8-istream.c
 create mode 100644 src/libpspp/u8-istream.h
 delete mode 100644 src/ui/gui/syntax-editor-source.c
 delete mode 100644 src/ui/gui/syntax-editor-source.h
 delete mode 100644 src/ui/terminal/msg-ui.c
 delete mode 100644 src/ui/terminal/msg-ui.h
 delete mode 100644 src/ui/terminal/read-line.h
 rename src/ui/terminal/{read-line.c => terminal-reader.c} (53%)
 rename src/{language/control/repeat.h => ui/terminal/terminal-reader.h} (77%)
 create mode 100644 tests/language/lexer/scan-test.c
 create mode 100644 tests/language/lexer/scan.at
 create mode 100644 tests/language/lexer/segment-test.c
 create mode 100644 tests/language/lexer/segment.at
 create mode 100644 tests/libpspp/encoding-guesser-test.c
 create mode 100644 tests/libpspp/encoding-guesser.at
 create mode 100644 tests/libpspp/u8-istream-test.c
 create mode 100644 tests/libpspp/u8-istream.at


hooks/post-receive
-- 
GNU PSPP


