[platform-testers] new snapshot available: grep-2.20.72-d512
From: Jim Meyering
Subject: [platform-testers] new snapshot available: grep-2.20.72-d512
Date: Wed, 29 Oct 2014 11:29:59 -0700
Thanks to many fixes and improvements by Paul Eggert and Norihiro Tanaka,
here is a pre-release snapshot:
grep snapshot:
http://meyering.net/grep/grep-ss.tar.xz 1.2 MB
http://meyering.net/grep/grep-ss.tar.xz.sig
http://meyering.net/grep/grep-2.20.72-d512.tar.xz
Here is the NEWS so far:
** Improvements
Performance has been greatly improved for searching files containing
holes, on platforms where lseek's SEEK_DATA flag works efficiently.
Performance has improved for rejecting data that cannot match even
the first part of a nontrivial pattern.
Performance has improved for very long strings in patterns.
If a file contains data improperly encoded for the current locale,
and this is discovered before any of the file's contents are output,
grep now treats the file as binary.
grep -P no longer reports an error and exits when given invalid UTF-8 data.
Instead, it considers the data to be non-matching.
** Bug fixes
grep no longer mishandles patterns that contain \w or \W in multibyte
locales.
grep would fail to count newlines internally when operating in non-UTF8
multibyte locales, leading it to print potentially many lines that did
not match. E.g., the command "seq 10 | env LC_ALL=zh_CN src/grep -n .."
would print this:
1:1
2
3
4
5
6
7
8
9
10
implying that the match, "10", was on line 1.
[bug introduced in grep-2.19]
grep in a non-UTF8 multibyte locale could mistakenly match in the middle
of a multibyte character when using a '^'-anchored alternate in a pattern,
leading it to print non-matching lines. [bug present since "the beginning"]
grep -E rejected unmatched ')', instead of treating it like '\)'.
[bug present since "the beginning"]
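The line-counting fix described above can be checked against a build of this snapshot with a small sketch like the following. It assumes a POSIX shell with seq available. The GREP variable and the check_line_count function are hypothetical names introduced here for illustration; set GREP to the freshly built binary (for example src/grep). The zh_CN locale name comes from the NEWS entry and may not be installed everywhere, in which case the run falls back to the C locale and the check says little about the multibyte code path.

```shell
# Hypothetical helper: GREP defaults to whichever grep is on PATH.
GREP=${GREP:-grep}

check_line_count() {
  # Of the numbers 1..10, only "10" is two characters long, so the
  # pattern ".." should select that line alone, labeled as line 10.
  out=$(seq 10 | env LC_ALL=zh_CN "$GREP" -n ..)
  [ "$out" = "10:10" ]
}
```

Running check_line_count returns a zero exit status when the output is the single line "10:10", and nonzero when extra, non-matching lines are printed as in the buggy output shown above.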
** Changes in behavior
The GREP_OPTIONS environment variable is now obsolescent, and grep
now warns if it is used. Please use an alias or script instead.
In locales with multibyte character encodings other than UTF-8,
grep -P now reports an error and exits instead of misbehaving.
When searching binary data, grep now may treat non-text bytes as
line terminators. This can boost performance significantly.
grep -z no longer automatically treats the byte '\200' as binary data.
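For anyone who relied on GREP_OPTIONS, the suggested alias or script might look like the sketch below. It uses a shell function rather than an alias because aliases are typically not expanded in non-interactive shells. The option --color=auto is only an assumed example value, not something this announcement prescribes; substitute whatever was previously stored in GREP_OPTIONS.

```shell
# Stands in for: export GREP_OPTIONS='--color=auto'
# (--color=auto is an assumed example value.)
unset GREP_OPTIONS

grep() {
  # "command" bypasses this function and runs the real grep,
  # passing the default option followed by the caller's arguments.
  command grep --color=auto "$@"
}
```

Placed in a shell startup file, this keeps the old defaults for interactive use while leaving scripts that invoke grep by its full path unaffected.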
====================================================
Changes in grep since v2.20:
Jim Meyering (13):
maint: post-release administrivia
build: don't redirect directly to $@
build: improve rule to generate egrep+fgrep scripts
maint: generate distributed THANKS from VC'd THANKS.in
doc: update HACKING
maint: split long lines, and enforce the 80-column limit
maint: avoid distcheck failure
tests: add expect-to-fail test for a glibc regexp bug
doc: move NEWS note about GREP_OPTIONS into proper section
maint: suppress a false-positive -Wcast-align warning
grep: avoid stack buffer read-underrun and overrun
tests: make new test script executable
gnulib: update to latest; bootstrap, too
Norihiro Tanaka (13):
dfa: speed-up at initial state
dfa: separate dfaexec function to help optimization by compiler
grep: fix subscript error when testing whether empty lines match
dfa: check end of input buffer after transition in non-UTF8
multibyte locale
dfa: factor out a new nontrivial block of duplicated code
dfa: test for just-fixed bug
dfa: fix a theoretical bug
grep: initialize validation_boundary properly before use
dfa: process all MBCSET constructs via glibc's matcher
dfa: remove two erroneous clauses from a now-unused function
tests: add test for grep -P fix
dfa: avoid false match in a non-UTF8 multibyte locale
dfa: make \w and \W work in multibyte locales
Paul Eggert (46):
build: update gnulib submodule to latest
grep: use system strstr if available and fast
grep: undo part of previous change
doc: use gnulib fdl module
maint: remove grep.spec
build: don't make output files read-only
build: avoid -Wstack-protector
grep: with -E, unmatched ')' matches itself
doc: Document -r vs --exclude more carefully.
doc: prefer @env to @code
doc: document LANGUAGE
grep: fix integer-width bugs in undossify_input etc.
grep: -P now treats invalid UTF-8 input as non-matching
grep: port recent fix to older pcre version
grep: fix false matches with -P '...$' and invalid UTF-8
grep: fix false matches with -P '...$' and invalid UTF-8
doc: bug tracker has moved to debbugs.gnu.org
grep: make GREP_OPTIONS obsolescent
grep: diagnose -P in non-UTF-8 multibyte locale
grep: remove/refactor unnecessary code about line splitting
grep: speed up -P on files containing many multibyte errors
grep: use bool for boolean in grep.c
grep: treat a file as binary if its prefix contains encoding errors
grep: improve performance for older glibc
grep: use mbclen cache more effectively
grep: avoid false alarms for mb_clen and to_uchar
grep: use mbclen cache in one more place
grep: port -P speedup to hosts lacking PCRE_STUDY_JIT_COMPILE
grep: fix -P speedup bug with empty match
grep: refactor binary-vs-unknown-vs-text flags for clarity
grep: -z no longer considers '\200' to be binary data
grep: non-text bytes in binary data may be treated as line ends
grep: minor -P speedup with jit_stack
grep: improve -P performance in typical cases
grep: skip past holes efficiently
grep: port to platforms lacking SEEK_DATA
grep: speed up processing of holes before EOF on Solaris
grep: scan for valid multibyte strings more quickly
grep: don't check extensively for invalid prefix bytes unless -P
maint: generalize the -Wcast-align fix
dfa: minor tweaks, mostly to remove __attribute__ ((noinline))
doc: clarify exit status
doc: modernize and simplify man page
grep: fix off-by-one bug in -P optimization
grep: fix grep -P crash
tests: work around older libpcre bugs when testing -P and UTF-8
Changes in gnulib since v2.20:
* gnulib 98ca2c0...8415b67 (95):
> socketlib, sockets, sys_socket: Use AC_REQUIRE to pacify autoconf.
> iconv: avoid false detection of non-working iconv
> bootstrap: print more diagnostics for missing programs
> bootstrap: only update the gnulib submodule
> symlinkat: port to AIX 7.1
> readlinkat: port to AIX 7.1
> remove spurious {
> modules/fcntl: fix error reporting by dupfd
> basename, dirname: Improve documentation.
> exclude: declare exclude_patopts static
> autoupdate
> dirname: support compilation with C++
> qsort_r: include <config.h>
> avltree-list: avoid compiler warnings
> qsort_r: new module, for GNU-style qsort_r
> strerror_r-posix: support compilation with C++
> fcntl-h: fix compilation with Intel C++ compiler
> autoupdate
> mountlist: use /proc/self/mountinfo when available
> users.txt: add cmogstored
> gnulib-tool: Sync with build-aux/bootstrap options
> gnulib-tool: Fallback to wget when rsync fails
> maintainer-makefile: add syntax check for useless ';;'
> pthread, pthread_sigmask, threadlib: port to Ubuntu 14.04
> error: drop spurious semicolon
> gnulib-common.m4: port to GCC 4.2.1 and Sun Studio 12 C++
> manywarnings: add GCC 4.9 warnings
> vasnprintf: fix bugs in width computation
> vasnprintf: Avoid signed/unsigned comparison warning.
> parse-datetime: Avoid signed/unsigned comparison warning
> qsort_r: new module, for GNU-style qsort_r
> vla: new module
> localename: make gl_locale_name_thread really thread-safe on Windows
> getpass: don't assume struct termios
> getdtablesize: fall back on sysconf (_SC_OPEN_MAX)
> vararrays: modernize AC_C_VARARRAYS for C11
> relocatable-prog-wrapper: port gettext to OS X 10.8 + GCC 4.8.1
> sys_select: fix FD_ZERO problem on Solaris 10
> accept: document Solaris 10 type glitch
> extern-inline: port to FreeBSD, DragonFly
> autoupdate
> Use consistent style to check DEBUG macro in regex_internal.c
> openat-die: use _Noreturn markup
> test-open: port to cygwin, which lacks Fortify
> localename: Enforce declarations before statements.
> test-userspec: don't look up numeric user names
> localcharset, localename: MS-Windows support for non-default locales
> announce-gen: avoid failure when Digest::SHA is installed
> gettext: revert "update macros to version 0.19"
> regex: don't deref NULL upon heap allocation failure
> maint.mk: give projects more flexibilty in set_prog_name arguments
> regex: fix memory leak in compiler
> announce-gen: avoid perl warnings
> localename: avoid -Wsuggest-attribute={const,pure} warnings
> nl_langinfo: Fix last change.
> Define macros for glibc
> Sync up error.c with glibc
> nl_langinfo: fix build under mingw
> mountlist: do not classify a bind-mounted dir entry as "dummy"
> maint.mk: less syntax-check noise when SIGPIPE is ignored
> nl_langinfo: CODESET on MS-Windows and more items from localeconv
> Bruno Haible has stepped down as maintainer.
> mktime: merge #if/#ifdef usage from glibc
> git-version-gen: improve option descriptions
> regex: fix memory leak in compiler
> regex: merge patch from libc
> acl: port to gcc -Wredundant-decls
> parse-duration: eliminate 68-year duration limit
> pthread: don't assume AC_CANONICAL_HOST, port better to Solaris, etc.
> pthread: define thread-safe macros on some platforms
> regex: don't be multithreaded if USE_UNLOCKED_IO.
> gettext: update macros to version 0.19
> select,poll: fix console handle check on windows 8
> select: fix waiting on anonymous pipes on MS-Windows
> times: fix to return non constant value on MS-Windows
> isatty: fix to work on windows 8
> maint: fix typo in fdl.texi
> mountlist: avoid hasmntopt const type warning on solaris
> maintainer-makefile: delete obsolete code
> maintainer-makefile: avoid spurious error messages
> rename: avoid unused-but-set-variable compiler warning
> maint: add ChangeLog entry missing in previous commit
> rename: mark a label as potentially unused
> gnulib-common.m4: Fix typo in _GL_UNUSED_LABEL.
> acl: apply pure attribute to two functions
> gnulib-common.m4: add _GL_UNUSED_LABEL
> dup2, fcntl, fcntl-h: port to AIX 7.1
> printf, config.rpath: Port to FreeBSD 10.
> ftoastr: work around compiler bug in IBM xlc 12.1
> valgrind-tests: fixed misleading help message
> isfinite, isinf, isnan tests: fix for little-endian PowerPC
> exclude-tests: port to AIX 7.1
> pthread_sigmask, timer-time: use gl_THREADLIB only if needed
> gnulib-tool: wget translations using --no-verbose rather than --quiet
> gnulib-tool: adjust translation wget to avoid a https redirection