Solaris sed limitations of * and \{M,N\}

From: Ralf Wildenhues
Subject: Solaris sed limitations of * and \{M,N\}
Date: Mon, 21 Sep 2009 23:17:26 +0200
User-agent: Mutt/1.5.20 (2009-08-09)

This originated from <>.

Solaris sed has more limitations.  See the following table, which
shows how Solaris /bin/sed, Solaris /usr/ucb/sed, and the non-ancient
rest of the sed world interpret quantifiers applied to subexpressions
and back-references.  Missing entries are "normal" output.

subexpressions:                           normal  /bin/sed /usr/ucb/sed
echo 'xxxy' | sed 's/\(x\)*y/z/g'         z       xxxy     xxxy
echo 'xxxy' | sed 's/\(x\)\{0,1\}y/z/g'   xxz     xxxy
echo 'xxxy' | sed 's/\(x\)\{0,\}y/z/g'    z       xxxy     xxz
echo 'xxxy' | sed 's/\(x\)\{1\}y/z/g'     xxz     xxxy

echo 'xxx1' | sed 's/\(x\)\1*/z/g'        z1
echo 'xxx1' | sed 's/\(x\)\1\{0,1\}/z/g'  zz1     xxx1     zzz1
echo 'xxx1' | sed 's/\(x\)\1\{0,\}/z/g'   z1      xxx1     zzz1
echo 'xxx1' | sed 's/\(x\)\1\{1\}/z/g'    zx1     xxx1     zzz1

Yay, wrt. back-refs it is better than we thought!
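
For anyone who wants to re-run the table on another system, here is a
minimal sketch.  The probe function and the SED_UNDER_TEST variable are
hypothetical names made up for this illustration; the inputs, sed
scripts, and expected values are copied from the "normal" column above.

```shell
#!/bin/sh
# SED_UNDER_TEST is an assumed variable name, not an established convention;
# it defaults to whichever "sed" comes first in PATH.
SED_UNDER_TEST=${SED_UNDER_TEST:-sed}

# probe INPUT SCRIPT NORMAL
# Feed INPUT to the sed under test with SCRIPT and compare the result
# against NORMAL, the value from the "normal" column of the table.
probe() {
  got=$(printf '%s\n' "$1" | "$SED_UNDER_TEST" "$2")
  if [ "$got" = "$3" ]; then
    printf 'ok %s\n' "$2"
  else
    printf 'DIFFERS %s (got %s, normal %s)\n' "$2" "$got" "$3"
  fi
}

# Quantifiers applied to subexpressions.
probe xxxy 's/\(x\)*y/z/g'        z
probe xxxy 's/\(x\)\{0,1\}y/z/g'  xxz
probe xxxy 's/\(x\)\{0,\}y/z/g'   z
probe xxxy 's/\(x\)\{1\}y/z/g'    xxz

# Quantifiers applied to back-references.
probe xxx1 's/\(x\)\1*/z/g'        z1
probe xxx1 's/\(x\)\1\{0,1\}/z/g'  zz1
probe xxx1 's/\(x\)\1\{0,\}/z/g'   z1
probe xxx1 's/\(x\)\1\{1\}/z/g'    zx1
```

Setting SED_UNDER_TEST=/usr/ucb/sed (or any other path) before running
the script selects a different implementation; every line reported as
DIFFERS corresponds to a non-"normal" table entry.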

I don't think /usr/ucb/sed needs any further mention, since nobody
should have that early in their PATH, but we ought to clarify the
/bin/sed oddities IMHO.  (I've tried IRIX, HP-UX, AIX, Tru64, and BSD
sed, none of which had any of the above bugs.)  OK to apply?
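
As a rough illustration of what the restriction means in practice:
when the subexpression is a single character and no back-reference to
it is needed, the grouping can simply be dropped, so that the
quantifier follows a one-character expression.  This is only a sketch
based on the wording of the patch below; I have not re-run these
rewritten forms on Solaris.

```shell
#!/bin/sh
# Sketch: rewrites of three subexpression probes from the table, with
# the \(...\) grouping removed so the quantifier follows a single character.
# Assumption (taken from the patch text, not re-tested on Solaris): '*' and
# \{M,N\} keep their special role after one-character expressions.
echo 'xxxy' | sed 's/x*y/z/g'        # in place of s/\(x\)*y/z/g
echo 'xxxy' | sed 's/x\{0,1\}y/z/g'  # in place of s/\(x\)\{0,1\}y/z/g
echo 'xxxy' | sed 's/x\{0,\}y/z/g'   # in place of s/\(x\)\{0,\}y/z/g
```

On a sed without the limitation, each rewritten command should give the
same result as the grouped form it replaces.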


   Clarify documentation about Solaris sed quantifier restriction.

   * doc/autoconf.texi (Limitations of Usual Tools) <sed>: '*' does
   not work after subexpressions, \{M,N\} only after one-character
   expressions.  From GCC PR 38923.

diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 9ab866e..e0e4068 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -17993,9 +17994,11 @@ Limitations of Usual Tools
 quite portable to current hosts, but was not supported by some ancient
 @command{sed} implementations like SVR3.
-Some @command{sed} implementations, e.g., Solaris,
-restrict the special role of the asterisk to one-character regular expressions.
-This may lead to unexpected behavior:
+Some @command{sed} implementations, e.g., Solaris, restrict the special
+role of the asterisk @samp{*} to one-character regular expressions and
+back-references, and the special role of interval expressions
+@samp{\@{@var{m}\@}}, @samp{\@{@var{m},\@}}, or @samp{\@{@var{m},@var{n}\@}}
+to one-character regular expressions.  This may lead to unexpected behavior:
 $ @kbd{echo '1*23*4' | /usr/bin/sed 's/\(.\)*/x/g'}
