coreutils
Re: "expr" won't match empty strings


From: Pádraig Brady
Subject: Re: "expr" won't match empty strings
Date: Sat, 02 Aug 2014 11:40:30 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 08/02/2014 06:03 AM, Luke Kendall wrote:
> I'm hesitant to report this, but I think it's an actual bug in expr that's 
> been there from day one.
> 
> I believe that expr, when used to match regular expressions, should use the 
> success/failure of the pattern match to determine the exit code.
> 
> But instead, I believe "expr" uses the length of the matched string to 
> determine its exit code.  So when the regexp correctly matches an empty 
> string, expr returns failure, despite the match.  Here's a simple example:
> 
> $ expr " " : "^ *$" && echo Matched.
> 1
> Matched.
> $ expr "" : "^ *$" && echo Matched.
> 0
> 
> And compare that to what sed and grep do:
> 
> 
> $ echo "" | sed -n 's/^,*$/& - yep/p'
>  - yep
> 
> $ printf "a\n\n" | grep '^$' && echo "A match."
> 
> A match.
> 
> I'd like to suggest that expr be changed to use the success/fail of the 
> pattern match to determine the exit status, as all the other unix tools do.
> 
> I don't think this alteration of semantics would break many existing scripts, 
> for two reasons:
> 1) It must be unusual to use regexps that can match an empty string, because 
> expr does not report a match for that corner case, so to correctly handle it, 
> the user must have had to add an explicit test for the input string being 
> empty: and this will still work (it's just that with the suggested change, 
> that extra code becomes redundant).
> 2) Based on my own experience, it's unusual to use expr ":" with patterns 
> that can match the empty string - it's taken me over 30 years to notice this 
> oddity!
> 
> If you think this would be a good change, but don't have time to do anything, 
> let me know and I'll have a go and submit a patch.

The exit status of expr is a common gotcha:

$ expr 2 - 1; echo $?
1
0

$ expr 2 - 2; echo $?
0
1


$ expr ' ' : '^ *$'; echo $?
1
0

$ expr '' : '^ *$'; echo $?
0
1

POSIX states that an exit status of 1 is used if "the expression evaluates to
null or zero".

In this case, even though the pattern matches, the expression evaluates to zero
(the number of characters matched), which is awkward but conforms to POSIX
(and agrees with Solaris and FreeBSD, FWIW).
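If a script needs "did the pattern match?" rather than "was the match length
non-zero?", one common workaround is to prefix both the string and the pattern
with a sentinel character, so that a successful match is never zero-length.
A minimal sketch follows; the function name matches is hypothetical, and it
assumes the pattern contains no \( \) group (with a group, expr prints the
captured text instead of a length):

```shell
# Hypothetical helper: succeed if the whole of $1 matches the BRE in $2,
# including the case where $1 is empty.
# Assumption: $2 contains no \( \) group.
matches() {
  # The leading "x" guarantees that any successful match has length >= 1,
  # so expr exits 0 on a match and 1 only when the pattern does not match.
  expr "x$1" : "x$2\$" >/dev/null
}

matches ''    ' *' && echo Matched.       # prints "Matched."
matches 'abc' ' *' || echo 'No match.'    # prints "No match."
```

The prefix also keeps expr from misparsing a string that happens to look
like one of its operators.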

I'm not sure we can change that, though, since it would essentially change
how the '*' in the expression is handled, and existing scripts may rely on an
empty match being reported as failure. Consider:

  printf '%s\n' 1 2 '' 3 |
  while read line; do
    expr "$line" : '^[0-9]*$' >/dev/null || break # at first blank line
    echo process "$line"
  done
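A script like that could be made independent of how expr treats a zero-length
match by testing for the blank line explicitly. The following is only a
sketch of that idea; the function name process_until_blank is hypothetical
and the loop is wrapped in a function purely so it can be reused:

```shell
# Hypothetical wrapper: read lines from stdin and process numeric lines
# until the first blank (or non-numeric) line.
process_until_blank() {
  while read -r line; do
    # Stop explicitly at the first blank line instead of relying on
    # expr exiting 1 for a zero-length match.
    [ -n "$line" ] || break
    # expr patterns are anchored at the start, so no leading ^ is used.
    expr "$line" : '[0-9]*$' >/dev/null || break
    echo process "$line"
  done
}

printf '%s\n' 1 2 '' 3 | process_until_blank
```

With the input above this prints "process 1" and "process 2" and then stops,
the same as the original loop does with current expr behaviour.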

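BTW, a leading ^ in the expression is redundant and non-portable: expr
patterns are implicitly anchored at the start of the string, and POSIX leaves
it unspecified whether ^ is special in that position.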
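The implicit anchoring can be seen without any ^ at all. In this small
sketch, match_len is a hypothetical helper that just prints the match length
and ignores expr's exit status:

```shell
# Hypothetical helper: print the length expr reports for string $1
# against pattern $2, ignoring the exit status (1 when the length is 0).
match_len() {
  expr "$1" : "$2" || :
}

match_len abc 'a'     # prints 1: 'a' matches at the start
match_len abc 'b'     # prints 0: 'b' is not at the start, so no match
match_len abc '.*b'   # prints 2: '.*' consumes the leading 'a'
```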

thanks,
Pádraig.


