bug#33793: sed bug with regular expressions

From: Eric Blake
Subject: bug#33793: sed bug with regular expressions
Date: Tue, 18 Dec 2018 12:23:16 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1

tag 33793 notabug

On 12/18/18 6:50 AM, Uladzimir Panasiuk wrote:
> Hi. I've found the bug using sed. There is how to reproduce:
> 1) Run bash
> 2) Exec command \
> echo weather -5.0 | sed

You used two range expressions in this regex, but the result is the same as if you had used this regex with only one range expression:


Either way, your regex matches every character except the 10 digits, a literal backslash, or a literal dot. Remember that inside [], the range \-\ runs from backslash to backslash, so it selects only the single character backslash; it does not add '-' to the list. Since '-' is therefore not among the characters that the negated [] expression leaves alone, sed correctly strips it.
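To see the range behavior in isolation, here is a minimal sketch, assuming a POSIX-style sed such as GNU sed. The function name strip_range and the input string are made up for illustration and are not taken from the report.

```shell
# Assumes a POSIX-style sed (for example GNU sed); the input is hypothetical.
# [\-\] is a range whose start and end points are both backslash,
# so it matches a backslash and nothing else.
strip_range() {
  sed 's/[\-\]//g'
}

# The backslash is removed; the '-' is left alone.
printf '%s\n' 'a\b-c' | strip_range
```

This should print ab-c: the '-' in the input survives because the bracket expression never listed it.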

> 3) You will get "5.0". Expected output is "-5.0"

You might be remembering the behavior of Perl regex, where \ inside [] is an escape character. But that's not how POSIX regex behaves: inside [], \ is a literal character, and there are no escape characters.
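A quick way to check that \ is literal inside [] is the sketch below, again assuming a POSIX-style sed such as GNU sed. The helper name strip_backslash_and_dot and the sample input are hypothetical.

```shell
# Assumes a POSIX-style sed (for example GNU sed); the input is hypothetical.
# [\.] lists two characters, a backslash and a dot.
# The backslash does not escape the dot; it is a member of the list itself.
strip_backslash_and_dot() {
  sed 's/[\.]//g'
}

# Both the backslash and the dot are removed.
printf '%s\n' 'a\b.c' | strip_backslash_and_dot
```

This should print abc. If \ were an escape character here, only the dot would be removed and the backslash would remain in the output.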

> If you exec
> echo weather -5.0 | sed 's/[^0-9\.\-]//g'

Here, your regex has only one range expression, but it lists \ twice. The repetition is harmless, but it means that your expression is the same as this shorter one:


It is not obvious from your input whether you intended to be filtering out literal backslash or not, but if not, you probably meant to write:


with no backslash, and with the - last (as that is one of the few places that you can write - to be matched as itself rather than treated as a range operator between neighboring characters).
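As an illustration of that advice, the following sketch applies a bracket expression with no backslash and with - in last position to the input from the report. The exact spelling of the expression is an assumption reconstructed from the description above, the helper name keep_number is hypothetical, and a POSIX-style sed such as GNU sed is assumed.

```shell
# Assumes a POSIX-style sed (for example GNU sed).
# [^0-9.-] matches anything that is not a digit, a dot, or a '-'.
# The '-' comes last in the list, so it is taken literally
# rather than as a range operator.
keep_number() {
  sed 's/[^0-9.-]//g'
}

# Everything except digits, '.', and '-' is deleted.
echo 'weather -5.0' | keep_number
```

This should print -5.0, which is the output the report expected.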

I'm closing this as not a bug, but feel free to reply with further questions or comments.

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
