Re: [Chicken-users] regex to be actively deprecated some day?


From: Jim Ursetto
Subject: Re: [Chicken-users] regex to be actively deprecated some day?
Date: Wed, 9 Sep 2015 23:29:32 -0500

Well, shoot.  Other fearless Chickeneers already noticed this bug and marked it as invalid in (http://bugs.call-cc.org/ticket/1189#comment:2) due to order of operations.  Basically, in “\\\\1” the backslash is itself backslashed, so it becomes the text \1.
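
One way to see the two layers of escaping is to count characters once the Scheme reader has processed each literal. This is only a sketch using standard string procedures; how string-substitute then interprets the resulting text is as described in the ticket, not something demonstrated here.

```scheme
;; The reader turns each "\\" in source text into one backslash character.
(string-length "\\\\1")   ; 3 characters: backslash, backslash, 1
(string->list "\\\\1")    ; (#\\ #\\ #\1)
(string-length "^\\1")    ; 3 characters: caret, backslash, 1
```

So the destination string from #;355 already contains two consecutive backslashes before the \N handling runs, whereas the one from #;356 contains a plain \1.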

Anyway, the upshot is, the irregex version is better.

Jim

On Sep 9, 2015, at 23:24, Jim Ursetto <address@hidden> wrote:

Matt,

In fact, there might be a bug in the \1 substitution mechanism, so it is not a bad idea to use the irregex-style replacement anyway, even if you are sticking with POSIX REs.  I noticed this a few days ago when attempting to escape characters using a backslash.  On the other hand, I could just be doing it wrong.

This is correct:

#;354> (print (irregex-replace/all "([:@{}>])" "{foo:}" "\\" 1))
\{foo\:\}

This is obviously not correct:

#;355> (print (string-substitute "([:@{}>])" "\\\\1" "{foo:}" #t))
\\1foo\\1\\1

But when prefixing the character with something other than a backslash, \1 works fine:

#;356> (print (string-substitute "([:@{}>])" "^\\1" "{foo:}" #t))
^{foo^:^}

This is with 4.8.0.6 (although I don’t think it matters) and the latest regex.

Jim

On Sep 7, 2015, at 21:07, Matt Welland <address@hidden> wrote:

Ok, I sort of panicked when I saw what looked like regex being deprecated (read my original message below if you wish). After re-reading the irregex egg wiki page a few times it looks like all is well assuming these two things:

1. The irregex unit will continue to support reading the PCRE syntax.
2. Those using the backslash substitution syntax in destination strings are prepared to write a parser/converter.

As a request to the developers - please consider adding the function from the regex egg that parses the \N type dest strings to irregex.

Thanks.

Matt
-=-
====== my original "panicked" message =====

From a comment to Chicken-janitors regarding bug #1189 I saw this:

"This seems to be an undocumented feature of the substring-replace
 function, which allows you to escape the backslash. I would recommend
 using irregex, the regex egg's API is kind of deprecated anyway, and it's
 also not very efficient."

Then in the regex egg wiki page I see:

"It is a thin wrapper around the functionality provided by irregex and is mostly intended to keep old code working."

These statements leave me a little concerned as I use the regex egg a fair amount and I don't have the energy to learn yet another abstraction or to go back and rewrite old code. More importantly I expose the use of regexes to users of Megatest and logpro and they have no tolerance for doing something considered a "standard" in a different way, especially if it means using something that looks like Scheme.

From re-reading the irregex egg wiki page I think the only thing I rely on that is missing is the \1 substitution mechanism. Is there an alternative syntax? All I see is the following:
(irregex-replace "(.)(.)" "ab" 2 1 "*")
In string-substitute, the same replacement would be written with a destination of "\2\1*". Converting an old-style destination string to the list of numbers and strings would not be too hard, I suppose.
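
A conversion along those lines could look roughly like the following. This is a minimal sketch, not code from either egg: dest-string->replacements is a hypothetical name, and it assumes that a backslash followed by a single digit names a submatch, that a backslash followed by any other character stands for that character, and that a trailing lone backslash is kept literally.

```scheme
;; Hypothetical helper (not part of regex or irregex): turn an old-style
;; destination string into a list of submatch numbers and literal strings.
(define (dest-string->replacements dest)
  (let loop ((chars (string->list dest))
             (lit '())     ; literal characters collected so far, reversed
             (out '()))    ; finished replacement items, reversed
    ;; Push any pending literal text onto the output list.
    (define (flush)
      (if (null? lit)
          out
          (cons (list->string (reverse lit)) out)))
    (cond
     ((null? chars) (reverse (flush)))
     ;; A backslash with at least one character after it.
     ((and (char=? (car chars) #\\) (pair? (cdr chars)))
      (let ((next (cadr chars)))
        (if (and (char>=? next #\0) (char<=? next #\9))
            ;; \N: emit the submatch number (assumed single digit).
            (loop (cddr chars)
                  '()
                  (cons (- (char->integer next) (char->integer #\0))
                        (flush)))
            ;; \x: keep x as a literal character (assumed escape rule).
            (loop (cddr chars) (cons next lit) out))))
     ;; Ordinary character, or a lone trailing backslash.
     (else (loop (cdr chars) (cons (car chars) lit) out)))))

;; (dest-string->replacements "\\2\\1*")  => (2 1 "*")
;; Possible use with irregex (CHICKEN 4: (use irregex)):
;; (apply irregex-replace "(.)(.)" "ab" (dest-string->replacements "\\2\\1*"))
```

Returning a plain list keeps the helper independent of irregex itself; the caller decides whether to apply it to irregex-replace or irregex-replace/all.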

Thanks,

Matt
-=-


_______________________________________________
Chicken-users mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/chicken-users
