Bug in GNUstep implementation of NSRegularExpression?
From: Mathias Bauer
Subject: Bug in GNUstep implementation of NSRegularExpression?
Date: Tue, 08 Apr 2014 16:14:51 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
Hi,
the following simple test program throws an exception:
#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    @autoreleasepool
    {
        NSString* text = @"h1. Real Acme\n\n"
            @"||{noborder}{left}Item||{right}Price||\n"
            @"|Testproduct|{right}2 x $59.50|\n"
            @"| |{right}net amount: $100.00|\n"
            @"| |{right}total amount: $119.00|\n\n\n"
            @"h2. Thanks for your purchase!\n\n\n";

        // NSRegularExpression* expr = [NSRegularExpression
        //     regularExpressionWithPattern:@".*?$"
        //     options:NSRegularExpressionAnchorsMatchLines error:NULL];
        // int currentIndex = 27;
        NSRegularExpression* expr = [NSRegularExpression
            regularExpressionWithPattern:@"h[123]\\. "
            options:NSRegularExpressionCaseInsensitive error:NULL];
        int currentIndex = 33;
        [expr firstMatchInString:text options:NSMatchingAnchored
            range:NSMakeRange(currentIndex, [text length]-currentIndex-1)];
    }
    return 0;
}
The call to firstMatchInString ends up calling uregex_lookingAt
(which carries out the regex match) and afterwards uregex_start
and uregex_end (which retrieve the matched text range). The results of
the two latter calls are used to create an NSRange in the
prepareResult function of NSRegularExpression.m. Because the length
of this range comes out negative, an exception is thrown.
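The range construction described above can be sketched in plain C. This is only a minimal sketch, not the code from NSRegularExpression.m: SketchRange and sketch_range_from_match are hypothetical names, and it assumes the range is built as location = start and length = end - start, with a guard that rejects an end index smaller than the start index instead of letting the length go negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NSRange; both fields are unsigned. */
typedef struct {
    unsigned long location;
    unsigned long length;
} SketchRange;

/*
 * Builds a range from a start index and an exclusive end index, as reported
 * by a regex engine. Returns 1 and fills *out if the pair describes a valid
 * span, 0 otherwise (negative start, or an end that lies before the start,
 * which would give a negative length).
 */
static int sketch_range_from_match(int32_t start, int32_t end, SketchRange *out)
{
    assert(out != NULL);
    if (start < 0 || end < start) {
        return 0;
    }
    out->location = (unsigned long)start;
    out->length = (unsigned long)(end - start);
    return 1;
}
```

With a check like this, an inconsistent start/end pair would be reported as "no usable range" rather than ending in an exception.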
Let's have a look at the data:
The matching region starts at position 33 and runs almost to the end of
the string (the range passed in has length [text length]-currentIndex-1).
This region has been set on the regex by calling uregex_setRegion (in
the setupRegex function in NSRegularExpression.m).
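To make the region arithmetic concrete, here is a small C sketch. It assumes that the range given to firstMatchInString is translated into a start index and an exclusive limit before being handed to uregex_setRegion; how setupRegex actually does this is not shown here, and SketchRegion and sketch_region_for are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair of indices: start is inclusive, limit is exclusive. */
typedef struct {
    size_t start;
    size_t limit;
} SketchRegion;

/*
 * Mirrors the range used in the test program:
 *   NSMakeRange(currentIndex, [text length] - currentIndex - 1)
 * and converts it into a (start, limit) pair.
 */
static SketchRegion sketch_region_for(size_t text_length, size_t current_index)
{
    assert(current_index < text_length);
    size_t range_length = text_length - current_index - 1;
    SketchRegion region = { current_index, current_index + range_length };
    return region;
}
```

Because of the -1 in the range length, the limit works out to text_length - 1, so the last character of the string lies outside the region.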
According to the documentation, uregex_start should return the index in
the input string of the start of the text matched. In my book this
should be the position of the "h2" near the end of the string.
According to the documentation, uregex_end should return the index in
the input string of the position following the end of the text matched.
In my book that should be start + 4.
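The expectation stated above can be written down as a short C sketch. It is not ICU code and does not model anchored matching: it uses strstr with the literal "h2. " as a simplified stand-in for the pattern, and sketch_expected_span and kSketchText are hypothetical names. The text is the test string from the program, assuming the wrapped lines are joined with single spaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The test string from the program above (ASCII only). */
static const char *const kSketchText =
    "h1. Real Acme\n\n"
    "||{noborder}{left}Item||{right}Price||\n"
    "|Testproduct|{right}2 x $59.50|\n"
    "| |{right}net amount: $100.00|\n"
    "| |{right}total amount: $119.00|\n\n\n"
    "h2. Thanks for your purchase!\n\n\n";

/*
 * Searches for needle at or after index `from`. On success returns 1 and
 * stores the start index and the index following the end of the hit,
 * both relative to the beginning of text. Returns 0 if nothing is found.
 */
static int sketch_expected_span(const char *text, size_t from,
                                const char *needle,
                                size_t *start, size_t *end)
{
    assert(text != NULL && needle != NULL && start != NULL && end != NULL);
    if (from > strlen(text)) {
        return 0;
    }
    const char *hit = strstr(text + from, needle);
    if (hit == NULL) {
        return 0;
    }
    *start = (size_t)(hit - text);
    *end = *start + strlen(needle);
    return 1;
}
```

Calling sketch_expected_span(kSketchText, 33, "h2. ", &start, &end) gives indices into the whole string with end equal to start + 4, which is the relationship expected of uregex_start and uregex_end here.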
But I get back: 33 for start and 4 for end. That obviously can't work.
I can't believe that the ICU regex implementation (I'm using ICU 4.8 on
Ubuntu 13.10, 64-bit) is broken to this extent, so probably the
NSRegularExpression implementation uses it incorrectly. But OTOH I can't
spot an obvious error.
Any hints would be greatly appreciated.
Regards,
Mathias