bug-grep
Re: [PATCH v2] dfa: optimize UTF-8 period


From: Paolo Bonzini
Subject: Re: [PATCH v2] dfa: optimize UTF-8 period
Date: Tue, 20 Apr 2010 11:12:10 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3

On 04/20/2010 12:47 AM, Eric Blake wrote:
> On 04/19/2010 06:14 AM, Paolo Bonzini wrote:
>> +  /* A valid UTF-8 character is
>> +
>> +          ([0x00-0x7f]
>> +           |[0xc2-0xdf][0x80-0xbf]
>> +           |[0xe0-0xef][0x80-0xbf][0x80-0xbf]
>> +           |[0xf0-0xf7][0x80-0xbf][0x80-0xbf][0x80-0xbf])
>
> Yes, but in POSIX XBD 9.3.4,
> http://www.opengroup.org/onlinepubs/9699919799/toc.htm, the ANYCHAR does
> not match NUL.  Do you need to adjust this patch to exclude 0x00?

Yes (following the syntax bits).

Does this seem okay?

Paolo

Attachment: ff.patch
Description: Text document
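
For readers without the attachment, here is a rough sketch of the byte classes quoted above, with NUL optionally excluded. It is not the contents of ff.patch and not the dfa.c implementation. The function name utf8_period_match and the boolean dot_not_null (standing in for whatever syntax bit governs whether '.' matches NUL) are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return the length in bytes (1 to 4) of the character at the start of
   BUF that the quoted pattern accepts, or 0 if nothing matches.
   DOT_NOT_NULL is a hypothetical flag: when true, byte 0x00 is rejected,
   which is the behaviour POSIX asks for.  */
static size_t
utf8_period_match (const unsigned char *buf, size_t len, bool dot_not_null)
{
  /* Lead-byte ranges and total sequence lengths, one per alternative
     in the quoted comment.  */
  static const struct { unsigned char lo, hi; size_t n; } lead[] = {
    { 0x00, 0x7f, 1 },
    { 0xc2, 0xdf, 2 },
    { 0xe0, 0xef, 3 },
    { 0xf0, 0xf7, 4 }
  };

  if (len == 0)
    return 0;
  if (dot_not_null && buf[0] == 0x00)
    return 0;

  for (size_t i = 0; i < sizeof lead / sizeof lead[0]; i++)
    if (lead[i].lo <= buf[0] && buf[0] <= lead[i].hi)
      {
        if (len < lead[i].n)
          return 0;             /* truncated sequence */
        /* Every trailing byte must be in [0x80-0xbf].  */
        for (size_t j = 1; j < lead[i].n; j++)
          if (buf[j] < 0x80 || buf[j] > 0xbf)
            return 0;
        return lead[i].n;
      }

  return 0;                     /* 0x80-0xc1 or 0xf8-0xff as lead byte */
}
```

With dot_not_null set, a lone 0x00 byte yields 0 while "A" yields 1; with it clear, both yield 1. Note that these classes are looser than strict UTF-8: they accept, for example, the overlong sequence 0xe0 0x80 0x80.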

