Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)
From: k-ohara5a5a
Subject: Re: lexer.ll: Warn about non-UTF-8 characters (issue 5505090)
Date: Sun, 01 Jan 2012 22:06:52 +0000
On 2012/01/01 10:12:27, dak wrote:
> > Consider a comment in your case/switch statement that points to
> > some reference on the various types of UTF-8 validators.
> I don't understand.
Sorry, I wasn't making much sense. As a reader I want to *recognize*
what the switch/case is doing rather than having to figure it out.
Maybe:
// Test if these bytes are a UTF-8 encoding of a Unicode character,
// and warn if not. Trap overly-long UTF-8 encodings, but we don't
// need to worry about finer details like some filters do.
because your test is almost but not quite equivalent to the regex in the
back of the Flex manual.
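
For illustration only, here is a minimal sketch of the kind of check such a
comment would describe. It is not the code under review: the names
utf8_sequence_length and is_structurally_valid_utf8 are hypothetical, and the
sketch assumes that "finer details" means things like surrogate code points
and values above U+10FFFF, which it deliberately lets through.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from lexer.ll.
// Returns the length in bytes (1 to 4) of the UTF-8 sequence starting at s,
// or 0 if the bytes are not a structurally valid, shortest-form encoding.
// n is the number of bytes available at s.
std::size_t utf8_sequence_length (const unsigned char *s, std::size_t n)
{
  if (n == 0)
    return 0;

  const unsigned char lead = s[0];
  std::size_t len = 0;

  // The top four bits of the lead byte determine the sequence length.
  switch (lead >> 4)
    {
    case 0x0: case 0x1: case 0x2: case 0x3:
    case 0x4: case 0x5: case 0x6: case 0x7:
      return 1;                 // 0xxxxxxx: plain ASCII
    case 0xC: case 0xD:
      len = 2;                  // 110xxxxx
      break;
    case 0xE:
      len = 3;                  // 1110xxxx
      break;
    case 0xF:
      if (lead > 0xF7)
        return 0;               // 11111xxx is never a lead byte
      len = 4;                  // 11110xxx
      break;
    default:
      return 0;                 // 10xxxxxx: stray continuation byte
    }

  if (n < len)
    return 0;                   // sequence is cut off

  // Every following byte must look like 10xxxxxx.
  for (std::size_t i = 1; i < len; ++i)
    if ((s[i] & 0xC0) != 0x80)
      return 0;

  // Trap overly-long encodings: values that would fit in fewer bytes.
  if (len == 2 && lead < 0xC2)
    return 0;
  if (lead == 0xE0 && s[1] < 0xA0)
    return 0;
  if (lead == 0xF0 && s[1] < 0x90)
    return 0;

  // Assumption: surrogates (ED A0..BF xx) and code points above U+10FFFF
  // are the "finer details" and are intentionally not rejected here.
  return len;
}

// Hypothetical wrapper: true if every sequence in text passes the check.
bool is_structurally_valid_utf8 (const std::string &text)
{
  const unsigned char *p
    = reinterpret_cast<const unsigned char *> (text.data ());
  std::size_t remaining = text.size ();

  while (remaining > 0)
    {
      const std::size_t len = utf8_sequence_length (p, remaining);
      if (len == 0)
        return false;
      p += len;
      remaining -= len;
    }
  return true;
}
```

With this sketch a lone Latin-1 byte such as "\xE9" and the overlong pair
"\xC0\xAF" are both reported as invalid, while an encoded surrogate such as
"\xED\xA0\x80" still passes, which is one way a test like this can differ
from a stricter validator.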
http://codereview.appspot.com/5505090/