Re: [PATCH] Wide characters
From: Mike Gran
Subject: Re: [PATCH] Wide characters
Date: Tue, 24 Feb 2009 23:39:26 -0800 (PST)
Hi,
[...]
>Yes. I think the best thing will be to let you experiment in a
>dedicated branch, so we can progressively see things take shape.
Works for me.
[...]
>> SCM_DEFINE1 (scm_char_ci_eq_p, "char-ci=?", scm_tc7_rpsubr,
>> (SCM x, SCM y),
>> "Return @code{#t} iff @var{x} is the same character as @var{y}
>> ignoring\n"
>> - "case, else @code{#f}.")
>> + "case, else @code{#f}. Case is computed in the Unicode locale.")
>The phrase "Unicode locale" looks confusing to me. This function is
>locale-independent, right?
It is locale-independent. I've seen the phrase "Unicode Locale" used
to mean that the uppercase and lowercase of letters are those
found in the Unicode Character Database. They don't use any
language's special rules. I could have written something like "the
case transforms are the default Unicode case transforms, and do not
use any language-specific rules."
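
As a rough illustration of a locale-independent comparison (this is not the code from the patch), the sketch below lowercases both characters with a fixed table-like rule and never consults the C locale. The names sketch_downcase and sketch_char_ci_eq are hypothetical, and the mapping covers only ASCII and Latin-1; a real implementation would take its mappings from the Unicode Character Database.

```c
#include <stdint.h>

/* Hypothetical helper: lowercase mapping for a small subset of Unicode
   (ASCII and Latin-1 only).  It never calls setlocale or towlower, so
   the result is the same in every locale.  */
static uint32_t
sketch_downcase (uint32_t c)
{
  if (c >= 0x41 && c <= 0x5A)                /* A-Z */
    return c + 0x20;
  if (c >= 0xC0 && c <= 0xDE && c != 0xD7)   /* U+00C0..U+00DE, skipping U+00D7 (multiplication sign) */
    return c + 0x20;
  return c;
}

/* Hypothetical helper: nonzero iff X and Y are the same character
   once case is ignored, using only the fixed mapping above.  */
static int
sketch_char_ci_eq (uint32_t x, uint32_t y)
{
  return sketch_downcase (x) == sketch_downcase (y);
}
```

With this mapping, 'I' (U+0049) always lowercases to 'i' (U+0069), whereas Turkish tailoring would map it to dotless i (U+0131) instead.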
>> + {
>> + /* C0 controls */
>> + "nul", "soh", "stx", "etx", "eot", "enq", "ack", "bel",
>> + "bs", "ht", "newline", "vt", "np", "cr", "so", "si",
>> + "dle", "dc1", "dc2", "dc3", "dc4", "nak", "syn", "etb",
>> + "can", "em", "sub", "esc", "fs", "gs", "rs", "us",
>> + "del",
>> + /* C1 controls */
>> + "bph", "nbh", "ind", "nel", "ssa", "esa",
>> + "hts", "htj", "vts", "pld", "plu", "ri" , "ss2", "ss3",
>> + "dcs", "pu1", "pu2", "sts", "cch", "mw" , "spa", "epa",
>> + "sos", "sci", "csi", "st", "osc", "pm", "apc"
>> + };
>
>Are the new names standard?
They are. They are from the Unicode standard, whose names descend from
the codes in ECMA-48 (1991). Actually, a couple of the C0 control code
names that are currently in Guile differ from those standards. (I didn't
change them.) Unicode and ECMA-48 have "lf" for "newline" and "ff" for
"np".
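
To make the name mapping concrete, here is a minimal sketch of a two-way lookup over the C0 names quoted above. It is not the patch's code: sketch_c0_names, sketch_control_name and sketch_control_code are hypothetical names, the C1 names are left out, and accepting "lf" and "ff" as aliases is purely an assumption of the sketch, not something Guile is claimed to do.

```c
#include <stddef.h>
#include <string.h>

/* C0 control names as listed in the patch, indexed by code point 0..31.  */
static const char *const sketch_c0_names[32] = {
  "nul", "soh", "stx", "etx", "eot", "enq", "ack", "bel",
  "bs",  "ht",  "newline", "vt", "np", "cr", "so", "si",
  "dle", "dc1", "dc2", "dc3", "dc4", "nak", "syn", "etb",
  "can", "em",  "sub", "esc", "fs",  "gs",  "rs",  "us"
};

/* Hypothetical helper: name for control character C, or NULL if C is
   not a C0 control or DEL.  */
static const char *
sketch_control_name (unsigned int c)
{
  if (c < 32)
    return sketch_c0_names[c];
  if (c == 0x7F)
    return "del";
  return NULL;
}

/* Hypothetical helper: code point for NAME, or -1 if unknown.
   Assumption: the Unicode/ECMA-48 spellings "lf" and "ff" are also
   accepted as aliases for "newline" and "np".  */
static int
sketch_control_code (const char *name)
{
  size_t i;

  for (i = 0; i < 32; i++)
    if (strcmp (name, sketch_c0_names[i]) == 0)
      return (int) i;
  if (strcmp (name, "del") == 0)
    return 0x7F;
  if (strcmp (name, "lf") == 0)
    return 0x0A;
  if (strcmp (name, "ff") == 0)
    return 0x0C;
  return -1;
}
```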
>> - /* Dirk:FIXME:: This type of character syntax is not R5RS
>> - * compliant. Further, it should be verified that the constant
>> - * does only consist of octal digits. Finally, it should be
>> - * checked whether the resulting fixnum is in the range of
>> - * characters. */
>> + /* FIXME:: This type of character syntax is not R5RS
>> + * compliant. */
>
>I think this comment remains valid, doesn't it?
In the code I sent, I did add checks for the two conditions Dirk
mentioned.
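
For reference, the two checks amount to something like the sketch below. This is not the code from the patch: sketch_parse_octal_char and SKETCH_MAX_CODEPOINT are hypothetical names, and treating the surrogate range as out of range is an assumption of the sketch.

```c
#include <stddef.h>

#define SKETCH_MAX_CODEPOINT 0x10FFFFL

/* Hypothetical helper: parse LEN octal digits starting at DIGITS.
   Returns the code point, or -1 if a non-octal digit is present or
   the value is outside the range of characters.  */
static long
sketch_parse_octal_char (const char *digits, size_t len)
{
  long value = 0;
  size_t i;

  if (len == 0)
    return -1;

  for (i = 0; i < len; i++)
    {
      /* Check 1: the constant consists only of octal digits.  */
      if (digits[i] < '0' || digits[i] > '7')
        return -1;
      value = value * 8 + (digits[i] - '0');
      /* Check 2: the value stays in range.  Testing on every step
         also keeps VALUE from overflowing.  */
      if (value > SKETCH_MAX_CODEPOINT)
        return -1;
    }

  /* Assumption: surrogate code points are not valid characters.  */
  if (value >= 0xD800 && value <= 0xDFFF)
    return -1;

  return value;
}
```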
Anyway, I'll keep playing with this as time permits.
-Mike