Re: [Chicken-users] levenshtein.scm assumes 8-bit characters
From: felix winkelmann
Subject: Re: [Chicken-users] levenshtein.scm assumes 8-bit characters
Date: Mon, 13 Dec 2004 07:48:17 +0100
On Sat, 11 Dec 2004 19:07:19 +0100, Sunnan <address@hidden> wrote:
> see subject.
> e.g. "brön" and "bron" should have a levenshtein distance of 1, if
> using utf-8 encoding.
>
Yes, levenshtein doesn't know anything about extended character
sets. Alex is working on a utf8 library, and Neil van Dyke has written
a portable (IIRC) levenshtein function, so that might be a better
option in the future.
cheers,
felix
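
To illustrate the point raised in the thread, here is a minimal sketch in Python (not the Chicken levenshtein.scm code, and not Neil van Dyke's implementation) of a generic edit-distance function applied once to a sequence of characters and once to the UTF-8 bytes of the same strings. The function name `levenshtein` and the two-row dynamic-programming layout are choices made for this example only. Because "ö" (U+00F6) encodes to two bytes in UTF-8, a byte-wise comparison would presumably count one extra edit.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (str or bytes).

    Uses the classic dynamic-programming recurrence, keeping only
    the previous and current rows of the table.
    """
    # Distance from the empty prefix of a to each prefix of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


s1 = "br\u00f6n"  # "brön", with a precomposed o-diaeresis
s2 = "bron"

# Compared character by character (code points).
chars_distance = levenshtein(s1, s2)

# Compared byte by byte after UTF-8 encoding.
bytes_distance = levenshtein(s1.encode("utf-8"), s2.encode("utf-8"))

print(chars_distance, bytes_distance)
```

On characters the two strings differ by a single substitution; on UTF-8 bytes the two-byte encoding of "ö" has to be turned into the single byte for "o", which costs one substitution plus one deletion. The sketch assumes the precomposed form of "ö"; a decomposed form ("o" followed by a combining diaeresis) would give different numbers unless the strings are normalized first.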