Re: [Chicken-users] utf8 and string-ref performance

From: Alan Post
Subject: Re: [Chicken-users] utf8 and string-ref performance
Date: Wed, 24 Nov 2010 09:33:24 -0700

On Wed, Nov 24, 2010 at 05:05:18PM +0100, Peter Bex wrote:
> On Wed, Nov 24, 2010 at 08:37:37AM -0700, Alan Post wrote:
> > gentufa'i works by storing the entire input port in a string, and
> > creating position objects to refer to the "rest of the string" as I
> > parse.
> > 
> > This means I need to perform the following:
> > 
> > 1) reference a character by index
> > 2) compare a character, string, or regular expression starting
> >    at an index.
> Are you sure you need this?  If I understood the sentence above the
> list correctly, it might be enough to use string->list and then work
> with a list of characters.  This can be done pretty fast, and you
> can store pointers into arbitrary places of the input simply by storing
> the relevant cons cell.

I will play with this; you're correct that I could work with lists
rather than strings.
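
To make the suggestion concrete, here is a minimal sketch of using a
cons cell as a position.  It assumes only standard list procedures
(string->list, list-tail); the names `input-chars' and `pos' are
hypothetical and not taken from gentufa'i.

```scheme
;; Convert the whole input to a list of characters once.
(define input-chars (string->list "hello world"))

;; A "position" is simply the cons cell where the rest of the input
;; begins.  list-tail returns a cell of the original list, so taking
;; a position copies nothing.
(define pos (list-tail input-chars 6))

;; Reference the character at the position.
(define current-char (car pos))

;; Advance by one character: again just an existing cell of the list.
(define next-pos (cdr pos))
```

Because positions share structure with the original list, two
positions can be compared with eq? and stored as cheaply as a
pointer.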

I'm using irregex for character class matching.  It looks like I should be
using srfi-14/utf8+iset instead.  Do those work only at the character level,
or am I missing a string version?  I see char-set-contains?, with
which I can determine whether a character is in the class, but I
usually want to compare several characters in a row: I want to
match the input until something isn't in the character class.
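
For illustration, a sketch of what "match until something isn't in
the class" could look like over a list of characters.  It assumes
SRFI-1's span and plain SRFI-14 (not the utf8 egg's variant), loaded
here in CHICKEN 5 style; on CHICKEN 4 the first line would be
(use srfi-1 srfi-14).  `match-char-class' is a hypothetical helper,
not part of either library.

```scheme
;; Assumption: CHICKEN 5 with the srfi-1 and srfi-14 eggs installed.
(import srfi-1 srfi-14)

;; Hypothetical helper: split CHARS into the longest prefix whose
;; characters are all members of the char-set CS, and the remainder.
;; Returns two values: the matched prefix and the rest of the list.
(define (match-char-class cs chars)
  (span (lambda (c) (char-set-contains? cs c)) chars))

;; Usage: consume a run of letters from the front of the input.
(call-with-values
  (lambda ()
    (match-char-class char-set:letter (string->list "abc123 rest")))
  (lambda (matched rest)
    (display (list->string matched)) (newline)   ; the run of letters
    (display (list->string rest)) (newline)))    ; the unmatched input
```

The second value returned by span is a tail of the original list, so
it can serve directly as the new position without copying.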

.i ko djuno fi le do sevzi
