Re: [Nano-devel] searching for certain UTF8 characters also finds ghosts

From: Benno Schulenberg
Subject: Re: [Nano-devel] searching for certain UTF8 characters also finds ghosts
Date: Sat, 21 Mar 2015 21:45:40 +0100

On Sat, Mar 21, 2015, at 20:54, Mark Majeres wrote:
> >> > Strangely this happens only for characters in the range
> >> > U+00A0 to U+00BF, which baffles me.  (I've tested it with
> >> > ?, ?, ?, ?, ? -- they all get doubly found -- but also
> >> > with ?, ?, ?, ?, ?, ?, and ? -- they all are found once.)
> >
> > Hmm.  Your mail is UTF8-encoded, but these characters haven't
> > come across properly.  Did you see them okay?
> They came in the first time just like they look now :)

How strange.  You are now sending via Gmail, but maybe you use
a non-UTF-8 locale on your machine?

> Yes, shifting one multi-byte char to the left is a little bit of work.
> It's not very pretty.

Well, I have a patch for that.  Attached.  I can't notice any
performance improvement anywhere, though.  It could still
be made quite a bit faster by looking at the bytes themselves,
but that requires some thinking and won't look pretty.
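
To illustrate what "looking at the bytes themselves" could mean, here is
a minimal sketch.  It is not the attached patch and not nano's
move_mbleft(); the name utf8_step_left() is hypothetical.  It assumes the
buffer holds valid UTF-8, where every continuation byte has the bit
pattern 10xxxxxx, so stepping left only requires skipping back over such
bytes until a lead byte or an ASCII byte is reached.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of nano): return the byte index at which
 * the character ending just before byte offset pos begins.
 * Assumes buf is valid UTF-8 and pos does not exceed the string length. */
size_t utf8_step_left(const char *buf, size_t pos)
{
    int skipped = 0;

    assert(buf != NULL);

    /* Already at the left edge: stay there instead of stepping over it. */
    if (pos == 0)
        return 0;

    pos--;

    /* A UTF-8 character has at most three continuation bytes (10xxxxxx),
     * so skip back over at most three of them to reach the lead byte. */
    while (pos > 0 && skipped < 3 &&
           ((unsigned char)buf[pos] & 0xC0) == 0x80) {
        pos--;
        skipped++;
    }

    return pos;
}
```

For the byte string 'a', 0xC2, 0xA0, 'b' (U+00A0 between two ASCII
letters), calling the sketch with pos 3 lands on index 1, the lead byte
0xC2.  A pos of 0 is returned unchanged, so the caller never ends up to
the left of the buffer.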

> IMO, strstrwrapper() should return the position to the left of
> haystack if the search is not found.  I think the check for
> openfile->current_x == 0 could then be tossed out of your patch.

That's what I tried at first.  But move_mbleft() doesn't like to step
over the left edge of fileptr->data: the assert in that function then



Attachment: fasterleft.patch
Description: Text Data
