RE: how to scan file for non-ascii chars(egcut-n-paste from ms-word)

From: Drew Adams
Subject: RE: how to scan file for non-ascii chars(egcut-n-paste from ms-word)
Date: Tue, 18 Jan 2011 18:02:15 -0800

> >If you use Icicles, you can also see and search for all
> >sequences of non-ascii chars this way:  C-` [^[:ascii:]]+
> >`S-TAB' to see hits, `C-next' to visit them, etc.
> Have not used that package, hear lots of good things about it!
> Question: that line-ending plus-sign -- part of the command-string,
> or some kind of continuation char?  (Oh no, obvious: the "one 
> or more" regexp-char!)

Yes, the + just means one or more.  It applies to the character set
[^[:ascii:]], which matches any character except (`^') those in the character
class [:ascii:], i.e., any non-ascii character.  (Note the closing colon in
[:ascii:]; it is part of the class syntax.)  Character classes are still
missing from the Emacs manual, but you will find them in the Elisp manual,
nodes `Regexp Special' and `Char Classes'.
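
For anyone who wants to try the same pattern outside Emacs, here is a minimal
Python sketch of the idea.  It is only an illustration: Python's `re' module has
no POSIX classes such as [:ascii:], so the sketch assumes the equivalent range
[^\x00-\x7F]+ instead, and the name `non_ascii_runs' is made up for this example.

```python
import re

# Assumption: Python's re has no [:ascii:] class, so the ASCII range
# \x00-\x7F stands in for it.  The trailing + means "one or more".
NON_ASCII_RUN = re.compile(r"[^\x00-\x7F]+")


def non_ascii_runs(text):
    """Return (offset, run) pairs, one per maximal run of non-ASCII chars."""
    return [(m.start(), m.group()) for m in NON_ASCII_RUN.finditer(text)]


# Sample text with curly quotes, an en dash and an ellipsis (typical
# leftovers from a word-processor paste).
sample = "He said \u201chello\u201d \u2013 twice\u2026"
for offset, run in non_ascii_runs(sample):
    print(offset, [hex(ord(c)) for c in run])
```

Because of the +, adjacent non-ascii chars are reported as one run rather than
as separate hits.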

> For those (few? many?) of us who don't know icicles, could you
> maybe explain how those two command-strings work, ie sort of translate
> each of them into some kind of "emacs-english"?  THANKS!

I guess you're asking about `C-`' and `S-TAB'.  In Icicle mode, `C-`' is bound
to `icicle-search' by default, and `S-TAB' does regexp completion.

C-` [^[:ascii:]]+ parses the buffer into search contexts (the regions that match
[^[:ascii:]]+), and it reads your input with completion, making those contexts
available as the set of completion candidates.  IOW, you use completion to
choose which search hits to visit.

When you hit S-TAB it completes your minibuffer input (empty so far) against the
candidate search contexts, showing those that match your input in buffer
*Completions*.  Since your input is empty they all match and are all shown.

If you can type non-ascii chars (or paste them into the minibuffer), then doing
that filters the candidates to those that match the sequence of chars you
inserted.  For example, if you type a non-ascii double-quote, then only the
contexts that contain that char are now the candidates.  Change your minibuffer
input and you change the set of matching contexts, which you can visit.
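
The narrowing step can be pictured with a small Python sketch.  This is not how
Icicles is implemented; it only assumes that the search contexts are plain
strings and that your input is treated as a regexp, as with `S-TAB'.  The name
`narrow' and the sample contexts are hypothetical.

```python
import re


def narrow(contexts, user_input):
    """Keep the contexts that the input, treated as a regexp, matches."""
    pattern = re.compile(user_input)
    return [c for c in contexts if pattern.search(c)]


# Hypothetical search contexts: runs of non-ASCII chars found in a buffer.
contexts = ["\u201c", "\u201d", "\u2013", "\u201c\u2026"]

# Empty input matches everything, so all contexts remain candidates.
print(narrow(contexts, ""))

# Typing a left curly double quote keeps only contexts containing it.
print(narrow(contexts, "\u201c"))
```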

Whatever the current set of matching candidates is, you can visit any of their
locations by cycling among them using `next' (PageDown) and `C-RET' to choose.
Or just visit some in sequence (buffer order, by default) using `C-next'.  Or
just visit some by clicking `C-mouse-2' on them in *Completions*.  `RET' (or
`C-g') ends the tour.

You can also perform replacements of either an entire search context or just the
part(s) that your input matches.  You could, for example, replace all of the
non-ascii double-quote chars by an ascii double-quote.
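
As a rough picture of the second kind of replacement (only the part of each
context that your input matches), here is a Python sketch.  It assumes the
non-ascii double-quotes are the curly quotes U+201C and U+201D, and it uses the
range [^\x00-\x7F]+ in place of [^[:ascii:]]+ because Python's `re' lacks POSIX
classes.  The function name is made up for the example.

```python
import re

# Search contexts: runs of non-ASCII chars (stand-in for [^[:ascii:]]+).
NON_ASCII_RUN = re.compile(r"[^\x00-\x7F]+")
# Assumption: "non-ascii double-quote" means U+201C or U+201D.
CURLY_DOUBLE_QUOTES = re.compile("[\u201c\u201d]")


def replace_curly_quotes(text):
    """Inside each non-ASCII run, replace only the curly double quotes."""
    def fix_run(match):
        # Other non-ASCII chars in the same run are left untouched.
        return CURLY_DOUBLE_QUOTES.sub('"', match.group())
    return NON_ASCII_RUN.sub(fix_run, text)


print(replace_curly_quotes("\u201chello\u201d \u2013 bye"))
```

Replacing the entire search context instead would amount to substituting the
whole run, not just the quote chars inside it.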

> >
> >
> >When the set of hits is thus those defined by [^[:ascii:]]+,
> >you can type any string using a subset of those chars (i.e.,
> >one or more particular non-ascii chars) to narrow the hits,
> >then visit any of those, and optionally replace any
> >or all of them with alternatives.
> >
> Hmmm.  Maybe a perl program, with hashes, etc, I should do it
> that way? Seems like overkill, unwieldy too, for something SO

As I also mentioned, and someone else did the same later, you can also use
[^[:ascii:]] with incremental regexp search: `C-M-s'.
