help-gnu-emacs

Re: Is it valid to use the zero-byte "^@" in regexps?


From: Thorsten Jolitz
Subject: Re: Is it valid to use the zero-byte "^@" in regexps?
Date: Wed, 18 Jun 2014 13:16:11 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Nicolas Richard <theonewiththeevillook@yahoo.fr> writes:

> Thorsten Jolitz <tjolitz@gmail.com> writes:
>>> I don't see why it wouldn't be valid, but I don't know. Whether it is
>>> desirable is another question: it would be better to search for the
>>> beginning, then search for the end with another regexp.
>>
>> That's what I did initially, and it is of course much easier, but it
>> took twice (?) as long too ...
>
> I'm surprised but I guess I'm being too naive.

Most likely not; the speed problem might be unrelated. I have to
double-check.

>>> Except NUL characters of course.
>>
>> i.e. zero-byte "^@"?
>
> Yes, "NUL" is the name you find in most ASCII charts. "zero-byte" less
> so, afaict.
>
>> But Emacs can differentiate between NUL characters and the @ character -
>
> Of course. One has ascii code 0, the other is 64.
>
> NUL is represented by ^@ because of
> http://en.wikipedia.org/wiki/Caret_notation
>
> If you hit C-f with point before a NUL, you jump over it, whereas if
> you hit C-f with point before the two characters ^@ (i.e. not a NUL),
> the cursor only jumps over the ^.

Yes, that's what I would expect from a well-behaved Emacs ...
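
For illustration, the difference between NUL and the two-character
sequence ^@ can be sketched outside Emacs as well. This is a minimal
Python sketch, not anything from the thread; the helper name
caret_notation is made up for the example. It relies only on the rule
from the Wikipedia page linked above: the caret form of a control
character is "^" followed by the character whose code has bit 6 flipped
(code XOR 64).

```python
def caret_notation(code):
    """Return the caret notation for an ASCII control character code.

    Hypothetical helper for illustration only.
    """
    if not (0 <= code < 32 or code == 127):
        raise ValueError("not an ASCII control character")
    # Flip bit 6 (XOR with 64): 0 -> 64 ('@'), 9 -> 73 ('I'), 127 -> 63 ('?')
    return "^" + chr(code ^ 64)


nul = "\x00"      # one character with code 0, displayed as ^@
two_chars = "^@"  # two characters: '^' (code 94) and '@' (code 64)

print(caret_notation(ord(nul)))   # caret form of NUL
print(len(nul), len(two_chars))   # lengths differ: one character vs. two
print(nul == two_chars)           # they are not the same string
```

So the ^@ shown for a NUL is only a display convention; the buffer
holds a single character with code 0, which is why C-f moves over it in
one step.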

>> Often, but not always, the unmatched source blocks contain @
>> characters (but not NUL chars). The strange thing is that the failed
>> matching happens when these blocks are part of a really big
>> test file. When I isolate and copy them to a temp buffer and try to
>> match them there, it just works.
>
> If you have a reproducible recipe (even with a big file) it would
> certainly help.

After double-checking my test file again, it seems that the bug was
sitting in front of the computer again. Although that nice library
ert-buffer.el enables me to run buffer tests on real-world files
*without* modifying them, I had some left-over dangling

,-----------
| #+begin_src
`-----------

delimiters in my test file.

I probably called the commands directly (not via ERT) by accident, and
a few things went wrong and left these dangling delimiters in the
original file. After undoing this, the diffs of the ERT tests now show
mainly indentation and whitespace differences, which is quite
encouraging.
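
A check for such left-overs could look roughly like the following. This
is only a sketch in Python, not what I actually ran; the function name
dangling_src_delimiters is made up, and it assumes that source blocks
are delimited by #+begin_src / #+end_src lines and never nest, so a
second #+begin_src before an #+end_src marks the first one as dangling.

```python
import re

# Delimiter lines, matched case-insensitively with optional indentation.
BEGIN = re.compile(r"^\s*#\+begin_src\b", re.IGNORECASE)
END = re.compile(r"^\s*#\+end_src\b", re.IGNORECASE)


def dangling_src_delimiters(text):
    """Return 1-based line numbers of unmatched delimiter lines.

    Hypothetical helper; assumes source blocks do not nest.
    """
    dangling = []
    open_line = None  # line number of the currently open #+begin_src
    for lineno, line in enumerate(text.splitlines(), start=1):
        if BEGIN.match(line):
            if open_line is not None:
                dangling.append(open_line)  # earlier begin never closed
            open_line = lineno
        elif END.match(line):
            if open_line is None:
                dangling.append(lineno)     # end without a begin
            else:
                open_line = None            # block closed properly
    if open_line is not None:
        dangling.append(open_line)          # begin still open at EOF
    return dangling


sample = "#+begin_src\n(foo)\n#+end_src\n#+begin_src\nsome text\n"
print(dangling_src_delimiters(sample))
```

Running something like this over the test file before the ERT run would
have pointed at the stray lines directly.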

Conclusion: NUL chars in regexps do work, as long as the test file isn't
messed up. Thanks for your input.
 
-- 
cheers,
Thorsten



