Re: [h-e-w] Processing chars above \200

From: John J . Xenakis
Subject: Re: [h-e-w] Processing chars above \200
Date: Sun, 23 Sep 2018 15:26:10 -0400

Hi Eli,

>   Then there's still something not right, because you shouldn't be
>   having any of these problems with files that are consistently
>   encoded.

>   It shouldn't and it doesn't.  Depending on what exactly is in your
>   files, something that is still a bit of a mystery for me, Emacs
>   could sometimes err if you don't tell it enough. 

The particular file that triggered the original message was created
over a period of several months.  During that period, text was typed,
and quotes were copied and pasted from various sources.  And who
knows?  Maybe one day I accidentally copied and pasted some errant
problem character.  I assume that's what you're getting at.

Since Microsoft only supplies updates for Windows 7 once a month these
days, Emacs usually stays open for a month at a time.  That means that
the problem file, even if it contains an errant character, keeps
working fine for as long as that session stays open.  It is only when I
reboot the system and reload the text file that the problem arises.  I
think that's what happened this time.

Since I could have inserted the errant character at any time in the
previous month, I have no memory of exactly what operation might
have caused the problem.

That's why I keep looking for the right regex that will find such characters
for me.
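
Something along these lines is the kind of search I have in mind.  It
is only a minimal sketch: the function name is one I made up, and I am
assuming that the `[:ascii:]' character class in Emacs regexps (codes
0 through 127) is the right test for "chars above \200" in this file.

```elisp
;; Hypothetical helper, not a built-in Emacs command.
;; Assumes every character outside the 7-bit ASCII range is a suspect.
(defun my-find-next-non-ascii ()
  "Move point just past the next non-ASCII character after point.
Return non-nil if one was found, nil otherwise."
  (interactive)
  (if (re-search-forward "[^[:ascii:]]" nil t)
      ;; Point is now just after the match, so report the char before it.
      (message "Found char #x%x at position %d"
               (char-before) (1- (point)))
    (message "No non-ASCII character after point")
    nil))
```

Interactively, the same regexp can be typed at the C-M-s
(isearch-forward-regexp) prompt.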

>   But in any case, there are commands to fix those errors right
>   away, as soon as you realize something like that happens.  We will
>   get to that, once I understand more about the problem.

Could you tell me what those commands are?

>   Is it possible that the file is encoded in UTF-16 or UTF-8?  What
>   happens if you visit the file like this:

>     C-x RET c utf-8 RET C-x C-f FILENAME RET

>   and similarly for utf-16?  Does this fix the problem?

No, that makes no difference.  This is definitely a 7/8-bit
ASCII/extended-ASCII file.

By the way, how do I encode that keyboard string in Lisp?  How does
one use "(universal-coding-system-argument CODING-SYSTEM)" in a macro?

>   And how were those files created in the first place?  I understood
>   from your previous explanations that you created those files by
>   copy-pasting from other applications, is that right?

As I described above.

>   Can you post one such file, please?  It is important that you post
>   a file as a binary attachment, and it is also important to verify
>   that the trick with Notepad and copy/paste works with the file you
>   post.

>   I'm quite sure this is caused by something very simple, because
>   Notepad is certainly not smarter than Emacs wrt encodings.

OK, you can download the following:

The enclosed .txt file causes all the issues that I've described.

I've replaced all the 7-bit letters with "e", because I don't
want to make the text public.

To make it easy for you to find some of the 8-bit characters causing
the problems, I inserted the string ">>>" in front of four lines
containing them.  Just search for that string.


