


From: jfbu
Subject: Re: [AUCTeX] files with non-ascii characters and LaTeX as in TeXLive 2018
Date: Thu, 17 May 2018 10:16:52 +0200

Hi Ikumi

thanks a lot for tackling this,

On 15 May 2018 at 08:35, Ikumi Keita <address@hidden> wrote:

> Hi all,
> 
>> (1) The output of \message{...} also changed in TeXLive 2018.  If
>> non-ascii characters are in \message{...}, they are output in tokenized
>> form by default in TL 2018.  This makes partial typesetting, such as C-c C-r,
>> C-c C-s and so on, fail to recognize the right file name when source
>> correlate mode is enabled, so I inserted another \detokenize{} in
>> `TeX-quote-filename'.
> 
> It turned out that my patch doesn't work as expected in some cases.
> When `TeX-quote-filename' inserts "\string" to escape "#" and "~" before
> \detokenize{} takes place, wrapping the file name inside \detokenize{}
> neutralizes the "\string" so that it appears as literal 7-character
> "\string", not escaping "#" and "~".  In addition, \detokenize{} doubles
> the pound sign "#" for some reason unknown to me.



Perhaps you can use this mouthful:

pdflatex  -file-line-error   -interaction=nonstopmode 
"\begingroup\ifdefined\UseRawInputEncoding\UseRawInputEncoding\fi\edef\x{\noexpand\input{\string#éàè\string~\string#.tex}}\expandafter\endgroup\x"

It works to compile a file named #éàè~#.tex. The file itself contains UTF-8
characters and deliberately does not do \usepackage[utf8]{inputenc}, to check
that the PDF output is nevertheless correct (it would not be if the action of
\UseRawInputEncoding weren't scoped).

The point is that the filename itself needs to be quoted only the old-fashioned
way, with \string.
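
The one-liner above can also be read spread over several lines. This is only a
sketch of the same logic with comments added; putting it in a separate wrapper
file (saved as UTF-8) is an assumption for illustration, not something AUCTeX
does.

```latex
% Hypothetical wrapper file; same tokens as the one-liner, only annotated.
\begingroup
  % \UseRawInputEncoding exists only in LaTeX from April 2018 on, so test first.
  \ifdefined\UseRawInputEncoding
    \UseRawInputEncoding % non-ASCII bytes become ordinary characters, locally
  \fi
  % Build "\input{<name>}" while the raw encoding is in force.
  % \string turns the special characters # and ~ into plain character tokens.
  \edef\x{\noexpand\input{\string#éàè\string~\string#.tex}}%
% Expand \x first, then close the group: the raw encoding is undone
% before the file itself is read.
\expandafter\endgroup\x
```

The \expandafter is what keeps the effect scoped: \x is expanded while it is
still defined, and the group closes before \input executes.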


I mention this in case there is a context where using "\input" is needed,
because the preferred form is not to use "\input" at all:

pdflatex \#éàè\\string\~\#.tex

works fine. Here # only needs to be quoted from the shell, not from LaTeX,
but ~ needs to be quoted from LaTeX because it is active.

(I am ignorant of the AUCTeX code base, so this is a shot in the dark.)

Regarding your problem with \message{...}: perhaps
\unexpanded will help you, as this shows (with April 2018 LaTeX):

\documentclass{article}

\begin{document}
\message{éàù}% gives about 72 macros ....

\message{\unexpanded{éàù}}% just displays éàù

\message{\unexpanded{#}}% does not help the doubling... \string# helps
\end{document}
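
A minimal sketch combining both remarks, assuming April 2018 LaTeX with
pdflatex as in the test file above. The file name is only an illustration.

```latex
\documentclass{article}

\begin{document}
% \unexpanded keeps the active UTF-8 bytes from expanding inside \message;
% \string turns ~ and # into plain character tokens, so ~ does not expand
% and # is not a parameter character that would be doubled.
\message{\unexpanded{éàè}\string~\string#.tex}
\end{document}
```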



> 
>> (1) According to texdoc etex, \detokenize was added in e-TeX extension.
>> So if the user has quite old TeX distribution where e-TeX extension was
>> not incorporated in the engine (command binary) yet, \detokenize raises
>> error.  I hope this is a permissible incompatibility.
> 
> I realized that the default command binary "tex" for plain-TeX does not
> incorporate e-TeX extension, while the binaries of latex families
> ("latex", "pdflatex", "lualatex", ...) do.  (The e-TeX enabled binary
> producing dvi file for plain-TeX is "etex".)  We only need \detokenize
> for latex family because it is LaTeX kernel that the internal UTF-8-nize
> was performed on and other families including plain-TeX are not
> affected.

In my AUCTeX I have long since modified the "tex" invocation
to be an alias for "etex", because I need e-TeX daily even with Plain.

But "tex" can't be modified officially because it is Knuth's tex.

As you said, the problem with active characters for UTF-8 support
is in any case specific to LaTeX, and only with pdflatex, not lualatex/xelatex.


> 
> In addition, I heard that Omega does not have e-TeX extension, either.
> If this is the case, "lambda" is an exception in latex families and does
> not accept \detokenize.  (I cannot confirm because Omega binaries no
> longer exist in TeX Live these days.)


I can't help either

> 
> Summarizing these considerations, it seems that we have to do something
> like this:
> (defun TeX-quote-filename (file)
>  ... (snip) ...
>  (if (and (eq major-mode 'latex-mode)
>          (not (and (eq TeX-engine 'omega)
>                    ;; lamed, Aleph version of lambda, has e-TeX extension.
>                    (equal LaTeX-Omega-command "lambda"))))
>      (replace-regexp-in-string "[[:multibyte:]]+"
>                               "\\\\detokenize{\\&}" file t)
>    file))
> 


Although \detokenize should not hurt (*), it is not needed
with lualatex/xelatex.

(*) except for the # problem

- if no \input, then there is no need for \detokenize

- if \input, it is possible to use \UseRawInputEncoding as in the mouthful above,
  but one must test for its existence to remain compatible with LaTeX earlier
  than April 2018 (as is done in the mouthful). I tested the above "solution"
  with both TL2016 and TL2018.

- the above problems arise only with LaTeX + an 8-bit engine

- I have absolutely no idea about Japanese platex/uplatex


> Some notes.
> (a) Should we treat `doctex-mode' in the same way as `latex-mode' in
> this case?  I'm not sure since I don't know docTeX.

doctex-mode is only an AUCTeX mode for the ltxdoc document class (or scrdoc),
which incorporates the doc package.

As such it is but a slight variation of LaTeX, so yes, it should be handled like
latex-mode. (I don't think doctex-mode has special facilities to emulate
docstrip behaviour for extracting files; if it did, some other problems
might arise. For example, I sometimes use doctex-mode for packages
that work with etex and do not need latex.)

Best,

Jean-François



