help-gnu-emacs

Re: How to translate LaTeX into UTF-8 in Elisp?


From: Marcin Borkowski
Subject: Re: How to translate LaTeX into UTF-8 in Elisp?
Date: Mon, 03 Jul 2017 19:36:40 +0200
User-agent: mu4e 0.9.19; emacs 26.0.50

On 2017-07-03, at 12:24, Emanuel Berg <moasen@zoho.com> wrote:

> Marcin Borkowski wrote:
>
>> It is still a hack, since it relies on the
>> Unicode names being correct.
>
> If it relied on the names being *in*correct,
> that would make it a hack in the
> negative sense.

OK, so here is a proof of concept:

--8<---------------cut here---------------start------------->8---
(defvar TeX-to-Unicode-accents-alist
  '((?` . "grave")
    (?' . "acute")
    (?^ . "circumflex")
    (?\" . "diaeresis")
    (?H . "double acute")
    (?~ . "tilde")
    (?c . "with cedilla")
    (?k . "ogonek")
    (?= . "macron")
    (?. . "with dot above")
    (?u . "with breve")
    (?v . "with caron"))
  "Alist mapping TeX accent characters to Unicode accent names.")

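(defun combine-letter-diacritical-mark (letter mark)
  "Return the Unicode character for LETTER combined with MARK.
LETTER is a character or a one-character string.  MARK can be any
character that can be used in TeX accenting commands.  Return nil
if no matching Unicode name is found."
  (let* ((letter (if (stringp letter)
                     (string-to-char letter)
                   letter))
         (uppercase (= letter
                       (upcase letter)))
         ;; Unicode character names are all uppercase.
         (name (upcase
                (format "LATIN %s LETTER %c %s"
                        (if uppercase "CAPITAL" "SMALL")
                        letter
                        (cdr (assoc mark TeX-to-Unicode-accents-alist)))))
         ;; Call the function `ucs-names' so that the table is populated;
         ;; it returns an alist in older Emacsen, a hash table in newer ones.
         (names (ucs-names)))
    (if (hash-table-p names)
        (gethash name names)
      (cdr (assoc-string name names t)))))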
--8<---------------cut here---------------end--------------->8---

As you can see from the mess in `TeX-to-Unicode-accents-alist', this
_is_ a hack.  Still, it seems to work more or less fine.
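
For comparison, here is a minimal sketch of the same name-based lookup
that uses only the current "WITH ..." form of the Unicode names.  It
assumes Emacs 26.1 or later, where `char-from-name' is available, and
`my-latin-accented-char' is a hypothetical helper that takes the accent
name directly instead of a TeX control character.  It is not part of the
code above.

```elisp
;; Hypothetical helper, not part of the code above.  Assumes Emacs 26.1
;; or later, where `char-from-name' is available.
(defun my-latin-accented-char (letter accent)
  "Return the precomposed Latin character for LETTER with ACCENT.
LETTER is a character; ACCENT is an accent name such as \"grave\".
Return nil if Unicode has no character with the resulting name."
  (char-from-name
   ;; Build a name such as "LATIN SMALL LETTER A WITH GRAVE".
   (format "LATIN %s LETTER %c WITH %s"
           (if (= letter (upcase letter)) "CAPITAL" "SMALL")
           (upcase letter)
           (upcase accent))))

(my-latin-accented-char ?a "grave")      ; U+00E0
(my-latin-accented-char ?Z "dot above")  ; U+017B
```

This still depends on the Unicode names following a regular pattern, so
it is the same hack, only with a uniform accent table.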

Best,

-- 
Marcin Borkowski


