auctex-devel
[FR] Support for spell checking of citation macros


From: Gustavo Barros
Subject: [FR] Support for spell checking of citation macros
Date: Tue, 19 Nov 2019 16:49:00 -0300
User-agent: mu4e 1.2.0; emacs 26.3

Hi all,

This is a feature request to add support for spell checking of citation macros in AUCTeX, that is, to skip the bibkeys of citation macros while spell-checking.

The case for the request: citation macro bibkeys commonly account for a large share of the false positives when spell-checking the body of a document. If you already have some sort of skipping structure for regular macros (as ispell and flyspell do), it may well be that the false positives resulting from bibkeys are a sizable part of the overall hits.

When this is the case, two things follow:

- When most of a spell-checking session consists of skipping numerous false positives (of which bibkeys are only a part but, as argued, a significant one), we may run past legitimate hits, either having to go back and correct them on the spot or missing them altogether.
- We may fat-finger one of these false-positive bibkeys and "correct" it. In this case, we will typically find out about it in a compilation warning, at which point the document is already saved, and the mistake is behind a long chain of undo events generated by the spell-checking session itself. In the best case, we may recognize or remember which bibentry was originally meant by the faulty citation. In the worst, we may be in for a trip back to the sources of the citation, across a number of potential entries.

Of course, I've noticed the (relatively) recent effort done in 'tex-ispell.el', which is very nice, so I know the general problem is considered relevant by the AUCTeX developers. And so I'm kindly suggesting such efforts be extended to the case of citation macros too, for the stated reasons.


That, then, is the request. But I have also been trying to come up with something in this regard, so I share with you some considerations and results of these attempts, in the hope they may be useful in case the request is taken up.


Citation macros are not quite typical, which complicates things for the given task. And this for two reasons:

- The order of one such macro is normally "macro - optional arguments - mandatory arguments", and as they are built, we want to skip the mandatory arguments (the bibkeys) but we write stuff in pre and postnotes, which we want to spell check.
- Biblatex's qualified citation lists take an arbitrary number of arguments (sequences of pre/postnotes plus bibkeys).

As far as I can see, these two things imply we need some kind of backward-looking behavior when considering whether a word should be skipped or not. Or at least this is much more easily achieved by looking back somehow.

As it turns out, flyspell can handle this requirement gracefully. I came up with the following first attempt for it:

#+begin_src emacs-lisp
(add-hook 'LaTeX-mode-hook
          (lambda () (setq flyspell-generic-check-word-predicate
                           #'my/LaTeX-mode-flyspell-verify)))

(defun my/LaTeX-mode-flyspell-verify ()
  "Return nil if point is in any region we want flyspell to ignore."
  (not (my/tex-flyspell-bibkey-p)))

(defun my/tex-flyspell-bibkey-p ()
  "Return non-nil if on a citation macro bibkey, nil otherwise."
  ;; The ways to identify that come from 'reftex-view-crossref'.
  (let ((macro (car (reftex-what-macro 1 (- (point) (* 10 fill-column)))))
        (key (reftex-this-word "^{}%\n\r, \t"))
        (files (reftex-get-bibfile-list)))
    (and
     (stringp macro)
     (string-match "\\`\\\\cite\\|cite\\([s*]\\|texts?\\)?\\'\\|bibentry" macro)
     (ignore-errors
       (reftex-pop-to-bibtex-entry key files t nil nil t)))))
#+end_src

This tries to identify whether we are in a bibkey or not based on `reftex-view-crossref', and should work quite reliably, as far as my understanding and testing go. Some comments about it:

- This relies on the functions `reftex-what-macro', `reftex-get-bibfile-list' and `reftex-pop-to-bibtex-entry' and thus requires RefTeX. I'm not sure if this could be restrictive as a general solution from the point of view of AUCTeX (or flyspell).
- The final condition, which checks whether we have a bibkey at hand or not, is performed by `reftex-pop-to-bibtex-entry', which is a costly operation to hang upon flyspell. In my tests here, I haven't noticed any lag, but I don't use a single large bib database, as I know many people do. I suppose this check may become prohibitive in this case (test pending).
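
One conceivable way to soften the cost mentioned in the second point is to look keys up in a cache instead of searching the bib files on every word. The following is only a minimal sketch under that assumption, not something tested against RefTeX: the names `my/demo-bibkey-cache', `my/demo-bibkey-cache-build' and `my/demo-bibkey-known-p' are hypothetical, and where the list of keys comes from is left open.

#+begin_src emacs-lisp
;; Hypothetical helpers, for illustration only.
(defvar my/demo-bibkey-cache nil
  "Hash table whose keys are the known bibkeys, or nil if not built yet.")

(defun my/demo-bibkey-cache-build (keys)
  "Store KEYS, a list of strings, in `my/demo-bibkey-cache'."
  (let ((table (make-hash-table :test #'equal)))
    (dolist (key keys)
      (puthash key t table))
    (setq my/demo-bibkey-cache table)))

(defun my/demo-bibkey-known-p (word)
  "Return non-nil if WORD is one of the cached bibkeys."
  (and (hash-table-p my/demo-bibkey-cache)
       (gethash word my/demo-bibkey-cache)))

;; Usage: build once, then test membership per word.
(my/demo-bibkey-cache-build '("knuth1984" "lamport1994"))
(my/demo-bibkey-known-p "knuth1984")
#+end_src

A membership test like this could stand in for the call to `reftex-pop-to-bibtex-entry' in the predicate, at the price of having to rebuild the cache whenever the bibliography changes.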

Given the second issue, I've tried to come up with a less costly local check for the same purpose:

#+begin_src emacs-lisp
(defun my/tex-flyspell-bibkey-p-2 ()
  "Return non-nil if on a citation macro bibkey, nil otherwise."
  ;; Here we check if in a citation macro as before and, that given, we just
  ;; check if we are inside braces.  As these are not allowed in bibkeys, it
  ;; seems a reasonably safe, albeit simpler method.  See
  ;; https://tex.stackexchange.com/a/408548 and
  ;; https://tex.stackexchange.com/a/96918
  ;; Brackets *are* allowed though, but atypical.  If used, this check will
  ;; fail.
  ;; When https://lists.gnu.org/archive/html/bug-auctex/2019-11/msg00007.html
  ;; gets solved, we can use the same checking technique.
  (let ((macro (car (reftex-what-macro 1 (- (point) (* 10 fill-column))))))
    (and
     (stringp macro)
     (string-match "\\`\\\\cite\\|cite\\([s*]\\|texts?\\)?\\'\\|bibentry" macro)
     (save-excursion
       ;; pieces from 'reftex-what-macro'
       (condition-case nil
           (let ((forward-sexp-function nil))
             (up-list -1) t)
         (error nil))
       (= (following-char) ?\{)))))
#+end_src

This should be lighter than the previous one and gives reasonable results, though it is not perfect: I could conceive of at least one legitimate case where it fails (noted in the function's comments). I could not come up with something better so far, but I'm sure someone with better parse-fu than myself could manage. Anyway, it seems quite feasible to use something along these lines for the purpose.
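
To make the brace check and its known failure case concrete, here is a small self-contained sketch. It runs in a temporary buffer with the standard syntax table rather than in a LaTeX buffer, and `my/demo-enclosing-delimiter' is a hypothetical helper written only for this illustration.

#+begin_src emacs-lisp
;; Hypothetical helper, for illustration only.
(defun my/demo-enclosing-delimiter (text needle)
  "Return the opening delimiter that encloses NEEDLE in TEXT, or nil.
The search is done in a temporary buffer holding TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (search-forward needle nil t)
      (goto-char (match-beginning 0))
      (condition-case nil
          (let ((forward-sexp-function nil))
            (up-list -1)               ; move back to the enclosing opener
            (following-char))
        (error nil)))))                ; not inside any delimiter

;; On a bibkey: the enclosing delimiter is a brace, so it would be skipped.
(my/demo-enclosing-delimiter "\\cite[see][p. 3]{knuth1984}" "knuth1984") ; ?\{
;; In a prenote: the enclosing delimiter is a bracket, so it is checked.
(my/demo-enclosing-delimiter "\\cite[see][p. 3]{knuth1984}" "see")       ; ?\[
;; The failure case: brackets inside a bibkey hide the brace.
(my/demo-enclosing-delimiter "\\cite{foo[bar]baz}" "bar")                ; ?\[
#+end_src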

The case of ispell is much more complicated, as far as I can tell. Ispell is very much forward looking (strictly so, if I understood it correctly), and the skipping facilities it provides are based on a regexp plus something which allows us to specify only the *end* point of the skipped region. If I understand it correctly, the function `ispell-region' uses the following as the core skipping step in its loop:

#+begin_src emacs-lisp
(re-search-forward (ispell-begin-skip-region-regexp) ispell-region-end t)
#+end_src

Where `ispell-begin-skip-region-regexp' is populated by 'ispell-tex-skip-alists' and the general 'ispell-skip-region-alist'.

This complicates both of the requirements initially stated for citation macros: that we cannot simply check the "tail" of the macro, because we need to check the optional pre and postnotes, and that biblatex qualified citation lists may take an arbitrary number of arguments.
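
The problem can be pictured with a toy imitation of a "begin regexp, then skip forward to an end regexp" rule. This is not ispell's actual code, only a sketch of the forward-only idea, and `my/demo-skipped-text' is a hypothetical helper.

#+begin_src emacs-lisp
;; Hypothetical helper, for illustration only.
(defun my/demo-skipped-text (text begin-re end-re)
  "Return the part of TEXT covered by a forward skip.
The skip starts at the first match of BEGIN-RE and ends after the
next match of END-RE.  Return nil if either regexp does not match."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward begin-re nil t)
      (let ((start (match-beginning 0)))
        (when (re-search-forward end-re nil t)
          (buffer-substring-no-properties start (point)))))))

;; Skipping from the macro name to the closing brace also swallows the
;; pre and postnotes, which are exactly what we want to spell check.
(my/demo-skipped-text "As \\cite[see][p. 3]{knuth1984} shows"
                      "\\\\cite" "}")
#+end_src

With a rule of this shape, the only choice is where the skip ends, so the notes are either all skipped or the bibkeys are all checked.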

That given, I tried to devise a workaround based on the citation keys directly:

#+begin_src emacs-lisp
(add-hook 'LaTeX-mode-hook #'my/tex-ispell-skip-all-citation-keys)
(add-hook 'TeX-after-compilation-finished-functions
         #'my/tex-ispell-skip-all-citation-keys)

(defvar my/ispell-skip-region-alist-orig nil
  "Variable to store ispell-skip-region-alist.")
(make-variable-buffer-local 'my/ispell-skip-region-alist-orig)

(defun my/tex-ispell-skip-all-citation-keys (&optional _file)
  "Populate 'ispell-skip-region-alist' with the list of bibkeys
used in the document."
  (when (not my/ispell-skip-region-alist-orig)
    (setq my/ispell-skip-region-alist-orig
          ispell-skip-region-alist))
  (setq-local ispell-skip-region-alist
              (append my/ispell-skip-region-alist-orig
                      (mapcar (lambda (x)
                                (list (regexp-quote x)))
                              ;; 'reftex-all-used-citation-keys' may be used
                              ;; here instead, when fixed, see
                              ;; https://lists.gnu.org/archive/html/bug-auctex/2019-11/msg00004.html
                              ;;
                              ;; Sort in reverse length, to avoid ispell
                              ;; skipping a partial key, in case one key is
                              ;; a subset of another.
                              (sort (copy-sequence
                                     (my/tex-ispell-all-used-citation-keys))
                                    (lambda (a b)
                                      (> (length a) (length b))))))))

(defun my/tex-ispell-all-used-citation-keys ()
  "Generate a list of all used entrykeys, based on biblatex's .bbl file."
  ;; Only works for biblatex, if using bibtex, use
  ;; 'reftex-all-used-citation-keys' instead.
  (let ((bblfile (when (derived-mode-p 'latex-mode)
                   (TeX-master-file "bbl")))
        (keylist))
    (when (and (stringp bblfile)
               (file-exists-p bblfile))
      (with-temp-buffer
        (insert-file-contents bblfile)
        (goto-char (point-min))
        (while (not (eobp))
          (when (re-search-forward "^[ \t]*\\\\entry{\\([^}]*\\)}"
                                   (line-end-position) t)
            (push (match-string-no-properties 1) keylist))
          (forward-line 1))))
    keylist))
#+end_src

While this works, some comments:

- This is an ugly hack...
- It may result in a false negative, if you lay a bibkey somewhere not in a citation macro.
- Ugly hack it is, but it is the only way I could conceive to get this somehow working for ispell. While I can live with it in my init file, it certainly does not live up to being a proper "solution".
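
On the reverse-length sort used in the snippet above: the reason for it can be shown in isolation. The sketch below joins the keys into a single alternation, which is an assumption made for the sake of the demonstration and not a description of ispell's internals; `my/demo-keys-to-regexp' and `my/demo-first-match' are hypothetical helpers.

#+begin_src emacs-lisp
;; Hypothetical helpers, for illustration only.
(defun my/demo-keys-to-regexp (keys)
  "Return a regexp matching any of KEYS, trying longer keys first."
  (mapconcat #'regexp-quote
             (sort (copy-sequence keys)
                   (lambda (a b) (> (length a) (length b))))
             "\\|"))

(defun my/demo-first-match (regexp text)
  "Return the first match of REGEXP in TEXT, or nil."
  (when (string-match regexp text)
    (match-string 0 text)))

;; Shorter key first: only part of the longer key is matched, and the
;; trailing "a" would be left over for the spell checker.
(my/demo-first-match "smith2000\\|smith2000a" "see smith2000a")
;; Longest key first: the whole key is matched.
(my/demo-first-match (my/demo-keys-to-regexp '("smith2000" "smith2000a"))
                     "see smith2000a")
#+end_src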

So, as far as I can tell (I may be wrong, of course), no proper solution to this is feasible without some sort of generalization of the skipping mechanism in `ispell-region'.

These are my findings and results.

Whether this is taken up somewhere in AUCTeX itself (or not), or brought upstream to ispell and flyspell, is of course up to you.

Either way, I hope this is somehow useful.

Best regards,
Gustavo Barros.


