From: Matthew Swift
Subject: confusion over undocumented syntax-table features, font-lock and syntax-tables
Date: Tue, 11 Feb 2003 00:08:20 -0500
This bug report will be sent to the Free Software Foundation,
not to your local site managers!
Please write in English, because the Emacs maintainers do not have
translators to read other languages for them.
Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.
In GNU Emacs 21.2.1 (i386-debian-linux-gnu, X toolkit, Xaw3d scroll bars)
of 2002-11-06 on beth, modified by Debian
configured using `configure i386-debian-linux-gnu --prefix=/usr/local
--sharedstatedir=/var/lib --libexecdir=/usr/local/lib --localstatedir=/var/lib
--infodir=/usr/local/share/info --mandir=/usr/local/share/man --with-pop=yes
--with-x=yes --with-x-toolkit=athena --without-gif'
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: nil
locale-coding-system: nil
default-enable-multibyte-characters: t
Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:
I was observing strange behavior in `sh-mode' (defined in sh-script.el):
(re-search-forward "\\s<\\s<") was failing even though it was passing over a
two-character buffer substring in which both characters had syntax class "<",
as reported by `(char-syntax (char-after N))' for positions N and N+1.
I have not figured out why that happens, and it may not be a bug, but in my
experiments, I have come across a barrel full of puzzles and questions. I am
reporting as much as I have been able to distinguish.
The results of the following code completely baffle me. Is
global-font-lock-mode changing the syntax classes?
-----cut here
(setq test "
hello () { echo world.; }
## boln is at buffer position 40
")
(defun test ()
  (sh-mode)
  (message "result is %S"
           (if (and
                (equal "<" (char-to-string (char-syntax ?#)))
                (equal (char-after 40) ?#)
                (equal (char-after 41) ?#)
                (equal "<" (char-to-string (char-syntax (char-after 40))))
                (equal "<" (char-to-string (char-syntax (char-after 41)))))
               (save-excursion
                 (goto-char (point-min))
                 (re-search-forward "\\s<\\s<"))
             "whoops!")))
(progn
  (global-font-lock-mode 0)
  ;; succeeds
  (test))
(progn
  (global-font-lock-mode 1)
  ;; `re-search-forward' fails the SECOND time, if not the first (no
  ;; pattern found)
  (test))
;;(sh-mode)
;;(emacs-lisp-mode)
;;(global-font-lock-mode)
;;(test)
---- end of test file
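One mechanism that could produce this kind of discrepancy (an assumption, not something the test above establishes) is a `syntax-table' text property: `char-syntax' consults only the buffer's syntax table, whereas regexp syntax classes such as \\s< also honor `syntax-table' text properties when `parse-sexp-lookup-properties' is non-nil. The minimal sketch below, using the hypothetical helper name `my-syntax-property-demo', makes the two views disagree in a temporary buffer. Whether font-lock actually installs such a property in the `sh-mode' case is not verified here.

```elisp
(defun my-syntax-property-demo ()
  "Hypothetical helper: contrast `char-syntax' with a regexp syntax search.
Returns a list (CLASS FOUND-IGNORING-PROPS FOUND-HONORING-PROPS)."
  (with-temp-buffer
    ;; A private syntax table in which # starts a comment and newline ends it.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?# "<" table)
      (modify-syntax-entry ?\n ">" table)
      (set-syntax-table table))
    (insert "\n## comment\n")
    ;; Assumption for illustration: something (font-lock, say) marks the
    ;; first # as punctuation through a `syntax-table' text property.
    (put-text-property 2 3 'syntax-table (string-to-syntax "."))
    (let ((class (char-to-string (char-syntax (char-after 2))))
          (found-ignoring
           (let ((parse-sexp-lookup-properties nil))
             (goto-char (point-min))
             (re-search-forward "\\s<\\s<" nil t)))
          (found-honoring
           (let ((parse-sexp-lookup-properties t))
             (goto-char (point-min))
             (re-search-forward "\\s<\\s<" nil t))))
      (list class found-ignoring found-honoring))))
```

Under these assumptions, `(my-syntax-property-demo)' is expected to return ("<" 4 nil): `char-syntax' still reports "<" for the first #, the search succeeds while text properties are ignored, and it fails once they are honored.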
The facility for matching chars in syntax descriptors is either not fully
documented or has some other problems. Looking into it further would take more
time than I have at the moment.
sh-script.el says:
(defvar sh-mode-syntax-table
  '((sh eval sh-mode-syntax-table ()
        ?\# "<"
        ?\n ">#"
        ?\" "\"\""
        ?\' "\"'"
        ?\` "\"`"
        ?! "_"
        ?% "_"
        ?: "_"
        ?. "_"
        ?^ "_"
        ?~ "_"
        ?< "."
        ?> ".")
    (csh eval identity sh)
    (rc eval identity sh))
  "Syntax-table used in Shell-Script mode. See `sh-feature'.")
Consider the second entry in the table, which is the equivalent of
(modify-syntax-entry ?\n ">#")
The documentation for syntax descriptors says (both in the Texinfo manual and
in the functions' docstrings) that the second character, the matching
character, is "used" only when the syntax class is "(" or ")" (open or close
parenthesis). The declaration above assigns a matching character to a
character in the endcomment syntax class. The documentation does not say that
doing this is an error, but every possibility from here implies one or more
problems. (Several major modes also appear to assign matching characters to
chars in the string delimiter (") class, usually the char itself, e.g., " with
" and ' with '; this usage is likewise problematic.)
If the declaration of ">#" is equivalent to ">" with respect to all Emacs
primitives and distributed Lisp code, then
  + sh-script.el should simply use ">" for clarity.
It may be desirable to keep a facility for assigning matching chars to
non-paren classes, so that programmers can do something with it. If so, it
should be mentioned briefly in the Texinfo documentation, if not in the
docstrings. If not, then
  + it should be documented that matching chars are ignored except
    for the "(" and ")" classes;
  + `modify-syntax-entry' should decline to install ignored matching chars,
    either by signalling an error or by silently dropping the matching
    char;
  + `describe-syntax' should not report matching chars that have no
    significance, because reporting them is confusing
    (`describe-syntax' reports that ?\n matches ?#, and likewise for
    matching chars assigned to chars in other syntax classes for which
    matching seems irrelevant).
If the declaration of ">#" is not equivalent to ">", then either the behavior
is undefined or it is well-defined but not documented. If it is undefined,
then sh-script.el should not be using it. If it is undocumented, then it
should be documented.
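The two descriptors discussed above can be compared directly. The sketch below is only a partial probe under stated assumptions: the helper name `my-compare-endcomment-descriptors' is hypothetical, and it checks just the raw table entry, `char-syntax', and `forward-comment', so it cannot show equivalence with respect to all primitives or distributed Lisp code.

```elisp
(defun my-compare-endcomment-descriptors ()
  "Hypothetical helper: compare \">\" and \">#\" as descriptors for newline.
Returns one (DESCRIPTOR RAW-ENTRY CLASS POINT-AFTER-COMMENT) list for each."
  (let (results)
    (dolist (descriptor '(">" ">#"))
      (let ((table (make-syntax-table)))
        (modify-syntax-entry ?# "<" table)
        (modify-syntax-entry ?\n descriptor table)
        (with-temp-buffer
          (set-syntax-table table)
          (insert "# hi\nfoo")
          (goto-char (point-min))
          ;; Skip the single comment that starts at the beginning of the buffer.
          (forward-comment 1)
          (push (list descriptor
                      ;; Raw entry: (CLASS-CODE . MATCHING-CHAR).
                      (aref table ?\n)
                      (char-to-string (char-syntax ?\n))
                      (point))
                results))))
    (nreverse results)))
```

In this probe the raw entries differ, (12) for ">" versus (12 . 35) for ">#", while `char-syntax' reports ">" in both cases and `forward-comment' stops at position 6 in both.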
Recent input:
M-x r e p o r t - e m a c s - b u g <return>
Recent messages:
1 <- require: gnus-group
1 -> require: gnus-start
1 <- require: gnus-start
1 -> require: gnus-util
1 <- require: gnus-util
Loading gnus-topic...done
Loading emacsbug...
1 -> require: sendmail
1 <- require: sendmail
Loading emacsbug...done