emacs-devel

Re: Tree-sitter introduction documentation


From: Philip Kaludercic
Subject: Re: Tree-sitter introduction documentation
Date: Fri, 30 Dec 2022 11:25:25 +0000

Yuan Fu <casouri@gmail.com> writes:

>> On Dec 27, 2022, at 8:44 AM, Philip Kaludercic <philipk@posteo.net> wrote:
>> 
>> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> 
>>>> It doesn't need any project, it is literally two command lines.
>>>> Here's an example:
>>>> 
>>>>  gcc -O2 -I.   -c -o parser.o parser.c
>>>>  gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
>>> 
>>> AFAIK `parser.c` is a file generated from the actual grammar's source,
>>> itself written in Javascript.
>>> 
>>> So the above instructions are akin to downloading a precompiled binary
>>> and installing it.  While it is the most convenient path for the
>>> end-users, it's important w.r.t Freedom to make sure that grammars can
>>> also be regenerated from source by the end users.
>> 
>> I have asked the question before, but freedom or not, the above is a
>> nuisance to run for every language.  If the process is as automatic as
>> the above example demonstrates, shouldn't Emacs have a command to take a
>> grammar and compile+install it?  I guess this could be more complicated
>> if the grammar is generated using a custom tool-chain for each language
>> (or is it always Javascript?), but nothing impossible.
>
> Through the magic of programming, such a command now exists:
> treesit-install-language-grammar. It needs recipes to work, though. The
> recipe would involve https://github.com, which I guess is probably too
> heretical to include in Emacs source, so I left the recipes empty. I tested
> the install command with these recipes:
>
> (setq treesit-language-source-alist
>       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>                     "typescript/src" "typescript")))
>
> Yuan

If that is acceptable, it looks good.  I would think it is OK to point
to GitHub, since we are only using it as a Git host.  Here are a few
suggestions:

diff --git a/lisp/treesit.el b/lisp/treesit.el
index b120ca68c5..651898e948 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -99,6 +99,15 @@ treesit
   :group 'tools
   :version "29.1")
 
+(defcustom treesit-enabled-modes nil
+  "List of major modes in which to enable tree-sitter support, if available.
+When initialising a major mode with potential tree-sitter
+support, this variable is consulted.  The special value t will
+enable tree-sitter support whenever possible."
+  :type '(choice (const :tag "Whenever possible" t)
+                 (repeat :tag "Specific modes" function))
+  :version "29.1")
+
 (defcustom treesit-max-buffer-size
   (let ((mb (* 1024 1024)))
     ;; 40MB for 64-bit systems, 15 for 32-bit.
@@ -2690,20 +2699,19 @@ treesit--install-language-grammar-1
 For LANG, URL, SOURCE-DIR, GRAMMAR-DIR, CC, C++, see
 `treesit-language-source-alist'.  If anything goes wrong, this
 function signals an error."
-  (let* ((lang (symbol-name lang))
-         (default-directory "/tmp")
-         (workdir (expand-file-name "treesit-workdir-00893133134"))
+  (let* ((default-directory (make-temp-file "treesit-workdir" t))
+         (workdir (expand-file-name "repo"))
          (source-dir (expand-file-name (or source-dir "src") workdir))
          (grammar-dir (expand-file-name (or grammar-dir "") workdir))
-         (cc (or cc "cc"))
-         (c++ (or c++ "c++"))
+         (cc (or cc (seq-find #'executable-find '("cc" "gcc" "c99"))
+                 (error "No C compiler found")))
+         (c++ (or c++ (seq-find #'executable-find '("c++" "g++"))))
          (soext (pcase system-type
                   ('darwin "dylib")
                   ((or 'ms-dos 'cywin 'windows-nt) "dll")
                   (_ "so")))
          (out-dir (or (and out-dir (expand-file-name out-dir))
-                      (expand-file-name
-                       "tree-sitter" user-emacs-directory)))
+                      (locate-user-emacs-file "tree-sitter")))
          (lib-name (format "libtree-sitter-%s.%s" lang soext)))
     (unwind-protect
         (with-temp-buffer
@@ -2713,8 +2721,8 @@ treesit--install-language-grammar-1
            "git" nil t nil "clone" url "--depth" "1" "--quiet"
            workdir)
           ;; cp "${grammardir}"/grammar.js "${sourcedir}"
-          (copy-file (concat grammar-dir "/grammar.js")
-                     (concat source-dir "/grammar.js"))
+          (copy-file (file-name-concat grammar-dir "grammar.js")
+                     (file-name-concat source-dir "grammar.js"))
           ;; cd "${sourcedir}"
           (setq default-directory source-dir)
           (message "Compiling library")
@@ -2723,6 +2731,7 @@ treesit--install-language-grammar-1
            cc nil t nil "-fPIC" "-c" "-I." "parser.c")
           ;; cc -fPIC -c -I. scanner.c
           (when (file-exists-p "scanner.c")
+            (unless c++ (error "No C++ compiler found"))
             (treesit--call-process-signal
              cc nil t nil "-fPIC" "-c" "-I." "scanner.c"))
           ;; c++ -fPIC -I. -c scanner.cc
@@ -2739,7 +2748,7 @@ treesit--install-language-grammar-1
                       (rx bos (+ anychar) ".o" eos))
                    "-o" ,lib-name))
           ;; Copy out.
-          (copy-file lib-name (concat out-dir "/") t)
+          (copy-file lib-name (file-name-as-directory out-dir) t)
           (message "Library installed to %s/%s" out-dir lib-name))
       (when (file-exists-p workdir)
         (delete-directory workdir t)))))
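
For illustration, here is a minimal sketch of how the install command and
the proposed option might be used from an init file.  It assumes an Emacs
built with tree-sitter support and uses only the Python recipe quoted
above.  Note that `treesit-enabled-modes' exists only if the patch above
is applied, and `python-mode' as an entry is my assumption about how a
mode would be listed there.

```elisp
;; Sketch only: assumes Emacs 29 built with tree-sitter support.
(require 'treesit)

;; Recipe quoted earlier in this thread; GitHub serves only as a Git host.
(setq treesit-language-source-alist
      '((python "https://github.com/tree-sitter/tree-sitter-python.git")))

;; Build and install each grammar that is not already loadable.
(when (treesit-available-p)
  (dolist (recipe treesit-language-source-alist)
    (let ((lang (car recipe)))
      (unless (treesit-language-available-p lang)
        (treesit-install-language-grammar lang)))))

;; Assumption: this option exists only with the patch above applied, and
;; `python-mode' is a guess at what an entry would look like.
(setq treesit-enabled-modes '(python-mode))
```

The `treesit-language-available-p' check keeps this from cloning and
recompiling a grammar on every startup.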
