emacs-devel

Re: Tree-sitter api


From: Yuan Fu
Subject: Re: Tree-sitter api
Date: Mon, 16 Aug 2021 23:18:19 -0700

> 
> I'm thinking of rules specified via a function that takes a TS node
> (from which the function can explore the rest of the TS tree) and return
> the indentation to use, represented as a pair (POSITION . OFFSET)
> (meaning to indent OFFSET columns further than the column position of
> POSITION).
> 
> The infrastructure would limit itself to making sure we have an uptodate
> tree (computed from a properly widened buffer), find the node
> corresponding to point pass it to the function and then turn the return
> value into an actual column and indent the text accordingly (paying
> attention to the usual difference between when point is "within the
> indentation" vs "within the text").

Okay, here is the (ad-hoc) infrastructure I came up with:

We have a tree-sitter-simple-indent-function. Major-mode authors can set
indent-line-function to it to use the simple-indent system.
tree-sitter-simple-indent-function indents according to
tree-sitter-simple-indent-rules. The doc string of
tree-sitter-simple-indent-rules reads:

    A list of indent rule settings.
    Each indent rule setting should be (LANGUAGE . RULES),
    where LANGUAGE is a language symbol, and RULES is a list of
    (MATCHER ANCHOR OFFSET).

    MATCHER determines whether this rule applies; ANCHOR and OFFSET
    together determine which column to indent to.

    A MATCHER is a function that takes three arguments (NODE PARENT
    BOL).  NODE is the largest (highest-in-tree) node starting at
    point.  PARENT is the parent of NODE.  BOL is the point where we
    are indenting: the beginning of line content, the position of the
    first non-whitespace character.

    If MATCHER returns non-nil, meaning the rule matches, Emacs then
    uses ANCHOR to find an anchor.  ANCHOR should be a function that
    takes the same arguments (NODE PARENT BOL) and returns a point.

    Finally, Emacs computes the column of the point returned by ANCHOR,
    adds OFFSET to it, and indents the line to that column.

    For MATCHER and ANCHOR, Emacs provides some convenient presets.
    See `tree-sitter-simple-indent-presets'.
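
The rule evaluation described in this doc string can be sketched roughly as follows. This is only an illustration, not the code on the branch: my-simple-indent-column is a hypothetical name, MATCHER and ANCHOR are assumed to be plain functions (presets are left out), and returning nil when ANCHOR yields no position is an assumption of the sketch.

```elisp
(require 'cl-lib)

(defun my-simple-indent-column (rules node parent bol)
  "Return the target column for the first matching rule in RULES, or nil.
RULES is a list of (MATCHER ANCHOR OFFSET).  MATCHER and ANCHOR are
plain functions called with NODE, PARENT and BOL."
  (cl-loop for (matcher anchor offset) in rules
           when (funcall matcher node parent bol)
           return (let ((anchor-pos (funcall anchor node parent bol)))
                    ;; An anchor may decline to give a position; then
                    ;; this sketch returns nil and nothing is indented.
                    (when anchor-pos
                      (+ offset
                         (save-excursion
                           (goto-char anchor-pos)
                           (current-column)))))))
```

A caller would presumably pass a non-nil result to indent-line-to and leave the line alone otherwise.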

And doc string for tree-sitter-simple-indent-presets:

    A list of presets.
    These presets can be used as MATCHER and ANCHOR in
    `tree-sitter-simple-indent-rules'.

    MATCHER:

    (match NODE-TYPE PARENT-TYPE NODE-FIELD NODE-INDEX-MIN NODE-INDEX-MAX)

        NODE-TYPE checks the node's type, PARENT-TYPE checks the
        parent's type, NODE-FIELD checks the field name of the node
        in the parent, and NODE-INDEX-MIN and NODE-INDEX-MAX check
        the node's index in the parent.  Therefore, to match the
        first child where the parent is "argument_list", use (match
        nil "argument_list" nil 0 0).

    no-node

        Matches the case where node is nil, i.e., there is no node
        that starts at point.  This is the case when indenting an
        empty line.

    (node-at-point TYPE NAMED)

        Checks that the node at point -- not the largest node starting
        at point -- has type TYPE.  If NAMED is non-nil, checks the named
        node at point.

    (parent-is TYPE)

        Checks that the parent has type TYPE.

    (node-is TYPE)

        Checks that the node has type TYPE.

    (parent-match PATTERN)

        Checks that the parent matches PATTERN, a query pattern.

    (node-match PATTERN)

        Checks that the node matches PATTERN, a query pattern.

    ANCHOR:

    first-child

        Find the first child of the parent.

    parent

        Find the parent.

    prev-sibling

        Find node's previous sibling.

    no-indent

        Do nothing.

    prev-line

        Find the named node on previous line.  This can be used when
        indenting an empty line: just indent like the previous node.

An example of using these facilities can be found in
ts-c-tree-sitter-indent-rules.

For example, 

    ((match nil "function_definition" "body") parent 0)

means “match the node whose parent’s type is "function_definition" and whose
field name is "body", and indent it to the start of its parent”. That indents
the opening brace in

    int main ()
    {
    }

    ((parent-is "call_expression") parent 2)

means “match the node whose parent’s type is "call_expression", and indent it
to the start of its parent + 2”. That indents the second line in

    my_cool_function
      (arg1, arg2, arg3)
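
Put together, the two rules above would sit in one (LANGUAGE . RULES) entry. The snippet below is only a sketch of that shape: my-ts-c-indent-rules is a hypothetical variable name, and the language symbol c is an assumption (the symbol actually used on the branch may differ).

```elisp
;; Hypothetical variable; the language symbol `c' is an assumption.
(defvar my-ts-c-indent-rules
  '((c
     ;; Opening brace of a function body: align with the definition.
     ((match nil "function_definition" "body") parent 0)
     ;; Continuation line of a call: two columns past the call's start.
     ((parent-is "call_expression") parent 2)))
  "Sketch of a value for `tree-sitter-simple-indent-rules'.")
```

A major mode would then presumably set tree-sitter-simple-indent-rules to such a value and set indent-line-function to tree-sitter-simple-indent-function.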

I’ve implemented some indentation rules for C in ts-c-mode as usual. I expect
someone more knowledgeable in C to implement them properly later.

So… do you think this is ok, or convoluted? In particular, is there a better
way to implement those “presets”? I don’t want to define them as normal
functions, because then their names would be very long (parent-is ->
tree-sitter-simple-indent-parent-is) and annoying to use when writing rules,
but putting them in an alist (tree-sitter-simple-indent-presets) is a bit
ad-hoc. I call these presets with tree-sitter--simple-apply, which basically
looks up the preset in tree-sitter-simple-indent-presets, gets the function,
and applies it.
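
To make the alist approach concrete, here is a minimal sketch of that kind of lookup. It is not the code on the branch: my-indent-presets and my-indent-apply are hypothetical names, nodes are modeled as plists rather than real tree-sitter nodes, and passing a preset's own arguments ahead of NODE PARENT BOL is an assumption made for the sketch.

```elisp
(defvar my-indent-presets
  (list (cons 'parent-is
              ;; Matcher preset with one argument of its own (TYPE).
              (lambda (type _node parent _bol)
                (equal (plist-get parent :type) type)))
        (cons 'no-node
              ;; Matcher preset without arguments.
              (lambda (node _parent _bol) (null node)))
        (cons 'parent
              ;; Anchor preset: start position of the parent.
              (lambda (_node parent _bol) (plist-get parent :start))))
  "Alist mapping short preset names to functions (hypothetical).")

(defun my-indent-apply (spec node parent bol)
  "Apply SPEC to NODE, PARENT and BOL.
SPEC is a preset symbol, a list (PRESET ARGS...), or a function."
  (cond
   ;; Bare symbol such as `no-node' or `parent'.
   ((and (symbolp spec) (assq spec my-indent-presets))
    (funcall (cdr (assq spec my-indent-presets)) node parent bol))
   ;; List such as (parent-is "call_expression").
   ((and (consp spec) (assq (car spec) my-indent-presets))
    (apply (cdr (assq (car spec) my-indent-presets))
           (append (cdr spec) (list node parent bol))))
   ;; Anything else is assumed to be an ordinary function.
   ((functionp spec) (funcall spec node parent bol))
   (t (error "Unknown matcher or anchor: %S" spec))))
```

The short names live only in the alist, so they do not need a long package prefix, which is the trade-off described above.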

You can find the latest version at https://github.com/casouri/emacs/tree/ts
I.e., git clone https://github.com/casouri/emacs.git --branch ts

Yuan

