

2.1.5 Structured Text

The classes generated for tree-sitter use the rules stored in each language’s grammar file to reproduce source text implicitly at the class level. This makes working with and mutating the AST much simpler. For example, if an ’else’ clause is added to an ’if’ statement AST that lacks one, the source text of the AST reflects the new ’else’ clause without any other updates. (Prior to structured text, slots holding connective white-space and punctuation required manual updates to accompany most changes to the content of an AST.)
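The idea of deriving source text from the rule rather than from stored punctuation can be illustrated with a small toy model. This is only a sketch in Python: the `IfStatement` class and its slot names are hypothetical and are not the classes generated for tree-sitter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IfStatement:
    """Toy 'if' node whose text is derived from its rule, not stored."""
    condition: str
    consequence: str
    alternative: Optional[str] = None  # the optional 'else' branch

    def source_text(self) -> str:
        # Keywords and punctuation come from the rule itself, so they
        # never need to be updated by hand when a slot changes.
        text = f"if ({self.condition}) {self.consequence}"
        if self.alternative is not None:
            text += f" else {self.alternative}"
        return text


# Adding an 'else' branch only means filling one slot.
node = IfStatement("x", "a();")
node.alternative = "b();"
```

After the assignment, `node.source_text()` includes the ’else’ keyword even though no slot for the keyword was touched.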

Each generated class can have multiple subclasses, one for each form of source text that the base class can take. For example, the update expression in C covers both pre-increment and post-increment, so two subclasses are generated to disambiguate the source text representations: one for pre-increment and one for post-increment.

Frequently, these subclass ASTs are copied with slight modifications to their slot values. This can leave the copy in a state that is invalid for the subclass it was copied from. When this is detected, the AST’s class is changed dynamically to the first subclass of the base class that can successfully produce source text with the given slot values. This behavior also applies to objects created with the base class, but it may choose a subclass whose source text is not the desired representation, so it’s best to specify the exact subclass in cases where this matters, such as update expressions in C.
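The "first subclass that can produce source text" behavior can be sketched with a toy model. The Python classes below (`UpdateExpression`, `PrefixUpdate`, `PostfixUpdate`) and the `resolve` helper are hypothetical names chosen for illustration; they only mimic the selection strategy described above and are not the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpdateExpression:
    """Toy base class: an operand with an operator on one side."""
    operand: str
    prefix_op: Optional[str] = None
    postfix_op: Optional[str] = None

    def source_text(self) -> str:
        raise ValueError("the base class has no single source text form")


class PrefixUpdate(UpdateExpression):
    def source_text(self) -> str:
        if self.prefix_op is None or self.postfix_op is not None:
            raise ValueError("slots do not fit the prefix form")
        return f"{self.prefix_op}{self.operand}"


class PostfixUpdate(UpdateExpression):
    def source_text(self) -> str:
        if self.postfix_op is None or self.prefix_op is not None:
            raise ValueError("slots do not fit the postfix form")
        return f"{self.operand}{self.postfix_op}"


# Order matters: the first subclass that succeeds is chosen.
SUBCLASSES = [PrefixUpdate, PostfixUpdate]


def resolve(node: UpdateExpression) -> UpdateExpression:
    """Return NODE re-classed to the first subclass that can print it."""
    for cls in SUBCLASSES:
        candidate = cls(node.operand, node.prefix_op, node.postfix_op)
        try:
            candidate.source_text()
        except ValueError:
            continue
        return candidate
    raise ValueError("no subclass can produce source text")


# A prefix node copied with its operator moved to the other side
# is no longer valid as a PrefixUpdate, so it gets re-classed.
stale = PrefixUpdate("i", prefix_op=None, postfix_op="++")
fixed = resolve(stale)
```

Here `fixed` is a `PostfixUpdate`. In this toy the slot values identify exactly one subclass; when several subclasses could print the same slots, the order of `SUBCLASSES` decides, which is why naming the exact subclass is the safer choice.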

Structured text ASTs contain at least 4 slots which help store information that isn’t implicit to the AST or its parent AST:

The internal-asts slots are generated based on the rule associated with the AST. An internal-asts slot is placed at every point in the rule where two terminal tokens can appear consecutively.
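The placement rule can be sketched for the simplest case. The Python function below is a hypothetical illustration that assumes a rule is a flat sequence of terminals and fields; real grammar rules also contain choices and repetitions, which this toy ignores.

```python
from typing import List, Tuple

# Each rule element is (kind, name), where kind is "terminal" or "field".
RuleElement = Tuple[str, str]


def internal_slot_positions(rule: List[RuleElement]) -> List[int]:
    """Return each index i where rule[i] and rule[i + 1] are both
    terminals, i.e. where a slot would sit between two tokens."""
    return [
        i
        for i in range(len(rule) - 1)
        if rule[i][0] == "terminal" and rule[i + 1][0] == "terminal"
    ]


# A simplified, made-up rule shaped like a C do-while statement.
do_while = [
    ("terminal", "do"),
    ("field", "body"),
    ("terminal", "while"),
    ("terminal", "("),
    ("field", "condition"),
    ("terminal", ")"),
    ("terminal", ";"),
]
positions = internal_slot_positions(do_while)
```

For this rule the function reports two positions: between `while` and `(`, and between `)` and `;`.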

A further ’text’ slot is also used for a subset of ASTs that are known to be computed-text ASTs. These ASTs hold information that is variable and must be computed and stored when the AST is created. Computed-text ASTs can be identified with computed-text-node-p.

When creating ASTs, patch-whitespace can be used to insert whitespace in relevant places. It uses whitespace-between to determine how much whitespace should be placed in each slot. It does not currently populate inner-asts whitespace.
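The division of labor between the two functions can be sketched as follows. This Python toy works on a flat token list rather than on AST slots, and `toy_whitespace_between` and `toy_patch_whitespace` are hypothetical stand-ins with made-up spacing rules; they are not the library's patch-whitespace or whitespace-between.

```python
from typing import List


def toy_whitespace_between(left: str, right: str) -> str:
    """Decide how much whitespace belongs between two adjacent tokens."""
    if left == "(" or right in {")", ";", ","}:
        return ""  # no space after '(' or before closing punctuation
    return " "


def toy_patch_whitespace(tokens: List[str]) -> str:
    """Join TOKENS, asking toy_whitespace_between about every gap."""
    if not tokens:
        return ""
    pieces = [tokens[0]]
    for left, right in zip(tokens, tokens[1:]):
        pieces.append(toy_whitespace_between(left, right))
        pieces.append(right)
    return "".join(pieces)


patched = toy_patch_whitespace(["if", "(", "x", ")", "return", "y", ";"])
```

The patching function only walks the gaps; every spacing decision lives in the pairwise function, so changing the style means changing one place.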

