path: root/2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.sbv
0:00:01.520,0:00:07.200
hello everyone my name is Tuan-Anh

0:00:04.400,0:00:09.280
I've been using Emacs for about 10 years

0:00:07.200,0:00:11.519
today I'm going to talk about tree-sitter

0:00:09.280,0:00:13.759
a new Emacs package that allows Emacs to

0:00:11.519,0:00:17.840
parse multiple programming languages

0:00:13.759,0:00:21.840
in real time

0:00:17.840,0:00:23.359
so what is the problem statement

0:00:21.840,0:00:24.960
in order to support programming

0:00:23.359,0:00:25.760
functionalities for a particular

0:00:24.960,0:00:27.680
language

0:00:25.760,0:00:29.679
a text editor needs to have some degree

0:00:27.680,0:00:31.840
of language understanding

0:00:29.679,0:00:33.840
traditionally text editors have relied

0:00:31.840,0:00:34.960
very heavily on regular expressions for

0:00:33.840,0:00:38.320
this

0:00:34.960,0:00:39.280
Emacs is no different most language

0:00:38.320,0:00:40.879
major modes use

0:00:39.280,0:00:42.960
regular expressions for syntax

0:00:40.879,0:00:46.239
highlighting code navigation

0:00:42.960,0:00:47.440
folding indexing and so on regular

0:00:46.239,0:00:50.559
expressions are

0:00:47.440,0:00:53.600
problematic for a couple of reasons

0:00:50.559,0:00:54.000
they're slow and inaccurate they also

0:00:53.600,0:00:56.800
make

0:00:54.000,0:00:57.440
the code hard to read and write

0:00:56.800,0:00:59.199
sometimes

0:00:57.440,0:01:01.199
it's because the regular expressions

0:00:59.199,0:01:04.000
themselves are very hairy

0:01:01.199,0:01:05.199
and sometimes because they are just not

0:01:04.000,0:01:07.840
powerful enough

0:01:05.199,0:01:11.200
some helper code is usually needed to

0:01:07.840,0:01:13.280
parse more intricate language features

0:01:11.200,0:01:16.159
that also illustrates the core problem

0:01:13.280,0:01:18.400
with regular expressions

0:01:16.159,0:01:21.119
in that they are not powerful enough to

0:01:18.400,0:01:22.640
parse programming languages

0:01:21.119,0:01:25.040
an example feature that regular

0:01:22.640,0:01:27.520
expressions cannot handle very well

0:01:25.040,0:01:28.320
is string interpolation which is a very

0:01:27.520,0:01:31.680
common feature

0:01:28.320,0:01:34.079
in many modern programming languages
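
A small sketch of why nesting defeats a regular expression here, using Python's standard `ast` module as a stand-in for a real parser (it is not tree-sitter, and the sample f-string is made up for illustration). The naive pattern only finds the innermost brace pair, while the parse tree reports a single interpolation holding a subscript expression.

```python
import ast
import re

# Hypothetical sample: an f-string whose interpolation contains a dict literal.
source = '''f"value: { {'a': 1}['a'] } end"'''

# Naive regex: an opening brace, anything without braces, a closing brace.
naive = re.findall(r"\{[^{}]*\}", source)

# Structural view: parse the expression and collect the interpolated parts.
tree = ast.parse(source, mode="eval")
interpolations = [v for v in tree.body.values if isinstance(v, ast.FormattedValue)]
kinds = [type(v.value).__name__ for v in interpolations]

print(naive)  # the regex only sees the inner dict literal
print(kinds)  # the parser sees one interpolation holding a subscript expression
```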

0:01:31.680,0:01:35.840
it would be much nicer if Emacs somehow

0:01:34.079,0:01:36.479
had structural understanding of source

0:01:35.840,0:01:39.439
code

0:01:36.479,0:01:39.439
like ides do

0:01:39.520,0:01:42.960
there have been multiple efforts to

0:01:41.119,0:01:45.280
bring this kind of programming language

0:01:42.960,0:01:47.119
understanding into Emacs

0:01:45.280,0:01:48.640
there are language-specific parsers

0:01:47.119,0:01:50.240
written in elisp

0:01:48.640,0:01:52.320
they can be thought of as the next

0:01:50.240,0:01:54.960
logical step of the glue code on top

0:01:52.320,0:01:56.000
of regular expressions moving from

0:01:54.960,0:01:58.079
partial local

0:01:56.000,0:01:59.840
pattern recognition into a full-fledged

0:01:58.079,0:02:01.439
parser

0:01:59.840,0:02:03.040
the most prominent example of this

0:02:01.439,0:02:06.159
approach is probably the famous

0:02:03.040,0:02:06.159
js2 mode

0:02:06.479,0:02:12.959
however this approach has several issues

0:02:10.080,0:02:13.680
parsing is computationally expensive and

0:02:12.959,0:02:16.800
Emacs Lisp

0:02:13.680,0:02:18.400
is not good at that kind of stuff

0:02:16.800,0:02:20.840
furthermore maintenance is very

0:02:18.400,0:02:22.160
troublesome in order to work on these

0:02:20.840,0:02:23.599
parsers

0:02:22.160,0:02:25.599
first you have to know elisp well

0:02:23.599,0:02:27.760
enough and then you have to be

0:02:25.599,0:02:30.319
comfortable with writing a

0:02:27.760,0:02:32.080
recursive descent parser while

0:02:30.319,0:02:34.000
constantly keeping up with changes to

0:02:32.080,0:02:36.879
the language itself

0:02:34.000,0:02:39.360
which can be evolving very quickly like

0:02:36.879,0:02:41.599
javascript for example

0:02:39.360,0:02:45.680
together these constraints significantly

0:02:41.599,0:02:47.760
reduce the pool of potential maintainers

0:02:45.680,0:02:49.680
the biggest issue though in my opinion

0:02:47.760,0:02:52.879
is the lack of a set of generic

0:02:49.680,0:02:54.319
and reusable apis this makes them very

0:02:52.879,0:02:55.920
hard to use

0:02:54.319,0:02:57.920
for minor modes that want to deal with

0:02:55.920,0:02:59.920
cross-cutting concerns across multiple

0:02:57.920,0:03:01.760
languages

0:02:59.920,0:03:03.599
the other approach which has been

0:03:01.760,0:03:04.319
gaining a lot of momentum in recent

0:03:03.599,0:03:06.560
years

0:03:04.319,0:03:08.159
is externalizing language understanding

0:03:06.560,0:03:12.239
to another process

0:03:08.159,0:03:14.480
also known as language server protocol

0:03:12.239,0:03:16.560
this second approach is actually a very

0:03:14.480,0:03:18.400
interesting one

0:03:16.560,0:03:21.280
by decoupling language understanding

0:03:18.400,0:03:23.760
from the editing facility itself

0:03:21.280,0:03:25.120
the lsp servers can attract a lot more

0:03:23.760,0:03:28.959
contributors

0:03:25.120,0:03:32.400
which makes maintenance easier however

0:03:28.959,0:03:34.720
they also have several issues

0:03:32.400,0:03:36.000
being a separate process they are

0:03:34.720,0:03:39.920
usually more resource

0:03:36.000,0:03:42.159
intensive and depending on the language

0:03:39.920,0:03:44.640
the lsp server itself can bring with it

0:03:42.159,0:03:47.680
a host of additional dependencies

0:03:44.640,0:03:50.560
external to Emacs which may be messy to

0:03:47.680,0:03:50.560
install and manage

0:03:50.640,0:03:55.120
furthermore json over rpc has pretty

0:03:53.760,0:03:57.840
high latency

0:03:55.120,0:04:00.879
for one-off tasks like jumping to source

0:03:57.840,0:04:03.040
or on-demand completion it's great

0:04:00.879,0:04:06.000
but for things like code highlighting

0:04:03.040,0:04:08.319
the latency is just too much

0:04:06.000,0:04:10.480
I was using rust and I was following the

0:04:08.319,0:04:11.760
community effort to improve its ide

0:04:10.480,0:04:13.680
support

0:04:11.760,0:04:15.760
hoping to integrate some of that into

0:04:13.680,0:04:17.600
Emacs itself

0:04:15.760,0:04:19.759
then I heard someone from community

0:04:17.600,0:04:23.280
mention tree sitter

0:04:19.759,0:04:23.280
and I decided to check it out

0:04:23.360,0:04:28.720
basically tree-sitter is an incremental

0:04:25.520,0:04:31.000
parsing library and a parser generator

0:04:28.720,0:04:33.040
it was introduced by the atom editor in

0:04:31.000,0:04:35.680
2018

0:04:33.040,0:04:36.960
besides atom it is also being integrated

0:04:35.680,0:04:41.040
into the neovim

0:04:36.960,0:04:42.479
editor and github is using it to power

0:04:41.040,0:04:45.600
their source code analysis and

0:04:42.479,0:04:45.600
navigation features

0:04:45.840,0:04:49.199
it is written in c and can be compiled

0:04:48.639,0:04:53.120
for all

0:04:49.199,0:04:56.080
major platforms it can even be compiled

0:04:53.120,0:04:57.600
to web assembly to run on the web that's

0:04:56.080,0:05:00.400
how github is using it

0:04:57.600,0:05:00.400
on their website

0:05:00.800,0:05:05.840
so why is tree-sitter an interesting

0:05:02.960,0:05:07.360
solution to this problem

0:05:05.840,0:05:10.000
there are multiple features that make it

0:05:07.360,0:05:12.400
an attractive option

0:05:10.000,0:05:13.680
it is designed to be fast by being

0:05:12.400,0:05:15.680
incremental

0:05:13.680,0:05:18.160
the initial parse of a typical big file

0:05:15.680,0:05:20.240
can take tens of milliseconds

0:05:18.160,0:05:22.560
while subsequent incremental parses

0:05:20.240,0:05:24.720
are sub milliseconds

0:05:22.560,0:05:26.240
it achieves this by using structural

0:05:24.720,0:05:29.360
sharing

0:05:26.240,0:05:32.960
meaning replacing only affected nodes

0:05:29.360,0:05:36.000
in the old tree when it needs to
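
The idea of structural sharing can be sketched with a toy example. This is not how tree-sitter is implemented (it tracks edited ranges and reuses whole subtrees); the `Leaf`, `Tree`, `parse` and `reparse` names are hypothetical, and the "grammar" is simply one leaf node per line. The point is only that an edit produces a new tree which still points at the untouched nodes of the old one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    text: str

@dataclass(frozen=True)
class Tree:
    children: tuple

def parse(lines):
    """Full parse: build a fresh node for every line."""
    return Tree(tuple(Leaf(line) for line in lines))

def reparse(old, new_lines):
    """Incremental parse: reuse the old node wherever the line is unchanged."""
    children = []
    for i, line in enumerate(new_lines):
        if i < len(old.children) and old.children[i].text == line:
            children.append(old.children[i])  # shared with the old tree
        else:
            children.append(Leaf(line))       # only affected nodes are rebuilt
    return Tree(tuple(children))

old_tree = parse(["a = 1", "b = 2", "c = 3"])
new_tree = reparse(old_tree, ["a = 1", "b = 20", "c = 3"])
shared = sum(n is o for n, o in zip(new_tree.children, old_tree.children))
print(shared)  # number of nodes reused from the old tree
```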

0:05:32.960,0:05:37.120
also unlike lsp being in the same

0:05:36.000,0:05:40.639
process

0:05:37.120,0:05:42.880
it has much lower latency

0:05:40.639,0:05:44.960
secondly it provides a uniform

0:05:42.880,0:05:47.039
programming interface

0:05:44.960,0:05:48.720
the same data structures and functions

0:05:47.039,0:05:50.400
work on parse trees of different

0:05:48.720,0:05:52.160
languages

0:05:50.400,0:05:54.160
syntax nodes of different languages

0:05:52.160,0:05:57.360
differ only by their types

0:05:54.160,0:05:58.960
and their possible child nodes this

0:05:57.360,0:06:02.000
is a big advantage over language

0:05:58.960,0:06:02.000
specific parsers

0:06:02.240,0:06:06.880
thirdly it's written in self-contained

0:06:04.880,0:06:09.680
embeddable c

0:06:06.880,0:06:10.400
as I mentioned previously it can even be

0:06:09.680,0:06:13.759
compiled

0:06:10.400,0:06:15.199
to webassembly this makes integrating it

0:06:13.759,0:06:18.240
into various editors

0:06:15.199,0:06:21.840
quite easy without having to install

0:06:18.240,0:06:21.840
any external dependencies

0:06:22.880,0:06:28.000
one thing that is not mentioned here is

0:06:24.639,0:06:31.039
that being a parser generator

0:06:28.000,0:06:34.880
its grammars are declarative

0:06:31.039,0:06:36.720
together with being editor independent

0:06:34.880,0:06:38.160
this makes the pool of potential

0:06:36.720,0:06:42.400
contributors

0:06:38.160,0:06:45.520
much larger so I was convinced

0:06:42.400,0:06:48.000
that tree-sitter is a good fit for Emacs

0:06:45.520,0:06:48.720
last year I started writing the bindings

0:06:48.000,0:06:50.960
using

0:06:48.720,0:06:53.280
dynamic module support introduced in Emacs

0:06:50.960,0:06:55.360
25.

0:06:53.280,0:06:58.479
dynamic module means there is platform

0:06:55.360,0:07:00.560
specific native code involved

0:06:58.479,0:07:02.880
but since there are pre-compiled binaries

0:07:00.560,0:07:06.319
for the three major platforms

0:07:02.880,0:07:08.319
it should work in most places currently

0:07:06.319,0:07:09.440
the core functionalities are in a pretty

0:07:08.319,0:07:12.560
good shape

0:07:09.440,0:07:14.840
syntax highlighting is working nicely

0:07:12.560,0:07:16.080
the whole thing is split into three

0:07:14.840,0:07:17.759
packages

0:07:16.080,0:07:20.319
tree sitter is the main package that

0:07:17.759,0:07:22.800
other packages should depend on

0:07:20.319,0:07:24.000
tree-sitter-langs is the language bundle

0:07:22.800,0:07:27.199
that includes support

0:07:24.000,0:07:30.080
for most common languages

0:07:27.199,0:07:32.160
and finally the core apis are in the

0:07:30.080,0:07:36.160
package tsc

0:07:32.160,0:07:38.800
which stands for tree-sitter core

0:07:36.160,0:07:41.919
it is the implicit dependency of the

0:07:38.800,0:07:41.919
tree-sitter package

0:07:43.520,0:07:47.520
the main package includes the minor mode

0:07:46.000,0:07:49.840
tree-sitter-mode

0:07:47.520,0:07:52.560
this provides the base for other major

0:07:49.840,0:07:55.280
or minor modes to build on

0:07:52.560,0:07:55.840
using Emacs's change tracking hooks it

0:07:55.280,0:07:58.080
enables

0:07:55.840,0:08:00.800
incremental parsing and provides a

0:07:58.080,0:08:04.080
syntax tree that is always up to date

0:08:00.800,0:08:06.560
after any edits in a buffer

0:08:04.080,0:08:10.080
there is also a basic debug mode that

0:08:06.560,0:08:13.360
shows the parse tree in another buffer

0:08:10.080,0:08:15.759
here is a quick demo

0:08:13.360,0:08:17.520
here I am in an empty python buffer with

0:08:15.759,0:08:19.440
tree-sitter enabled

0:08:17.520,0:08:26.560
I'm going to turn on the debug mode to

0:08:19.440,0:08:28.720
see the parse tree

0:08:26.560,0:08:30.639
since the buffer is empty there is only

0:08:28.720,0:08:33.279
one node in the syntax tree the top

0:08:30.639,0:08:41.839
level module node

0:08:33.279,0:08:41.839
let's try typing some code

0:09:11.040,0:09:14.640
as you can see as I type into the python

0:09:13.600,0:09:19.120
buffer

0:09:14.640,0:09:21.120
the syntax tree updates in real time

0:09:19.120,0:09:23.279
the other minor mode included in the

0:09:21.120,0:09:26.640
main package is tree-sitter

0:09:23.279,0:09:28.480
hl mode it overrides font-lock mode and

0:09:26.640,0:09:31.839
provides its own set of faces

0:09:28.480,0:09:32.800
and customization options it is query

0:09:31.839,0:09:35.200
driven

0:09:32.800,0:09:36.240
that means instead of regular

0:09:35.200,0:09:38.720
expressions

0:09:36.240,0:09:40.320
it uses a lisp-like query language to

0:09:38.720,0:09:43.760
map syntax nodes

0:09:40.320,0:09:45.760
to highlighting faces I'm going to

0:09:43.760,0:09:51.839
open a python file with small snippets

0:09:45.760,0:09:51.839
that showcase syntax highlighting

0:09:54.320,0:09:59.279
so this is the default highlighting

0:09:55.920,0:09:59.279
provided by python mode

0:10:00.880,0:10:04.640
this is the highlighting enabled by tree

0:10:02.839,0:10:07.680
sitter

0:10:04.640,0:10:11.680
as you can see string interpolation

0:10:07.680,0:10:15.440
and decorators are highlighted correctly

0:10:11.680,0:10:15.440
function calls are also highlighted

0:10:17.440,0:10:21.839
you can also note that property

0:10:20.240,0:10:24.640
accessors

0:10:21.839,0:10:27.200
and property assignments are highlighted

0:10:24.640,0:10:27.200
differently

0:10:27.440,0:10:30.880
what I like the most about this is that

0:10:29.360,0:10:32.640
new bindings are consistently

0:10:30.880,0:10:36.320
highlighted

0:10:32.640,0:10:39.760
this includes local variables

0:10:36.320,0:10:42.480
function parameters and property

0:10:39.760,0:10:42.480
mutations

0:10:45.760,0:10:49.279
before going through the tree queries

0:10:48.000,0:10:51.680
and the syntax highlighting

0:10:49.279,0:10:53.760
customization options

0:10:51.680,0:10:55.040
let's take a brief look at the core data

0:10:53.760,0:10:58.079
structures and functions

0:10:55.040,0:10:59.839
that tree sitter provides

0:10:58.079,0:11:02.240
so parsing is done with the help of a

0:10:59.839,0:11:04.160
generic parser object

0:11:02.240,0:11:06.000
a single parser object can be used to

0:11:04.160,0:11:08.320
parse different languages

0:11:06.000,0:11:09.279
by sending different language objects to

0:11:08.320,0:11:10.880
it

0:11:09.279,0:11:14.079
the language objects themselves are

0:11:10.880,0:11:16.079
loaded from shared libraries

0:11:14.079,0:11:17.360
since tree-sitter-mode already handles

0:11:16.079,0:11:19.440
the parsing part

0:11:17.360,0:11:20.800
we will instead focus on the functions

0:11:19.440,0:11:24.720
that inspect nodes

0:11:20.800,0:11:24.720
in the resulting parse tree

0:11:25.279,0:11:43.839
we can ask tree sitter what is the

0:11:27.200,0:11:43.839
syntax node at point

0:11:44.240,0:11:48.480
uh it is an opaque object so this is not

0:11:47.200,0:11:57.839
very useful

0:11:48.480,0:11:57.839
we can instead ask what is its type

0:12:03.760,0:12:08.959
so its type is the symbol comparison

0:12:06.560,0:12:11.600
operator

0:12:08.959,0:12:13.680
in tree-sitter there are two kinds of nodes

0:12:11.600,0:12:15.519
anonymous nodes and named nodes

0:12:13.680,0:12:17.040
anonymous nodes correspond to simple

0:12:15.519,0:12:19.839
grammar elements

0:12:17.040,0:12:21.279
like keywords operators punctuations and

0:12:19.839,0:12:24.160
so on

0:12:21.279,0:12:25.920
named nodes on the other hand are grammar

0:12:24.160,0:12:26.639
elements that are interesting enough on

0:12:25.920,0:12:30.320
their own

0:12:26.639,0:12:31.839
to have a name like an identifier an

0:12:30.320,0:12:35.200
expression

0:12:31.839,0:12:35.200
or a function definition

0:12:35.440,0:12:41.519
named node types are symbols while

0:12:37.760,0:12:41.519
anonymous node types are strings

0:12:42.639,0:12:49.519
for example if we are on this

0:12:46.320,0:12:49.519
comparison operator

0:12:49.760,0:12:53.839
the node type should be a string

0:12:55.920,0:12:58.959
we can also get other information about

0:12:57.920,0:13:07.839
the node

0:12:58.959,0:13:07.839
for example what is its text

0:13:09.680,0:13:35.839
or where it is in the buffer

0:13:20.800,0:13:35.839
or what is its parent

0:13:43.199,0:13:46.839
there are many other apis to query

0:13:46.160,0:13:49.839
node

0:13:46.839,0:13:49.839
properties
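
The node inspection steps from the demo (node at point, type, text, position, parent) can be mimicked with a hand-built toy tree. This is only a sketch: the `Node` class and the `node_at` and `node_text` helpers are hypothetical and are not the package's API, and the tree for `a < b` is written out by hand rather than produced by a parser. A `named` flag stands in for the symbol versus string distinction described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Node:
    type: str    # e.g. "identifier", or "<" for an anonymous node
    named: bool  # False for keywords, operators, punctuation
    start: int   # start offset in the source text
    end: int     # end offset (exclusive)
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

def node_at(root: Node, pos: int) -> Node:
    """Return the smallest node whose range contains pos."""
    node = root
    while True:
        for child in node.children:
            if child.start <= pos < child.end:
                node = child
                break
        else:
            return node  # no child contains pos, so this node is the smallest

def node_text(source: str, node: Node) -> str:
    return source[node.start:node.end]

source = "a < b"
root = Node("module", True, 0, 5)
comparison = root.add(Node("comparison_operator", True, 0, 5))
comparison.add(Node("identifier", True, 0, 1))
comparison.add(Node("<", False, 2, 3))
comparison.add(Node("identifier", True, 4, 5))

at_point = node_at(root, 2)
print(at_point.type, at_point.named, node_text(source, at_point))
print(at_point.parent.type, node_text(source, at_point.parent))
```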

0:13:52.639,0:13:58.240
tree sitter allows searching for

0:13:54.399,0:14:01.440
structural patterns within a parse tree

0:13:58.240,0:14:03.519
it does so through a lisp-like language

0:14:01.440,0:14:04.639
this language supports pattern matching

0:14:03.519,0:14:07.760
by node types

0:14:04.639,0:14:10.079
field names and predicates

0:14:07.760,0:14:12.639
it also allows capturing nodes for

0:14:10.079,0:14:17.839
further processing

0:14:12.639,0:14:17.839
let's try to see some examples

0:14:37.680,0:14:43.839
so in this very simple query we just

0:14:41.040,0:14:46.399
try to highlight all the identifiers in

0:14:43.839,0:14:46.399
the buffer

0:14:49.040,0:14:53.120
this at sign tells tree-sitter to capture a

0:14:51.920,0:14:55.839
node

0:14:53.120,0:14:57.360
in the context of the query builder it's

0:14:55.839,0:15:00.320
not very important

0:14:57.360,0:15:01.760
but in normal highlighting query this

0:15:00.320,0:15:05.920
will determine

0:15:01.760,0:15:05.920
the face used to highlight the node

0:15:06.639,0:15:10.320
suppose we want to capture all the

0:15:08.800,0:15:13.519
function names

0:15:10.320,0:15:27.839
instead of just any identifier

0:15:13.519,0:15:27.839
you can improve the query like this

0:15:29.440,0:15:32.639
uh this will highlight the whole

0:15:31.600,0:15:35.519
definition

0:15:32.639,0:15:36.399
but we only want to capture the function

0:15:35.519,0:15:39.600
name

0:15:36.399,0:15:42.800
which means the identifier

0:15:39.600,0:15:46.320
here so we

0:15:42.800,0:15:48.639
move the capture to after the identifier

0:15:46.320,0:15:48.639
node

0:15:49.600,0:15:52.959
if we want to capture the class names as

0:15:51.759,0:16:09.839
well

0:15:52.959,0:16:09.839
we just add another pattern
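
What the queries in this part of the demo do can be approximated by a toy matcher. This is a sketch under stated assumptions: the `Node` class, `walk` and `run_query` are hypothetical, the tree is hand-built, and each pattern is reduced to a (node type, field name, capture name) triple, loosely corresponding to a query such as `(function_definition name: (identifier) @function)`. Note that the capture sits on the `name` child, not on the whole definition.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    type: str
    text: str = ""
    fields: dict = field(default_factory=dict)    # field name -> child node
    children: list = field(default_factory=list)  # nested definitions

def walk(node):
    """Yield node and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)

def run_query(root, patterns):
    """Return (capture name, captured text) pairs for every matching node."""
    captures = []
    for node in walk(root):
        for node_type, field_name, capture in patterns:
            if node.type == node_type and field_name in node.fields:
                captures.append((capture, node.fields[field_name].text))
    return captures

def definition(kind, name, children=()):
    return Node(kind, fields={"name": Node("identifier", name)}, children=list(children))

# Hand-built stand-in for: def area ...; class Shape with a method perimeter.
root = Node("module", children=[
    definition("function_definition", "area"),
    definition("class_definition", "Shape",
               [definition("function_definition", "perimeter")]),
])

patterns = [
    ("function_definition", "name", "function"),
    ("class_definition", "name", "type"),  # the added second pattern
]
captures = run_query(root, patterns)
print(captures)
```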

0:16:10.079,0:16:14.399
let's look at a more practical example

0:16:20.320,0:16:23.759
here we can see that single quotes

0:16:22.959,0:16:25.600
strings and

0:16:23.759,0:16:27.279
double quote strings are highlighted

0:16:25.600,0:16:30.399
the same

0:16:27.279,0:16:33.440
but in some places

0:16:30.399,0:16:35.440
because of some coding conventions

0:16:33.440,0:16:37.279
it may be desirable to highlight them

0:16:35.440,0:16:39.680
differently for example if

0:16:37.279,0:16:40.880
the string is single quoted we may want

0:16:39.680,0:16:43.759
to highlight it

0:16:40.880,0:16:43.759
as a constant

0:16:44.399,0:16:47.600
let's try to see whether we can

0:16:46.160,0:16:51.839
distinguish these

0:16:47.600,0:16:51.839
two cases

0:16:56.240,0:17:00.160
so here we get all the strings

0:17:00.639,0:17:04.559
if we want to see if it's single quotes

0:17:04.079,0:17:07.520
or

0:17:04.559,0:17:07.520
double quote strings

0:17:08.799,0:17:12.480
we can try looking at the first

0:17:11.039,0:17:15.280
character

0:17:12.480,0:17:16.720
of the string I mean the first character

0:17:15.280,0:17:19.360
of the node

0:17:16.720,0:17:33.600
to check whether it's a single quote or

0:17:19.360,0:17:36.080
a double quote

0:17:33.600,0:17:36.799
yeah so for that we use

0:17:36.080,0:17:40.160
tree-sitter's

0:17:36.799,0:17:43.360
support for predicates in this case

0:17:40.160,0:17:46.080
we use a match predicate

0:17:43.360,0:17:46.799
to check whether the string whether the

0:17:46.080,0:17:50.320
node

0:17:46.799,0:17:51.280
starts with a single quote and with this

0:17:50.320,0:17:55.520
pattern

0:17:51.280,0:17:55.520
we only capture the single quotes

0:17:58.840,0:18:03.760
strings
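
The effect of the match predicate can be sketched outside of tree-sitter as a regular expression test on each captured node's text. The list of string nodes below is made up for illustration, and `capture_strings` is a hypothetical helper, not part of the package; it only shows the filtering step, where a string whose text starts with a single quote gets a different capture name.

```python
import re

# Hypothetical texts of the string nodes found in a buffer.
string_nodes = ["'utf-8'", '"hello world"', "'r'", '"""docstring"""']

def capture_strings(texts):
    """Name each string capture, using a regex on the node text as the predicate."""
    captures = []
    for text in texts:
        # The predicate: does the node text start with a single quote?
        name = "keyword" if re.search(r"^'", text) else "string"
        captures.append((name, text))
    return captures

captures = capture_strings(string_nodes)
single_quoted = [text for name, text in captures if name == "keyword"]
print(single_quoted)
```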

0:18:00.400,0:18:07.760
let's try to give it a different face

0:18:03.760,0:18:07.760
so we copy the pattern

0:18:13.039,0:18:16.640
and we add this pattern

0:18:18.640,0:18:21.760
for python only

0:18:25.120,0:18:31.440
but we also want to give the

0:18:28.400,0:18:36.320
capture a different name

0:18:31.440,0:18:36.320
let's say we want to highlight it as a

0:18:40.840,0:18:43.840
keyword

0:18:46.559,0:18:57.840
and now if we refresh the buffer

0:19:06.320,0:19:10.320
we see that single quote strings are

0:19:08.799,0:19:12.880
highlighted as

0:19:10.320,0:19:12.880
keywords

0:19:14.400,0:19:19.200
the highlighting patterns can also be

0:19:16.400,0:19:23.280
set for a single project

0:19:19.200,0:19:23.280
using directory local variables

0:19:23.440,0:19:30.000
for example let's take a look at

0:19:26.880,0:19:30.000
Emacs source code

0:19:35.760,0:19:43.760
so in Emacs's c source there are a lot of

0:19:40.400,0:19:47.679
uses of these DEFUN macros

0:19:43.760,0:19:50.400
to define functions

0:19:47.679,0:19:50.400
and you can see

0:19:51.200,0:19:55.760
this is actually the function name but

0:19:53.520,0:19:59.120
it's highlighted as the

0:19:55.760,0:20:03.679
string so what we want

0:19:59.120,0:20:07.600
is to somehow recognize this pattern

0:20:03.679,0:20:11.280
and highlight it

0:20:07.600,0:20:14.559
as highlight this part

0:20:11.280,0:20:17.679
with the function face instead

0:20:14.559,0:20:20.240
in order to do that

0:20:17.679,0:20:21.760
we put a pattern in this project

0:20:20.240,0:20:24.880
directory local

0:20:21.760,0:20:24.880
settings file

0:20:31.760,0:20:37.760
so we can put this pattern in the c

0:20:34.799,0:20:37.760
mode section

0:20:40.159,0:20:50.480
and now if we enable tree sitter

0:20:48.000,0:20:52.720
you can see that this is the highlighted

0:20:50.480,0:20:52.720
uh

0:20:53.200,0:20:56.559
as a normal function definition so this

0:20:55.520,0:21:00.400
is the function

0:20:56.559,0:21:00.400
face like we wanted

0:21:01.200,0:21:06.080
the pattern for this is actually pretty

0:21:03.760,0:21:06.080
simple

0:21:07.200,0:21:09.919
it's only

0:21:10.720,0:21:17.440
only this part so

0:21:14.720,0:21:19.679
if it's a function call where the name

0:21:17.440,0:21:21.600
of the function is DEFUN

0:21:19.679,0:21:24.159
then we highlight the DEFUN as a

0:21:21.600,0:21:24.159
keyword

0:21:24.240,0:21:28.159
and then the first string element we

0:21:27.360,0:21:31.840
highlight it

0:21:28.159,0:21:31.840
as a function name

0:21:35.360,0:21:39.280
since the language objects are actually

0:21:37.679,0:21:40.799
native code

0:21:39.280,0:21:43.440
they have to be compiled for each

0:21:40.799,0:21:45.600
platform that we want to support

0:21:43.440,0:21:48.159
this will become a big obstacle for

0:21:45.600,0:21:50.240
tree-sitter adoption

0:21:48.159,0:21:52.960
therefore I've created a language bundle

0:21:50.240,0:21:54.960
package tree-sitter-langs

0:21:52.960,0:21:56.320
that takes care of pre-compiling the

0:21:54.960,0:21:59.679
grammars the

0:21:56.320,0:22:01.600
most common grammars for all three major

0:21:59.679,0:22:04.080
platforms

0:22:01.600,0:22:05.360
it also takes care of distributing these

0:22:04.080,0:22:08.080
binaries

0:22:05.360,0:22:11.280
and provides some highlighting queries

0:22:08.080,0:22:11.280
for some of the languages

0:22:11.440,0:22:15.919
it should be noted that this package

0:22:13.760,0:22:19.520
should be treated as a temporary

0:22:15.919,0:22:19.520
distribution mechanism only

0:22:19.919,0:22:24.720
to help with bootstrapping tree-sitter's

0:22:22.240,0:22:27.760
adoption

0:22:24.720,0:22:29.760
the plan is that eventually these files

0:22:27.760,0:22:32.480
should be provided by the language major

0:22:29.760,0:22:35.120
modes themselves

0:22:32.480,0:22:36.320
but in order to do that we need better

0:22:35.120,0:22:40.240
tooling

0:22:36.320,0:22:42.559
so we're not there yet

0:22:40.240,0:22:43.280
since the core already works reasonably

0:22:42.559,0:22:44.640
well

0:22:43.280,0:22:46.320
there are several areas that would

0:22:44.640,0:22:48.960
benefit from the community's

0:22:46.320,0:22:48.960
contribution

0:22:49.120,0:22:52.640
so tree-sitter's upstream language

0:22:51.520,0:22:54.400
repositories

0:22:52.640,0:22:55.679
already contain highlighting queries on

0:22:54.400,0:22:58.480
their own

0:22:55.679,0:23:00.480
however they are pretty basic and they

0:22:58.480,0:23:02.559
may not fit well with existing Emacs

0:23:00.480,0:23:04.320
conventions

0:23:02.559,0:23:07.120
therefore the language bundle has its

0:23:04.320,0:23:10.559
own set of highlighting queries

0:23:07.120,0:23:11.600
this requires maintenance until language

0:23:10.559,0:23:13.760
major modes adopt

0:23:11.600,0:23:16.240
tree-sitter and maintain the queries on

0:23:13.760,0:23:16.240
their own

0:23:16.640,0:23:22.000
the queries are actually quite easy to

0:23:18.480,0:23:24.240
write as you've already seen

0:23:22.000,0:23:25.360
you just need to be familiar with the

0:23:24.240,0:23:30.000
language

0:23:25.360,0:23:32.880
familiar enough to come up with sensible

0:23:30.000,0:23:32.880
highlighting patterns

0:23:35.200,0:23:39.679
and if you are a maintainer of a

0:23:37.600,0:23:42.320
language major mode

0:23:39.679,0:23:43.360
you may want to consider integrating

0:23:42.320,0:23:46.960
tree sitter into

0:23:43.360,0:23:50.080
your mode initially maybe as an

0:23:46.960,0:23:53.279
optional feature the integration is

0:23:50.080,0:23:56.640
actually pretty straightforward

0:23:53.279,0:24:00.880
especially for syntax highlighting

0:23:56.640,0:24:00.880
or alternatively

0:24:01.520,0:24:04.640
you can also try writing a new major

0:24:03.760,0:24:08.000
mode

0:24:04.640,0:24:11.360
from scratch that relies on tree sitter

0:24:08.000,0:24:11.360
from the very beginning

0:24:12.559,0:24:19.679
the code for such a major mode is

0:24:16.320,0:24:23.200
quite simple for example

0:24:19.679,0:24:26.240
this is the proposed

0:24:23.200,0:24:30.720
wat-mode for webassembly

0:24:26.240,0:24:30.720
the code is just

0:24:31.039,0:24:37.120
like one page of code not

0:24:34.559,0:24:37.120
not a lot

0:24:39.520,0:24:46.559
you can also try writing new minor modes

0:24:42.720,0:24:50.080
or writing integration packages

0:24:46.559,0:24:50.880
for example a lot of package a lot of

0:24:50.080,0:24:54.559
packages

0:24:50.880,0:24:58.840
may benefit from tree sitter integration

0:24:54.559,0:25:01.840
but no one has written the integration

0:24:58.840,0:25:01.840
yet

0:25:02.960,0:25:06.720
if you are interested in tree-sitter you

0:25:05.039,0:25:10.320
can use these links to

0:25:06.720,0:25:11.440
learn more about it I think that's it

0:25:10.320,0:25:18.159
for me today

0:25:11.440,0:25:18.159
I'm happy to answer any questions