author Sacha Chua <sacha@sachachua.com> 2020-12-13 00:06:32 -0500
committer Sacha Chua <sacha@sachachua.com> 2020-12-13 00:06:32 -0500
commit b98df6fbe2a5c48013cfca81a95a5af41e202d07 (patch)
tree fc20f6aca84b73f50eaae13837e2ce6999c0b841 /2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.vtt
parent 315add08d9c7f73fb3105940ad5230fb6b050fc2 (diff)
Actually post subtitles, I think
Diffstat (limited to '2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.vtt')
-rw-r--r-- 2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.vtt 1522
1 files changed, 1522 insertions, 0 deletions
diff --git a/2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.vtt b/2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.vtt
new file mode 100644
index 00000000..99133c78
--- /dev/null
+++ b/2020/subtitles/emacsconf-2020--23-incremental-parsing-with-emacs-tree-sitter--tuan-anh-nguyen-autogen.vtt
@@ -0,0 +1,1522 @@
+WEBVTT
+
+00:00:01.520 --> 00:00:04.400
+hello everyone my name is Tuan-Anh
+
+00:00:04.400 --> 00:00:07.200
+I've been using Emacs for about 10 years
+
+00:00:07.200 --> 00:00:09.280
+today I'm going to talk about tree-sitter
+
+00:00:09.280 --> 00:00:11.519
+a new Emacs package that allows Emacs to
+
+00:00:11.519 --> 00:00:13.759
+parse multiple programming languages
+
+00:00:13.759 --> 00:00:17.840
+in real time
+
+00:00:17.840 --> 00:00:21.840
+so what is the problem statement
+
+00:00:21.840 --> 00:00:23.359
+in order to support programming
+
+00:00:23.359 --> 00:00:24.960
+functionalities for a particular
+
+00:00:24.960 --> 00:00:25.760
+language
+
+00:00:25.760 --> 00:00:27.680
+a text editor needs to have some degree
+
+00:00:27.680 --> 00:00:29.679
+of language understanding
+
+00:00:29.679 --> 00:00:31.840
+traditionally text editors have relied
+
+00:00:31.840 --> 00:00:33.840
+very heavily on regular expressions for
+
+00:00:33.840 --> 00:00:34.960
+this
+
+00:00:34.960 --> 00:00:38.320
+Emacs is no different most language
+
+00:00:38.320 --> 00:00:39.280
+major modes use
+
+00:00:39.280 --> 00:00:40.879
+regular expressions for syntax
+
+00:00:40.879 --> 00:00:42.960
+highlighting code navigation
+
+00:00:42.960 --> 00:00:46.239
+folding indexing and so on regular
+
+00:00:46.239 --> 00:00:47.440
+expressions are
+
+00:00:47.440 --> 00:00:50.559
+problematic for a couple of reasons
+
+00:00:50.559 --> 00:00:53.600
+they're slow and inaccurate they also
+
+00:00:53.600 --> 00:00:54.000
+make
+
+00:00:54.000 --> 00:00:56.800
+the code hard to read and write
+
+00:00:56.800 --> 00:00:57.440
+sometimes
+
+00:00:57.440 --> 00:00:59.199
+it's because the regular expressions
+
+00:00:59.199 --> 00:01:01.199
+themselves are very hairy
+
+00:01:01.199 --> 00:01:04.000
+and sometimes because they are just not
+
+00:01:04.000 --> 00:01:05.199
+powerful enough
+
+00:01:05.199 --> 00:01:07.840
+some helper code is usually needed to
+
+00:01:07.840 --> 00:01:11.200
+parse more intricate language features
+
+00:01:11.200 --> 00:01:13.280
+that also illustrates the core problem
+
+00:01:13.280 --> 00:01:16.159
+with regular expressions
+
+00:01:16.159 --> 00:01:18.400
+in that they are not powerful enough to
+
+00:01:18.400 --> 00:01:21.119
+parse programming languages
+
+00:01:21.119 --> 00:01:22.640
+an example feature that regular
+
+00:01:22.640 --> 00:01:25.040
+expressions cannot handle very well
+
+00:01:25.040 --> 00:01:27.520
+is string interpolation which is a very
+
+00:01:27.520 --> 00:01:28.320
+common feature
+
+00:01:28.320 --> 00:01:31.680
+in many modern programming languages
+
+00:01:31.680 --> 00:01:34.079
+it would be much nicer if Emacs somehow
+
+00:01:34.079 --> 00:01:35.840
+had structural understanding of source
+
+00:01:35.840 --> 00:01:36.479
+code
+
+00:01:36.479 --> 00:01:39.520
+like IDEs do
+
+00:01:39.520 --> 00:01:41.119
+there have been multiple efforts to
+
+00:01:41.119 --> 00:01:42.960
+bring this kind of programming language
+
+00:01:42.960 --> 00:01:45.280
+understanding into Emacs
+
+00:01:45.280 --> 00:01:47.119
+there are language specific parsers
+
+00:01:47.119 --> 00:01:48.640
+written in elisp
+
+00:01:48.640 --> 00:01:50.240
+they can be thought of as the next
+
+00:01:50.240 --> 00:01:52.320
+logical step of the glue code on top
+
+00:01:52.320 --> 00:01:54.960
+of regular expressions moving from
+
+00:01:54.960 --> 00:01:56.000
+partial local
+
+00:01:56.000 --> 00:01:58.079
+pattern recognition into a full-fledged
+
+00:01:58.079 --> 00:01:59.840
+parser
+
+00:01:59.840 --> 00:02:01.439
+the most prominent example of this
+
+00:02:01.439 --> 00:02:03.040
+approach is probably the famous
+
+00:02:03.040 --> 00:02:06.479
+js2 mode
+
+00:02:06.479 --> 00:02:10.080
+however this approach has several issues
+
+00:02:10.080 --> 00:02:12.959
+parsing is computationally expensive and
+
+00:02:12.959 --> 00:02:13.680
+Emacs Lisp
+
+00:02:13.680 --> 00:02:16.800
+is not good at that kind of stuff
+
+00:02:16.800 --> 00:02:18.400
+furthermore maintenance is very
+
+00:02:18.400 --> 00:02:20.840
+troublesome in order to work on these
+
+00:02:20.840 --> 00:02:22.160
+parsers
+
+00:02:22.160 --> 00:02:23.599
+first you have to know elisp well
+
+00:02:23.599 --> 00:02:25.599
+enough and then you have to be
+
+00:02:25.599 --> 00:02:27.760
+comfortable with writing a
+
+00:02:27.760 --> 00:02:30.319
+recursive descent parser while
+
+00:02:30.319 --> 00:02:32.080
+constantly keeping up with changes to
+
+00:02:32.080 --> 00:02:34.000
+the language itself
+
+00:02:34.000 --> 00:02:36.879
+which can be evolving very quickly like
+
+00:02:36.879 --> 00:02:39.360
+javascript for example
+
+00:02:39.360 --> 00:02:41.599
+together these constraints significantly
+
+00:02:41.599 --> 00:02:45.680
+reduce the pool of potential maintainers
+
+00:02:45.680 --> 00:02:47.760
+the biggest issue though in my opinion
+
+00:02:47.760 --> 00:02:49.680
+is the lack of a set of generic
+
+00:02:49.680 --> 00:02:52.879
+and reusable APIs this makes them very
+
+00:02:52.879 --> 00:02:54.319
+hard to use
+
+00:02:54.319 --> 00:02:55.920
+for minor modes that want to deal with
+
+00:02:55.920 --> 00:02:57.920
+cross-cutting concerns across multiple
+
+00:02:57.920 --> 00:02:59.920
+languages
+
+00:02:59.920 --> 00:03:01.760
+the other approach which has been
+
+00:03:01.760 --> 00:03:03.599
+gaining a lot of momentum in recent
+
+00:03:03.599 --> 00:03:04.319
+years
+
+00:03:04.319 --> 00:03:06.560
+is externalizing language understanding
+
+00:03:06.560 --> 00:03:08.159
+to another process
+
+00:03:08.159 --> 00:03:12.239
+also known as language server protocol
+
+00:03:12.239 --> 00:03:14.480
+this second approach is actually a very
+
+00:03:14.480 --> 00:03:16.560
+interesting one
+
+00:03:16.560 --> 00:03:18.400
+by decoupling language understanding
+
+00:03:18.400 --> 00:03:21.280
+from the editing facility itself
+
+00:03:21.280 --> 00:03:23.760
+the LSP servers can attract a lot more
+
+00:03:23.760 --> 00:03:25.120
+contributors
+
+00:03:25.120 --> 00:03:28.959
+which makes maintenance easier however
+
+00:03:28.959 --> 00:03:32.400
+they also have several issues of their own
+
+00:03:32.400 --> 00:03:34.720
+being a separate process they are
+
+00:03:34.720 --> 00:03:36.000
+usually more resource
+
+00:03:36.000 --> 00:03:39.920
+intensive and depending on the language
+
+00:03:39.920 --> 00:03:42.159
+the LSP server itself can bring with it
+
+00:03:42.159 --> 00:03:44.640
+a host of additional dependencies
+
+00:03:44.640 --> 00:03:47.680
+external to Emacs which may be messy to
+
+00:03:47.680 --> 00:03:50.640
+install and manage
+
+00:03:50.640 --> 00:03:53.760
+furthermore JSON over RPC has pretty
+
+00:03:53.760 --> 00:03:55.120
+high latency
+
+00:03:55.120 --> 00:03:57.840
+for one-off tasks like jumping to source
+
+00:03:57.840 --> 00:04:00.879
+or on-demand completion it's great
+
+00:04:00.879 --> 00:04:03.040
+but for things like code highlighting
+
+00:04:03.040 --> 00:04:06.000
+the latency is just too much
+
+00:04:06.000 --> 00:04:08.319
+I was using rust and I was following the
+
+00:04:08.319 --> 00:04:10.480
+community effort to improve its IDE
+
+00:04:10.480 --> 00:04:11.760
+support
+
+00:04:11.760 --> 00:04:13.680
+hoping to integrate some of that into
+
+00:04:13.680 --> 00:04:15.760
+Emacs itself
+
+00:04:15.760 --> 00:04:17.600
+then I heard someone from the community
+
+00:04:17.600 --> 00:04:19.759
+mention tree sitter
+
+00:04:19.759 --> 00:04:23.360
+and I decided to check it out
+
+00:04:23.360 --> 00:04:25.520
+basically tree-sitter is an incremental
+
+00:04:25.520 --> 00:04:28.720
+parsing library and a parser generator
+
+00:04:28.720 --> 00:04:31.000
+it was introduced by the Atom editor in
+
+00:04:31.000 --> 00:04:33.040
+2018
+
+00:04:33.040 --> 00:04:35.680
+besides Atom it is also being integrated
+
+00:04:35.680 --> 00:04:36.960
+into the Neovim
+
+00:04:36.960 --> 00:04:41.040
+editor and GitHub is using it to power
+
+00:04:41.040 --> 00:04:42.479
+their source code analysis and
+
+00:04:42.479 --> 00:04:45.840
+navigation features
+
+00:04:45.840 --> 00:04:48.639
+it is written in C and can be compiled
+
+00:04:48.639 --> 00:04:49.199
+for all
+
+00:04:49.199 --> 00:04:53.120
+major platforms it can even be compiled
+
+00:04:53.120 --> 00:04:56.080
+to WebAssembly to run on the web that's
+
+00:04:56.080 --> 00:04:57.600
+how GitHub is using it
+
+00:04:57.600 --> 00:05:00.800
+on their website
+
+00:05:00.800 --> 00:05:02.960
+so why is tree-sitter an interesting
+
+00:05:02.960 --> 00:05:05.840
+solution to this problem
+
+00:05:05.840 --> 00:05:07.360
+there are multiple features that make it
+
+00:05:07.360 --> 00:05:10.000
+an attractive option
+
+00:05:10.000 --> 00:05:12.400
+it is designed to be fast by being
+
+00:05:12.400 --> 00:05:13.680
+incremental
+
+00:05:13.680 --> 00:05:15.680
+the initial parse of a typical big file
+
+00:05:15.680 --> 00:05:18.160
+can take tens of milliseconds
+
+00:05:18.160 --> 00:05:20.240
+while subsequent incremental parses
+
+00:05:20.240 --> 00:05:22.560
+are sub-millisecond
+
+00:05:22.560 --> 00:05:24.720
+it achieves this by using structural
+
+00:05:24.720 --> 00:05:26.240
+sharing
+
+00:05:26.240 --> 00:05:29.360
+meaning replacing only affected nodes
+
+00:05:29.360 --> 00:05:32.960
+in the old tree when it needs to
+
+00:05:32.960 --> 00:05:36.000
+also unlike LSP being in the same
+
+00:05:36.000 --> 00:05:37.120
+process
+
+00:05:37.120 --> 00:05:40.639
+it has much lower latency
+
+00:05:40.639 --> 00:05:42.880
+secondly it provides a uniform
+
+00:05:42.880 --> 00:05:44.960
+programming interface
+
+00:05:44.960 --> 00:05:47.039
+the same data structures and functions
+
+00:05:47.039 --> 00:05:48.720
+work on parse trees of different
+
+00:05:48.720 --> 00:05:50.400
+languages
+
+00:05:50.400 --> 00:05:52.160
+syntax nodes of different languages
+
+00:05:52.160 --> 00:05:54.160
+differ only by their types
+
+00:05:54.160 --> 00:05:57.360
+and their possible child nodes this
+
+00:05:57.360 --> 00:05:58.960
+is a big advantage over language
+
+00:05:58.960 --> 00:06:02.240
+specific parsers
+
+00:06:02.240 --> 00:06:04.880
+thirdly it's written in self-contained
+
+00:06:04.880 --> 00:06:06.880
+embeddable C
+
+00:06:06.880 --> 00:06:09.680
+as I mentioned previously it can even be
+
+00:06:09.680 --> 00:06:10.400
+compiled
+
+00:06:10.400 --> 00:06:13.759
+to WebAssembly this makes integrating it
+
+00:06:13.759 --> 00:06:15.199
+into various editors
+
+00:06:15.199 --> 00:06:18.240
+quite easy without having to install
+
+00:06:18.240 --> 00:06:22.880
+any external dependencies
+
+00:06:22.880 --> 00:06:24.639
+one thing that is not mentioned here is
+
+00:06:24.639 --> 00:06:28.000
+that being a parser generator
+
+00:06:28.000 --> 00:06:31.039
+its grammars are declarative
+
+00:06:31.039 --> 00:06:34.880
+together with being editor independent
+
+00:06:34.880 --> 00:06:36.720
+this makes the pool of potential
+
+00:06:36.720 --> 00:06:38.160
+contributors
+
+00:06:38.160 --> 00:06:42.400
+much larger so I was convinced
+
+00:06:42.400 --> 00:06:45.520
+that tree-sitter is a good fit for Emacs
+
+00:06:45.520 --> 00:06:48.000
+last year I started writing the bindings
+
+00:06:48.000 --> 00:06:48.720
+using
+
+00:06:48.720 --> 00:06:50.960
+dynamic module support introduced in Emacs
+
+00:06:50.960 --> 00:06:53.280
+25.
+
+00:06:53.280 --> 00:06:55.360
+dynamic module means there is platform
+
+00:06:55.360 --> 00:06:58.479
+specific native code involved
+
+00:06:58.479 --> 00:07:00.560
+but since there are pre-compiled binaries
+
+00:07:00.560 --> 00:07:02.880
+for the three major platforms
+
+00:07:02.880 --> 00:07:06.319
+it should work in most places currently
+
+00:07:06.319 --> 00:07:08.319
+the core functionalities are in a pretty
+
+00:07:08.319 --> 00:07:09.440
+good shape
+
+00:07:09.440 --> 00:07:12.560
+syntax highlighting is working nicely
+
+00:07:12.560 --> 00:07:14.840
+the whole thing is split into three
+
+00:07:14.840 --> 00:07:16.080
+packages
+
+00:07:16.080 --> 00:07:17.759
+tree sitter is the main package that
+
+00:07:17.759 --> 00:07:20.319
+other packages should depend on
+
+00:07:20.319 --> 00:07:22.800
+tree-sitter-langs is the language bundle
+
+00:07:22.800 --> 00:07:24.000
+that includes support
+
+00:07:24.000 --> 00:07:27.199
+for most common languages
+
+00:07:27.199 --> 00:07:30.080
+and finally the core apis are in the
+
+00:07:30.080 --> 00:07:32.160
+package tsc
+
+00:07:32.160 --> 00:07:36.160
+which stands for tree-sitter core
+
+00:07:36.160 --> 00:07:38.800
+it is the implicit dependency of the
+
+00:07:38.800 --> 00:07:43.520
+tree-sitter package
+
+00:07:43.520 --> 00:07:46.000
+the main package includes the minor mode
+
+00:07:46.000 --> 00:07:47.520
+tree-sitter-mode
+
+00:07:47.520 --> 00:07:49.840
+this provides the base for other major
+
+00:07:49.840 --> 00:07:52.560
+or minor modes to build on
+
+00:07:52.560 --> 00:07:55.280
+using Emacs's change tracking hooks it
+
+00:07:55.280 --> 00:07:55.840
+enables
+
+00:07:55.840 --> 00:07:58.080
+incremental parsing and provides a
+
+00:07:58.080 --> 00:08:00.800
+syntax tree that is always up to date
+
+00:08:00.800 --> 00:08:04.080
+after any edits in a buffer
+
+00:08:04.080 --> 00:08:06.560
+there is also a basic debug mode that
+
+00:08:06.560 --> 00:08:10.080
+shows the parse tree in another buffer
+
+00:08:10.080 --> 00:08:13.360
+here is a quick demo
+
+00:08:13.360 --> 00:08:15.759
+here I'm in an empty python buffer with
+
+00:08:15.759 --> 00:08:17.520
+tree-sitter enabled
+
+00:08:17.520 --> 00:08:19.440
+I'm going to turn on the debug mode to
+
+00:08:19.440 --> 00:08:26.560
+see the parse tree
+
+00:08:26.560 --> 00:08:28.720
+since the buffer is empty there is only
+
+00:08:28.720 --> 00:08:30.639
+one node in the syntax tree the top
+
+00:08:30.639 --> 00:08:33.279
+level module node
+
+00:08:33.279 --> 00:09:11.040
+let's try typing some code
+
+00:09:11.040 --> 00:09:13.600
+as you can see as I type into the python
+
+00:09:13.600 --> 00:09:14.640
+buffer
+
+00:09:14.640 --> 00:09:19.120
+the syntax tree updates in real time
+
+00:09:19.120 --> 00:09:21.120
+the other minor mode included in the
+
+00:09:21.120 --> 00:09:23.279
+main package is tree-sitter-hl-mode
+
+00:09:23.279 --> 00:09:26.640
+it overrides font-lock mode and
+
+00:09:26.640 --> 00:09:28.480
+provides its own set of faces
+
+00:09:28.480 --> 00:09:31.839
+and customization options it is query
+
+00:09:31.839 --> 00:09:32.800
+driven
+
+00:09:32.800 --> 00:09:35.200
+that means instead of regular
+
+00:09:35.200 --> 00:09:36.240
+expressions
+
+00:09:36.240 --> 00:09:38.720
+it uses a Lisp-like query language to
+
+00:09:38.720 --> 00:09:40.320
+map syntax nodes
+
+00:09:40.320 --> 00:09:43.760
+to highlighting faces I'm going to
+
+00:09:43.760 --> 00:09:45.760
+open a python file with small snippets
+
+00:09:45.760 --> 00:09:54.320
+that showcase syntax highlighting
+
+00:09:54.320 --> 00:09:55.920
+so this is the default highlighting
+
+00:09:55.920 --> 00:10:00.880
+provided by python mode
+
+00:10:00.880 --> 00:10:02.839
+this is the highlighting enabled by tree
+
+00:10:02.839 --> 00:10:04.640
+sitter
+
+00:10:04.640 --> 00:10:07.680
+as you can see string interpolation
+
+00:10:07.680 --> 00:10:11.680
+and decorators are highlighted correctly
+
+00:10:11.680 --> 00:10:17.440
+function calls are also highlighted
+
+00:10:17.440 --> 00:10:20.240
+you can also note that property
+
+00:10:20.240 --> 00:10:21.839
+accessors
+
+00:10:21.839 --> 00:10:24.640
+and property assignments are highlighted
+
+00:10:24.640 --> 00:10:27.440
+differently
+
+00:10:27.440 --> 00:10:29.360
+what I like the most about this is that
+
+00:10:29.360 --> 00:10:30.880
+new bindings are consistently
+
+00:10:30.880 --> 00:10:32.640
+highlighted
+
+00:10:32.640 --> 00:10:36.320
+this includes local variables
+
+00:10:36.320 --> 00:10:39.760
+function parameters and property
+
+00:10:39.760 --> 00:10:45.760
+mutations
+
+00:10:45.760 --> 00:10:48.000
+before going through the tree queries
+
+00:10:48.000 --> 00:10:49.279
+and the syntax highlighting
+
+00:10:49.279 --> 00:10:51.680
+customization options
+
+00:10:51.680 --> 00:10:53.760
+let's take a brief look at the core data
+
+00:10:53.760 --> 00:10:55.040
+structures and functions
+
+00:10:55.040 --> 00:10:58.079
+that tree sitter provides
+
+00:10:58.079 --> 00:10:59.839
+so parsing is done with the help of a
+
+00:10:59.839 --> 00:11:02.240
+generic parser object
+
+00:11:02.240 --> 00:11:04.160
+a single parser object can be used to
+
+00:11:04.160 --> 00:11:06.000
+parse different languages
+
+00:11:06.000 --> 00:11:08.320
+by sending different language objects to
+
+00:11:08.320 --> 00:11:09.279
+it
+
+00:11:09.279 --> 00:11:10.880
+the language objects themselves are
+
+00:11:10.880 --> 00:11:14.079
+loaded from shared libraries
+
+00:11:14.079 --> 00:11:16.079
+since tree-sitter-mode already handles
+
+00:11:16.079 --> 00:11:17.360
+the parsing part
+
+00:11:17.360 --> 00:11:19.440
+we will instead focus on the functions
+
+00:11:19.440 --> 00:11:20.800
+that inspect nodes
+
+00:11:20.800 --> 00:11:25.279
+in the resulting parse tree
+
+00:11:25.279 --> 00:11:27.200
+we can ask tree sitter what is the
+
+00:11:27.200 --> 00:11:44.240
+syntax node at point
+
+00:11:44.240 --> 00:11:47.200
+uh it is an opaque object so this is not
+
+00:11:47.200 --> 00:11:48.480
+very useful
+
+00:11:48.480 --> 00:12:03.760
+we can instead ask what is its type
+
+00:12:03.760 --> 00:12:06.560
+so its type is the symbol comparison
+
+00:12:06.560 --> 00:12:08.959
+operator
+
+00:12:08.959 --> 00:12:11.600
+in tree-sitter there are two kinds of nodes
+
+00:12:11.600 --> 00:12:13.680
+anonymous nodes and named nodes
+
+00:12:13.680 --> 00:12:15.519
+anonymous nodes correspond to simple
+
+00:12:15.519 --> 00:12:17.040
+grammar elements
+
+00:12:17.040 --> 00:12:19.839
+like keywords operators punctuations and
+
+00:12:19.839 --> 00:12:21.279
+so on
+
+00:12:21.279 --> 00:12:24.160
+named nodes on the other hand are grammar
+
+00:12:24.160 --> 00:12:25.920
+elements that are interesting enough on
+
+00:12:25.920 --> 00:12:26.639
+their own
+
+00:12:26.639 --> 00:12:30.320
+to have a name like an identifier an
+
+00:12:30.320 --> 00:12:31.839
+expression
+
+00:12:31.839 --> 00:12:35.440
+or a function definition
+
+00:12:35.440 --> 00:12:37.760
+named node types are symbols while
+
+00:12:37.760 --> 00:12:42.639
+anonymous node types are strings
+
+00:12:42.639 --> 00:12:46.320
+for example if we are on this
+
+00:12:46.320 --> 00:12:49.760
+comparison operator
+
+00:12:49.760 --> 00:12:55.920
+the node type should be a string
+
+00:12:55.920 --> 00:12:57.920
+we can also get other information about
+
+00:12:57.920 --> 00:12:58.959
+the node
+
+00:12:58.959 --> 00:13:09.680
+for example what is its text
+
+00:13:09.680 --> 00:13:20.800
+or where it is in the buffer
+
+00:13:20.800 --> 00:13:43.199
+or what is its parent
+
+00:13:43.199 --> 00:13:46.160
+there are many other APIs to query
+
+00:13:46.160 --> 00:13:46.839
+node
+
+00:13:46.839 --> 00:13:52.639
+properties
+
+00:13:52.639 --> 00:13:54.399
+tree sitter allows searching for
+
+00:13:54.399 --> 00:13:58.240
+structural patterns within a parse tree
+
+00:13:58.240 --> 00:14:01.440
+it does so through a Lisp-like language
+
+00:14:01.440 --> 00:14:03.519
+this language supports pattern matching
+
+00:14:03.519 --> 00:14:04.639
+by node types
+
+00:14:04.639 --> 00:14:07.760
+field names and predicates
+
+00:14:07.760 --> 00:14:10.079
+it also allows capturing nodes for
+
+00:14:10.079 --> 00:14:12.639
+further processing
+
+00:14:12.639 --> 00:14:37.680
+let's try to see some examples
+
+00:14:37.680 --> 00:14:41.040
+so in this very simple query we just
+
+00:14:41.040 --> 00:14:43.839
+try to highlight all the identifiers in
+
+00:14:43.839 --> 00:14:49.040
+the buffer
+
+00:14:49.040 --> 00:14:51.920
+this at sign tells tree-sitter to capture a
+
+00:14:51.920 --> 00:14:53.120
+node
+
+00:14:53.120 --> 00:14:55.839
+in the context of the query builder it's
+
+00:14:55.839 --> 00:14:57.360
+not very important
+
+00:14:57.360 --> 00:15:00.320
+but in a normal highlighting query this
+
+00:15:00.320 --> 00:15:01.760
+will determine
+
+00:15:01.760 --> 00:15:06.639
+the face used to highlight the node
+
+00:15:06.639 --> 00:15:08.800
+suppose we want to capture all the
+
+00:15:08.800 --> 00:15:10.320
+function names
+
+00:15:10.320 --> 00:15:13.519
+instead of just any identifier
+
+00:15:13.519 --> 00:15:29.440
+you can improve the query like this
+
+00:15:29.440 --> 00:15:31.600
+uh this will highlight the whole
+
+00:15:31.600 --> 00:15:32.639
+definition
+
+00:15:32.639 --> 00:15:35.519
+but we only want to capture the function
+
+00:15:35.519 --> 00:15:36.399
+name
+
+00:15:36.399 --> 00:15:39.600
+which means the identifier
+
+00:15:39.600 --> 00:15:42.800
+here so we
+
+00:15:42.800 --> 00:15:46.320
+move the capture to after the identifier
+
+00:15:46.320 --> 00:15:49.600
+node
+
+00:15:49.600 --> 00:15:51.759
+if we want to capture the class names as
+
+00:15:51.759 --> 00:15:52.959
+well
+
+00:15:52.959 --> 00:16:10.079
+we just add another pattern
+
+00:16:10.079 --> 00:16:20.320
+let's look at a more practical example
+
+00:16:20.320 --> 00:16:22.959
+here we can see that single quote
+
+00:16:22.959 --> 00:16:23.759
+strings and
+
+00:16:23.759 --> 00:16:25.600
+double quote strings are highlighted
+
+00:16:25.600 --> 00:16:27.279
+the same
+
+00:16:27.279 --> 00:16:30.399
+but in some places
+
+00:16:30.399 --> 00:16:33.440
+because of some coding conventions
+
+00:16:33.440 --> 00:16:35.440
+it may be desirable to highlight them
+
+00:16:35.440 --> 00:16:37.279
+differently for example if
+
+00:16:37.279 --> 00:16:39.680
+the string is single quoted we may want
+
+00:16:39.680 --> 00:16:40.880
+to highlight it
+
+00:16:40.880 --> 00:16:44.399
+as a constant
+
+00:16:44.399 --> 00:16:46.160
+let's try to see whether we can
+
+00:16:46.160 --> 00:16:47.600
+distinguish these
+
+00:16:47.600 --> 00:16:56.240
+two cases
+
+00:16:56.240 --> 00:17:00.639
+so here we get all the strings
+
+00:17:00.639 --> 00:17:04.079
+if we want to see if it's single quote
+
+00:17:04.079 --> 00:17:04.559
+or
+
+00:17:04.559 --> 00:17:08.799
+double quote strings
+
+00:17:08.799 --> 00:17:11.039
+we can try looking at the first
+
+00:17:11.039 --> 00:17:12.480
+character
+
+00:17:12.480 --> 00:17:15.280
+of the string I mean the first character
+
+00:17:15.280 --> 00:17:16.720
+of the node
+
+00:17:16.720 --> 00:17:19.360
+to check whether it's a single quote or
+
+00:17:19.360 --> 00:17:33.600
+a double quote
+
+00:17:33.600 --> 00:17:36.080
+yeah so for that we use
+
+00:17:36.080 --> 00:17:36.799
+tree-sitter's
+
+00:17:36.799 --> 00:17:40.160
+support for predicates in this case
+
+00:17:40.160 --> 00:17:43.360
+we use a match predicate
+
+00:17:43.360 --> 00:17:46.080
+to check whether the string whether the
+
+00:17:46.080 --> 00:17:46.799
+node
+
+00:17:46.799 --> 00:17:50.320
+starts with a single quote and with this
+
+00:17:50.320 --> 00:17:51.280
+pattern
+
+00:17:51.280 --> 00:17:58.840
+we only capture the single quote
+
+00:17:58.840 --> 00:18:00.400
+strings
+
+00:18:00.400 --> 00:18:03.760
+let's try to give it a different face
+
+00:18:03.760 --> 00:18:13.039
+so we copy the pattern
+
+00:18:13.039 --> 00:18:18.640
+and we add this pattern
+
+00:18:18.640 --> 00:18:25.120
+for python only
+
+00:18:25.120 --> 00:18:28.400
+but we also want to give the
+
+00:18:28.400 --> 00:18:31.440
+capture a different name
+
+00:18:31.440 --> 00:18:40.840
+let's say we want to highlight it as a
+
+00:18:40.840 --> 00:18:46.559
+keyword
+
+00:18:46.559 --> 00:19:06.320
+and now if we refresh the buffer
+
+00:19:06.320 --> 00:19:08.799
+we see that single quote strings are
+
+00:19:08.799 --> 00:19:10.320
+highlighted as
+
+00:19:10.320 --> 00:19:14.400
+keywords
+
+00:19:14.400 --> 00:19:16.400
+the highlighting patterns can also be
+
+00:19:16.400 --> 00:19:19.200
+set for a single project
+
+00:19:19.200 --> 00:19:23.440
+using directory-local variables
+
+00:19:23.440 --> 00:19:26.880
+for example let's take a look at
+
+00:19:26.880 --> 00:19:35.760
+Emacs source code
+
+00:19:35.760 --> 00:19:40.400
+so in the Emacs C source there are a lot of
+
+00:19:40.400 --> 00:19:43.760
+uses of this DEFUN macro
+
+00:19:43.760 --> 00:19:47.679
+to define functions
+
+00:19:47.679 --> 00:19:51.200
+and you can see
+
+00:19:51.200 --> 00:19:53.520
+this is actually the function name but
+
+00:19:53.520 --> 00:19:55.760
+it's highlighted as the
+
+00:19:55.760 --> 00:19:59.120
+string so what we want
+
+00:19:59.120 --> 00:20:03.679
+is to somehow recognize this pattern
+
+00:20:03.679 --> 00:20:07.600
+and highlight it
+
+00:20:07.600 --> 00:20:11.280
+as highlight this part
+
+00:20:11.280 --> 00:20:14.559
+with the function face instead
+
+00:20:14.559 --> 00:20:17.679
+in order to do that
+
+00:20:17.679 --> 00:20:20.240
+we put a pattern in this project
+
+00:20:20.240 --> 00:20:21.760
+directory local
+
+00:20:21.760 --> 00:20:31.760
+settings file
+
+00:20:31.760 --> 00:20:34.799
+so we can put this pattern in the c
+
+00:20:34.799 --> 00:20:40.159
+mode section
+
+00:20:40.159 --> 00:20:48.000
+and now if we enable tree sitter
+
+00:20:48.000 --> 00:20:50.480
+you can see that this is highlighted
+
+00:20:50.480 --> 00:20:53.200
+uh
+
+00:20:53.200 --> 00:20:55.520
+as a normal function definition so this
+
+00:20:55.520 --> 00:20:56.559
+is the function
+
+00:20:56.559 --> 00:21:01.200
+face like we wanted
+
+00:21:01.200 --> 00:21:03.760
+the pattern for this is actually pretty
+
+00:21:03.760 --> 00:21:07.200
+simple
+
+00:21:07.200 --> 00:21:10.720
+it's only
+
+00:21:10.720 --> 00:21:14.720
+only this part so
+
+00:21:14.720 --> 00:21:17.440
+if it's a function call where the name
+
+00:21:17.440 --> 00:21:19.679
+of the function is DEFUN
+
+00:21:19.679 --> 00:21:21.600
+then we highlight the DEFUN as a
+
+00:21:21.600 --> 00:21:24.240
+keyword
+
+00:21:24.240 --> 00:21:27.360
+and then the first string element we
+
+00:21:27.360 --> 00:21:28.159
+highlight it
+
+00:21:28.159 --> 00:21:35.360
+as a function name
+
+00:21:35.360 --> 00:21:37.679
+since the language objects are actually
+
+00:21:37.679 --> 00:21:39.280
+native code
+
+00:21:39.280 --> 00:21:40.799
+they have to be compiled for each
+
+00:21:40.799 --> 00:21:43.440
+platform that we want to support
+
+00:21:43.440 --> 00:21:45.600
+this will become a big obstacle for
+
+00:21:45.600 --> 00:21:48.159
+tree-sitter adoption
+
+00:21:48.159 --> 00:21:50.240
+therefore I've created a language bundle
+
+00:21:50.240 --> 00:21:52.960
+package tree-sitter-langs
+
+00:21:52.960 --> 00:21:54.960
+that takes care of pre-compiling the
+
+00:21:54.960 --> 00:21:56.320
+grammars the
+
+00:21:56.320 --> 00:21:59.679
+most common grammars for all three major
+
+00:21:59.679 --> 00:22:01.600
+platforms
+
+00:22:01.600 --> 00:22:04.080
+it also takes care of distributing these
+
+00:22:04.080 --> 00:22:05.360
+binaries
+
+00:22:05.360 --> 00:22:08.080
+and provides some highlighting queries
+
+00:22:08.080 --> 00:22:11.440
+for some of the languages
+
+00:22:11.440 --> 00:22:13.760
+it should be noted that this package
+
+00:22:13.760 --> 00:22:15.919
+should be treated as a temporary
+
+00:22:15.919 --> 00:22:19.919
+distribution mechanism only
+
+00:22:19.919 --> 00:22:22.240
+to help with bootstrapping tree-sitter's
+
+00:22:22.240 --> 00:22:24.720
+adoption
+
+00:22:24.720 --> 00:22:27.760
+the plan is that eventually these files
+
+00:22:27.760 --> 00:22:29.760
+should be provided by the language major
+
+00:22:29.760 --> 00:22:32.480
+modes themselves
+
+00:22:32.480 --> 00:22:35.120
+but in order to do that we need better
+
+00:22:35.120 --> 00:22:36.320
+tooling
+
+00:22:36.320 --> 00:22:40.240
+so we're not there yet
+
+00:22:40.240 --> 00:22:42.559
+since the core already works reasonably
+
+00:22:42.559 --> 00:22:43.280
+well
+
+00:22:43.280 --> 00:22:44.640
+there are several areas that would
+
+00:22:44.640 --> 00:22:46.320
+benefit from the community's
+
+00:22:46.320 --> 00:22:49.120
+contribution
+
+00:22:49.120 --> 00:22:51.520
+so tree-sitter's upstream language
+
+00:22:51.520 --> 00:22:52.640
+repositories
+
+00:22:52.640 --> 00:22:54.400
+already contain highlighting queries on
+
+00:22:54.400 --> 00:22:55.679
+their own
+
+00:22:55.679 --> 00:22:58.480
+however they are pretty basic and they
+
+00:22:58.480 --> 00:23:00.480
+may not fit well with existing Emacs
+
+00:23:00.480 --> 00:23:02.559
+conventions
+
+00:23:02.559 --> 00:23:04.320
+therefore the language bundle has its
+
+00:23:04.320 --> 00:23:07.120
+own set of highlighting queries
+
+00:23:07.120 --> 00:23:10.559
+this requires maintenance until language
+
+00:23:10.559 --> 00:23:11.600
+major modes adopt
+
+00:23:11.600 --> 00:23:13.760
+tree-sitter and maintain the queries on
+
+00:23:13.760 --> 00:23:16.640
+their own
+
+00:23:16.640 --> 00:23:18.480
+the queries are actually quite easy to
+
+00:23:18.480 --> 00:23:22.000
+write as you've already seen
+
+00:23:22.000 --> 00:23:24.240
+you just need to be familiar with the
+
+00:23:24.240 --> 00:23:25.360
+language
+
+00:23:25.360 --> 00:23:30.000
+familiar enough to come up with sensible
+
+00:23:30.000 --> 00:23:35.200
+highlighting patterns
+
+00:23:35.200 --> 00:23:37.600
+and if you are a maintainer of a
+
+00:23:37.600 --> 00:23:39.679
+language major mode
+
+00:23:39.679 --> 00:23:42.320
+you may want to consider integrating
+
+00:23:42.320 --> 00:23:43.360
+tree sitter into
+
+00:23:43.360 --> 00:23:46.960
+your mode initially maybe as an
+
+00:23:46.960 --> 00:23:50.080
+optional feature the integration is
+
+00:23:50.080 --> 00:23:53.279
+actually pretty straightforward
+
+00:23:53.279 --> 00:23:56.640
+especially for syntax highlighting
+
+00:23:56.640 --> 00:24:01.520
+or alternatively
+
+00:24:01.520 --> 00:24:03.760
+you can also try writing a new major
+
+00:24:03.760 --> 00:24:04.640
+mode
+
+00:24:04.640 --> 00:24:08.000
+from scratch that relies on tree sitter
+
+00:24:08.000 --> 00:24:12.559
+from the very beginning
+
+00:24:12.559 --> 00:24:16.320
+the code for such a major mode is
+
+00:24:16.320 --> 00:24:19.679
+quite simple for example
+
+00:24:19.679 --> 00:24:23.200
+this is the proposed
+
+00:24:23.200 --> 00:24:26.240
+wat-mode for WebAssembly
+
+00:24:26.240 --> 00:24:31.039
+the code is just
+
+00:24:31.039 --> 00:24:34.559
+like one page of code not
+
+00:24:34.559 --> 00:24:39.520
+not a lot
+
+00:24:39.520 --> 00:24:42.720
+you can also try writing new minor modes
+
+00:24:42.720 --> 00:24:46.559
+or writing integration packages
+
+00:24:46.559 --> 00:24:50.080
+for example a lot of package a lot of
+
+00:24:50.080 --> 00:24:50.880
+packages
+
+00:24:50.880 --> 00:24:54.559
+may benefit from tree sitter integration
+
+00:24:54.559 --> 00:24:58.840
+but no one has written the integration
+
+00:24:58.840 --> 00:25:02.960
+yet
+
+00:25:02.960 --> 00:25:05.039
+if you are interested in tree-sitter you
+
+00:25:05.039 --> 00:25:06.720
+can use these links to
+
+00:25:06.720 --> 00:25:10.320
+learn more about it I think that's it
+
+00:25:10.320 --> 00:25:11.440
+for me today
+
+00:25:11.440 --> 00:25:18.159
+I'm happy to answer any questions