From 4f5b5ed84ef1ce98bfc820d3e3cc9ccd9762e9e6 Mon Sep 17 00:00:00 2001 From: Sacha Chua Date: Sat, 3 Dec 2022 12:52:35 -0500 Subject: add captions --- ...eyond-syntax-highlighting--abin-simon--main.vtt | 727 +++++++++++++++++++++ 1 file changed, 727 insertions(+) create mode 100644 2022/captions/emacsconf-2022-treesitter--treesitter-beyond-syntax-highlighting--abin-simon--main.vtt diff --git a/2022/captions/emacsconf-2022-treesitter--treesitter-beyond-syntax-highlighting--abin-simon--main.vtt b/2022/captions/emacsconf-2022-treesitter--treesitter-beyond-syntax-highlighting--abin-simon--main.vtt new file mode 100644 index 00000000..8a426e7c --- /dev/null +++ b/2022/captions/emacsconf-2022-treesitter--treesitter-beyond-syntax-highlighting--abin-simon--main.vtt @@ -0,0 +1,727 @@ +WEBVTT captioned by sachac + +00:00:00.000 --> 00:00:03.240 +Hey everyone, my name is Abin Simon + +00:00:03.240 --> 00:00:05.080 +and this talk is about "Tree-sitter: + +00:00:05.080 --> 00:00:08.200 +Beyond Syntax Highlighting." + +00:00:08.200 --> 00:00:10.720 +For those who are not aware of what Tree-sitter is, + +00:00:10.720 --> 00:00:11.720 +let me give you a quick intro. + +00:00:11.720 --> 00:00:17.120 +Tree-sitter, at its core, is a parser generator tool + +00:00:17.120 --> 00:00:19.440 +and an incremental parsing library. + +00:00:19.440 --> 00:00:22.000 +What it essentially means is that it gives you + +00:00:22.000 --> 00:00:23.154 +an always up-to-date + +00:00:23.155 --> 00:00:24.200 +AST [abstract syntax tree] of your code. + +00:00:24.200 --> 00:00:27.960 +In the current Emacs frame, what you see to the right + +00:00:27.960 --> 00:00:30.840 +is the AST tree produced by Tree-sitter + +00:00:30.840 --> 00:00:33.560 +of the code that is on the left. 
+ +00:00:33.560 --> 00:00:37.000 +For example, if you go to this "if" statement, + +00:00:37.000 --> 00:00:38.840 +you can see it goes here. + +00:00:38.840 --> 00:00:41.440 +It is also really good at handling errors. + +00:00:41.440 --> 00:00:44.400 +For example, if I were to delete this [if statement], + +00:00:44.400 --> 00:00:47.960 +it still parses out a tree as much as it can, + +00:00:47.960 --> 00:00:50.280 +but with an error node. + +00:00:50.280 --> 00:00:51.760 +Now let's see how we can query the tree + +00:00:51.760 --> 00:00:54.440 +to get the information that we need. + +00:00:54.440 --> 00:01:01.480 +Let's first try to get all the identifiers in the buffer. + +00:01:01.480 --> 00:01:04.000 +It highlights all the identifiers in the buffer, + +00:01:04.000 --> 00:01:05.440 +but let's say we want to get something + +00:01:05.440 --> 00:01:07.280 +a little more precise. + +00:01:07.280 --> 00:01:10.400 +Let's say we wanted to get this "i" here. + +00:01:10.400 --> 00:01:13.280 +This, in our case, would be this identifier + +00:01:13.280 --> 00:01:15.200 +inside this assignment expression + +00:01:15.200 --> 00:01:27.320 +inside this "for" statement. + +00:01:27.320 --> 00:01:29.920 +We can write it out like this. + +00:01:29.920 --> 00:01:31.880 +I hope this gives you a basic idea + +00:01:31.880 --> 00:01:34.480 +of how Tree-sitter works and how you can query + +00:01:34.480 --> 00:01:37.040 +to get the information that you need. + +00:01:37.040 --> 00:01:39.520 +First of all, let's see how Tree-sitter can help us + +00:01:39.520 --> 00:01:41.880 +with syntax highlighting. + +00:01:41.880 --> 00:01:46.480 +This is the default syntax highlighting by Emacs for SQL. + +00:01:46.480 --> 00:01:52.000 +Now let's see how Tree-sitter helps. + +00:01:52.000 --> 00:01:54.240 +This is the syntax highlighting in Emacs + +00:01:54.240 --> 00:01:56.760 +with Tree-sitter enabled. 
+ +00:01:56.760 --> 00:01:58.240 +You'll see that we're able to target + +00:01:58.240 --> 00:02:01.240 +a lot more things and highlight them. + +00:02:01.240 --> 00:02:03.138 +That said, you don't always have to + +00:02:03.139 --> 00:02:04.200 +highlight everything. + +00:02:04.200 --> 00:02:15.640 +I personally prefer a much simpler theme. + +00:02:15.640 --> 00:02:17.880 +Now let's see how Tree-sitter helps you simplify + +00:02:17.880 --> 00:02:20.920 +adding custom syntax highlighting to your code. + +00:02:20.920 --> 00:02:22.200 +This is a Python file which has + +00:02:22.200 --> 00:02:25.640 +a class and a few member functions. + +00:02:25.640 --> 00:02:27.680 +Anyone who has used Python will know that + +00:02:27.680 --> 00:02:32.040 +the "self" keyword, while it is passed in as an argument, + +00:02:32.040 --> 00:02:34.240 +it has more meaning than that. + +00:02:34.240 --> 00:02:35.480 +Let's see if you can use Tree-sitter + +00:02:35.480 --> 00:02:38.720 +to highlight just the "self" keyword. + +00:02:38.720 --> 00:02:40.400 +If you look at the Tree-sitter tree, + +00:02:40.400 --> 00:02:43.120 +you can see that this is the first identifier + +00:02:43.120 --> 00:02:45.520 +in the list of parameters for a function definition. + +00:02:45.520 --> 00:02:55.480 +This is how you would query for the first identifier + +00:02:55.480 --> 00:02:59.320 +inside parameters inside a function definition. + +00:02:59.320 --> 00:03:02.520 +Now, if you see here, it also matches "cls", + +00:03:02.520 --> 00:03:11.360 +but let's restrict it to match just "self". + +00:03:11.360 --> 00:03:14.200 +Now we have a Tree-sitter query that identifies + +00:03:14.200 --> 00:03:16.960 +the first argument to the function definition + +00:03:16.960 --> 00:03:19.640 +and is also called "self". + +00:03:19.640 --> 00:03:22.520 +We can use this to apply custom highlighting onto this. 
+ +00:03:22.520 --> 00:03:25.000 +This is pretty much all the code + +00:03:25.000 --> 00:03:26.520 +that you'll need to do this. + +00:03:26.520 --> 00:03:29.240 +The first block here is essentially to say to + +00:03:29.240 --> 00:03:32.160 +Tree-sitter to highlight anything with python.self + +00:03:32.160 --> 00:03:35.720 +with the face of custom-set. + +00:03:35.720 --> 00:03:37.520 +Now the second block here essentially is + +00:03:37.520 --> 00:03:39.800 +how we match for that. + +00:03:39.800 --> 00:03:41.800 +Now if you go back into a Python buffer + +00:03:41.800 --> 00:03:44.680 +and re-enable python-mode, we'll see that "self" + +00:03:44.680 --> 00:03:47.120 +is highlighted differently. + +00:03:47.120 --> 00:03:48.880 +How about creating text objects? + +00:03:48.880 --> 00:03:50.440 +Tree-sitter can help there too. + +00:03:50.440 --> 00:03:53.080 +For those who don't know, text objects + +00:03:53.080 --> 00:03:54.440 +is an idea that comes from Vim, + +00:03:54.440 --> 00:03:57.760 +and you can do things like select word, + +00:03:57.760 --> 00:04:00.520 +delete word, things like that. + +00:04:00.520 --> 00:04:06.200 +There are other text objects like line and paragraph. + +00:04:06.200 --> 00:04:09.000 +For each text object, you can have operations + +00:04:09.000 --> 00:04:09.760 +that are defined on them. + +00:04:09.760 --> 00:04:13.600 +For example, delete, copy, select, comment, + +00:04:13.600 --> 00:04:16.400 +all of these are operations that you can do. + +00:04:16.400 --> 00:04:19.400 +Let's try and use Tree-sitter to add more text objects. + +00:04:19.400 --> 00:04:20.560 +This is a plugin that I wrote + +00:04:20.560 --> 00:04:25.000 +which lets you add more text objects into Emacs. + +00:04:25.000 --> 00:04:27.880 +It helps you add code-aware text objects + +00:04:27.880 --> 00:04:31.880 +like functions, conditionals, loops, and such. 
+ +00:04:31.880 --> 00:04:34.360 +Let's see an example scenario of how + +00:04:34.360 --> 00:04:35.920 +something like this could come in handy. + +00:04:35.920 --> 00:04:39.280 +For example, I can select inside this condition + +00:04:39.280 --> 00:04:42.960 +or inside this function and do things like that. + +00:04:42.960 --> 00:04:44.520 +Let's say I want to take this conditional, + +00:04:44.520 --> 00:04:47.160 +move to the next function, and create it here. + +00:04:47.160 --> 00:04:49.640 +What I would do is something like + +00:04:49.640 --> 00:04:52.320 +delete the conditional, move to the next function, + +00:04:52.320 --> 00:04:56.240 +create a conditional there, and paste. + +00:04:56.240 --> 00:04:57.160 +Let's try another example. + +00:04:57.160 --> 00:05:01.360 +Let's say I want to take this and move it to the end. + +00:05:01.360 --> 00:05:02.960 +If I had to do it without text objects, + +00:05:02.960 --> 00:05:06.800 +I'd probably have to go back to the previous comma, + +00:05:06.800 --> 00:05:10.440 +delete till next comma, find the closing bracket, + +00:05:10.440 --> 00:05:11.880 +and paste before. + +00:05:11.880 --> 00:05:14.040 +That works, but let's see + +00:05:14.040 --> 00:05:16.520 +how Tree-sitter can simplify it. + +00:05:16.520 --> 00:05:19.240 +With Tree-sitter, I can say delete the argument, + +00:05:19.240 --> 00:05:22.880 +go to the end of the next argument, and then paste. + +00:05:22.880 --> 00:05:25.280 +Tree-sitter essentially helps Emacs + +00:05:25.280 --> 00:05:27.240 +understand the code better semantically. + +00:05:27.240 --> 00:05:29.600 +Here is yet another use case. + +00:05:29.600 --> 00:05:31.480 +I work at a remote company, + +00:05:31.480 --> 00:05:33.440 +and I often find myself being in a call + +00:05:33.440 --> 00:05:35.400 +with my teammates, explaining the code to them. 
+ +00:05:35.400 --> 00:05:38.000 +And one thing that really comes in handy + +00:05:38.000 --> 00:05:39.760 +is the narrowing capability of Emacs. + +00:05:39.760 --> 00:05:43.040 +Specifically, the fancy-narrow package. + +00:05:43.040 --> 00:05:44.840 +I use it to narrow just the function, + +00:05:44.840 --> 00:05:48.760 +or I could narrow to the conditional. + +00:05:48.760 --> 00:05:51.520 +Next on the list would be code folding. + +00:05:51.520 --> 00:05:54.480 +This is a package which uses Tree-sitter + +00:05:54.480 --> 00:05:57.560 +to improve the code folding functionalities of Emacs. + +00:05:57.560 --> 00:06:00.200 +Code folding has always been this thing + +00:06:00.200 --> 00:06:02.280 +that I've had a love-hate relationship with. + +00:06:02.280 --> 00:06:04.280 +It usually works most of the time, + +00:06:04.280 --> 00:06:06.960 +but then fails if the indentation is wrong + +00:06:06.960 --> 00:06:09.160 +or we do something weird with the arguments. + +00:06:09.160 --> 00:06:11.680 +But now with Tree-sitter in the mix, + +00:06:11.680 --> 00:06:12.720 +it's a lot more precise. + +00:06:12.720 --> 00:06:17.040 +I can fold comments, I can fold functions, + +00:06:17.040 --> 00:06:20.480 +I can fold conditionals. You get the idea. + +00:06:20.480 --> 00:06:23.840 +I work with Kubernetes, which means I end up + +00:06:23.840 --> 00:06:28.080 +having to write and read a lot of YAML files. + +00:06:28.080 --> 00:06:31.840 +And navigating big YAML files is a mess. + +00:06:31.840 --> 00:06:35.760 +The two main problems are figuring out where I am, + +00:06:35.760 --> 00:06:38.760 +and two, navigating to where I want to be. + +00:06:38.760 --> 00:06:41.760 +Let's see how Tree-sitter can help us with both of these. + +00:06:41.760 --> 00:06:43.840 +This is an example YAML file. + +00:06:43.840 --> 00:06:47.080 +To be precise, this is the values file + +00:06:47.080 --> 00:06:48.640 +of the Redis helm chart. 
+ +00:06:48.640 --> 00:06:52.240 +I'm somewhere in the file on tag under image, + +00:06:52.240 --> 00:06:54.880 +but I don't know what this tag is for. + +00:06:54.880 --> 00:06:57.240 +But with the help of Tree-sitter, + +00:06:57.240 --> 00:06:59.160 +I've been able to add this information + +00:06:59.160 --> 00:07:00.440 +into my header line. + +00:07:00.440 --> 00:07:02.960 +If you see in the header line, + +00:07:02.960 --> 00:07:05.880 +you'll see that I'm under sentinel.image. + +00:07:05.880 --> 00:07:08.800 +Now let's see how this helps with navigation. + +00:07:08.800 --> 00:07:12.680 +Let's say I want to enable persistence on master node. + +00:07:12.680 --> 00:07:18.200 +So with the help of Tree-sitter, + +00:07:18.200 --> 00:07:20.400 +I was able to enumerate every field + +00:07:20.400 --> 00:07:22.200 +that is available in this YAML file, + +00:07:22.200 --> 00:07:24.520 +and I can pass that information onto imenu, + +00:07:24.520 --> 00:07:28.040 +which I can then use to go to exactly where I want to. + +00:07:28.040 --> 00:07:30.000 +Also, since we're not dealing with + +00:07:30.000 --> 00:07:32.600 +any language specific constructs, + +00:07:32.600 --> 00:07:34.040 +this is very easy to extend to + +00:07:34.040 --> 00:07:35.760 +other similar languages + +00:07:35.760 --> 00:07:37.440 +or config files in this case. + +00:07:37.440 --> 00:07:39.520 +So for example, this is a JSON file, + +00:07:39.520 --> 00:07:44.800 +and I can navigate to location or project. + +00:07:44.800 --> 00:07:48.320 +And just like in YAML, it shows me where I'm at. + +00:07:48.320 --> 00:07:49.920 +I'm in projects.name, + +00:07:49.920 --> 00:07:52.880 +or I'm inside projects.highlights. + +00:07:52.880 --> 00:07:55.600 +Or how about Nix? + +00:07:55.600 --> 00:07:57.480 +This is my home.nix file. + +00:07:57.480 --> 00:08:01.040 +Again, I can search for services, + +00:08:01.040 --> 00:08:04.640 +and this lists me all the services that I've enabled. 
+ +00:08:04.640 --> 00:08:06.720 +How about just services.description? + +00:08:06.720 --> 00:08:08.160 +So this is all the services + +00:08:08.160 --> 00:08:10.480 +that I've enabled and have descriptions. + +00:08:10.480 --> 00:08:12.720 +Now that we have seen this for config files, + +00:08:12.720 --> 00:08:15.040 +let's see how similar things apply for code. + +00:08:15.040 --> 00:08:16.760 +Just like in config files, + +00:08:16.760 --> 00:08:18.680 +I can see which function I'm under, + +00:08:18.680 --> 00:08:21.560 +and if I go to the next function, it changes. + +00:08:21.560 --> 00:08:23.960 +Okay, here is something really awesome. + +00:08:23.960 --> 00:08:26.600 +This is probably one of my favorites, + +00:08:26.600 --> 00:08:30.400 +and one of the things that actually made me understand + +00:08:30.400 --> 00:08:34.080 +how powerful Tree-sitter is, and got me into it. + +00:08:34.080 --> 00:08:35.680 +I work with a lot of Go code, + +00:08:35.680 --> 00:08:38.840 +and anyone who has worked with Go will tell you + +00:08:38.840 --> 00:08:41.040 +how repetitive it is handling errors. + +00:08:41.040 --> 00:08:42.800 +For those who don't write Go, + +00:08:42.800 --> 00:08:45.200 +let me give you a rough idea of what I'm talking about. + +00:08:45.200 --> 00:08:47.000 +If you want to bubble up the error, + +00:08:47.000 --> 00:08:49.920 +the way you would do it is just to return the error + +00:08:49.920 --> 00:08:51.400 +to the function that called it. + +00:08:51.400 --> 00:08:55.720 +Over here, you can either return nil or an empty value, + +00:08:55.720 --> 00:08:57.640 +and at the end, you return error. + +00:08:57.640 --> 00:09:00.200 +Let's try and use Tree-sitter to do this. 
+ +00:09:00.200 --> 00:09:03.120 +Using the help of Tree-sitter, let's make Emacs + +00:09:03.120 --> 00:09:06.421 +go back, figure out what the return arguments are, + +00:09:06.422 --> 00:09:08.240 +figure out what their default values are, + +00:09:08.240 --> 00:09:11.480 +and automatically fill in the return statement. + +00:09:11.480 --> 00:09:13.040 +It would look something like this. + +00:09:13.040 --> 00:09:16.120 +In my case, it filled in the complete form, + +00:09:16.120 --> 00:09:18.320 +it figured out what the return arguments are, + +00:09:18.320 --> 00:09:19.320 +what their types are, + +00:09:19.320 --> 00:09:20.960 +and what their default values are, + +00:09:20.960 --> 00:09:22.800 +and filled out the entire return. + +00:09:22.800 --> 00:09:24.760 +And since this is a template, + +00:09:24.760 --> 00:09:27.720 +I can go to the next function, do the same thing, + +00:09:27.720 --> 00:09:29.560 +next function, do the same thing, + +00:09:29.560 --> 00:09:31.520 +next function, do the same thing. + +00:09:31.520 --> 00:09:34.360 +Here is a really fascinating use case of Tree-sitter, + +00:09:34.360 --> 00:09:36.320 +structural editing. + +00:09:36.320 --> 00:09:38.200 +You might be aware of plugins like paredit, + +00:09:38.200 --> 00:09:40.280 +which seems to "know" your code. + +00:09:40.280 --> 00:09:42.520 +This sort of takes it onto another level. + +00:09:42.520 --> 00:09:46.040 +It is in its early stages, but what this lets you do + +00:09:46.040 --> 00:09:48.920 +is completely treat your code as an AST, + +00:09:48.920 --> 00:09:52.000 +and edit as if it's a tree instead of characters. + +00:09:52.000 --> 00:09:54.640 +I am not going to go much in depth into it, + +00:09:54.640 --> 00:09:57.000 +but if you're interested, there is a talk + +00:09:57.000 --> 00:09:59.080 +from last year's EmacsConf around it. 
+ +00:09:59.080 --> 00:10:02.320 +I'm just going to end this with one last tiny thing + +00:10:02.320 --> 00:10:04.920 +that I found in the tree-sitter-extras package. + +00:10:04.920 --> 00:10:07.600 +It's this tiny macro called tree-sitter-save-excursion. + +00:10:07.600 --> 00:10:11.240 +It works pretty much like save-excursion, but better. + +00:10:11.240 --> 00:10:13.400 +It uses the Tree-sitter syntax tree + +00:10:13.400 --> 00:10:14.800 +instead of just the code + +00:10:14.800 --> 00:10:16.720 +to figure out where to restore the position. + +00:10:16.720 --> 00:10:20.200 +My main use case for this was with code formatters. + +00:10:20.200 --> 00:10:22.080 +Since the code moves around a lot + +00:10:22.080 --> 00:10:23.160 +when it gets formatted, + +00:10:23.160 --> 00:10:25.000 +save-excursion was completely useless, + +00:10:25.000 --> 00:10:26.240 +but this came in handy. + +00:10:26.240 --> 00:10:28.120 +I'll just leave you off with + +00:10:28.120 --> 00:10:31.120 +what the future of Tree-sitter looks like for Emacs. + +00:10:31.120 --> 00:10:33.760 +So far, every Tree-sitter related feature + +00:10:33.760 --> 00:10:36.040 +that I've talked about is powered by this library. + +00:10:36.040 --> 00:10:42.320 +But there is talk about Tree-sitter coming into the core. + +00:10:42.320 --> 00:10:45.840 +It will most probably be landing in Emacs 29, + +00:10:45.840 --> 00:10:48.720 +and if you want to check out the work on Tree-sitter + +00:10:48.720 --> 00:10:51.200 +in core Emacs, you can check out + +00:10:51.200 --> 00:10:52.920 +the features/tree-sitter branch. + +00:10:52.920 --> 00:10:56.640 +You'll probably see more and more features and packages + +00:10:56.640 --> 00:10:59.640 +relying upon Tree-sitter, and even major modes + +00:10:59.640 --> 00:11:01.560 +being powered by Tree-sitter. + +00:11:01.560 --> 00:11:03.880 +And that's a wrap from me. Thank you.
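
The config-file navigation described in the captions around 07:08 to 07:28 (enumerate every field in the file, then hand the list to imenu) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions: it walks an already-parsed nested dict instead of a Tree-sitter syntax tree, and the `dotted_paths` helper and the sample `values` data are hypothetical, not taken from the speaker's package or from the real Redis chart.

```python
from typing import Any, Iterator


def dotted_paths(node: Any, prefix: str = "") -> Iterator[str]:
    """Yield a dotted path for every key in a nested mapping, parents first."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield path
            # Recurse so nested keys get their parent's path as a prefix.
            yield from dotted_paths(value, path)


# Hypothetical stand-in for a parsed values file (not the real chart contents).
values = {
    "master": {"persistence": {"enabled": False}},
    "sentinel": {"image": {"tag": "example"}},
}

paths = list(dotted_paths(values))
for p in paths:
    print(p)
```

A list like `paths` is the kind of flat index a jump-to-entry menu needs, and the path of the key enclosing the cursor (for example `sentinel.image` for the `tag` key) is what a header line would show. Because nothing here depends on YAML specifically, the same walk would apply to any format that parses into nested mappings, which matches the point made in the talk about JSON and Nix.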