Diffstat (limited to '2021/captions/emacsconf-2021-structural--tree-edit-structural-editing-for-java-python-c-and-beyond--ethan-leba--main.vtt')
-rw-r--r-- | 2021/captions/emacsconf-2021-structural--tree-edit-structural-editing-for-java-python-c-and-beyond--ethan-leba--main.vtt | 817 |
1 files changed, 817 insertions, 0 deletions
diff --git a/2021/captions/emacsconf-2021-structural--tree-edit-structural-editing-for-java-python-c-and-beyond--ethan-leba--main.vtt b/2021/captions/emacsconf-2021-structural--tree-edit-structural-editing-for-java-python-c-and-beyond--ethan-leba--main.vtt new file mode 100644 index 00000000..9bba8336 --- /dev/null +++ b/2021/captions/emacsconf-2021-structural--tree-edit-structural-editing-for-java-python-c-and-beyond--ethan-leba--main.vtt @@ -0,0 +1,817 @@ +WEBVTT + +00:00.080 --> 00:01.360 +Hi. My name is Ethan, + +00:01.360 --> 00:02.320 +and today I'm going to be speaking + +00:02.320 --> 00:04.240 +about tree-edit, which is a package + +00:04.240 --> 00:06.160 +which aims to bring structural editing + +00:06.160 --> 00:08.320 +to everyday languages. + +00:08.320 --> 00:10.559 +So what is structural editing? + +00:10.559 --> 00:11.657 +The way that we typically + +00:11.657 --> 00:12.578 +write code today + +00:12.578 --> 00:14.480 +is working with characters, words, + +00:14.480 --> 00:16.206 +lines, paragraphs, and so on, + +00:16.206 --> 00:18.600 +and these objects have no real relation + +00:18.600 --> 00:21.520 +to the structure of programming languages. + +00:21.520 --> 00:24.667 +In contrast, tree-edit's editing operations + +00:24.667 --> 00:26.897 +map exactly to the structure + +00:26.897 --> 00:28.411 +of the programming language, + +00:28.411 --> 00:30.303 +which is typically in a tree form + +00:30.303 --> 00:32.053 +with different types of nodes + +00:32.053 --> 00:33.920 +such as identifiers, expressions, + +00:33.920 --> 00:35.957 +and statements. Using this structure + +00:35.957 --> 00:37.548 +can enable much more powerful + +00:37.548 --> 00:39.200 +editing operations, + +00:39.200 --> 00:40.769 +and crucially editing operations + +00:40.769 --> 00:42.081 +that map much more closely + +00:42.081 --> 00:44.960 +to the way that we think about code. 
+ +00:44.960 --> 00:46.140 +tree-edit was inspired by + +00:46.140 --> 00:47.386 +paredit and lispy, + +00:47.386 --> 00:48.320 +which are two great + +00:48.320 --> 00:50.271 +Lisp structural editors. + +00:50.271 --> 00:52.383 +However, what makes tree-edit unique + +00:52.383 --> 00:54.480 +is that it can work with many languages, + +00:54.480 --> 00:55.759 +such as some of the + +00:55.759 --> 00:59.826 +more mainstream languages like C, Java, + +00:59.826 --> 01:01.600 +Python, and so on. + +01:01.600 --> 01:03.273 +So now I'm going to show off tree-edit + +01:03.273 --> 01:05.705 +in action, working with a Java program. + +01:05.705 --> 01:07.237 +So we can see on the left, + +01:07.237 --> 01:09.119 +we have a syntax tree, + +01:09.119 --> 01:11.560 +and the node in bold is what I call + +01:11.560 --> 01:13.780 +the current node. So instead of + +01:13.780 --> 01:15.100 +the concept of a cursor, + +01:15.100 --> 01:17.600 +where we have a point in 2D space, + +01:17.600 --> 01:20.285 +we instead work with a current node + +01:20.285 --> 01:22.729 +which all our editing operations + +01:22.729 --> 01:23.840 +take place upon. + +01:23.840 --> 01:26.479 +So we can move up and down, + +01:26.479 --> 01:28.720 +or rather side to side, + +01:28.720 --> 01:31.160 +move inwards down to the children + +01:31.160 --> 01:33.920 +of the tree, back up to the parents. + +01:33.920 --> 01:36.799 +We can also jump to a node by its type. + +01:36.799 --> 01:38.768 +So we're going to jump to + +01:38.768 --> 01:40.880 +a variable declaration. + +01:40.880 --> 01:44.399 +We can jump to an if statement. + +01:44.399 --> 01:46.880 +And as you might have noticed, + +01:46.880 --> 01:48.360 +tree-edit by default + +01:48.360 --> 01:51.337 +uses a vim-style mode of editing, + +01:51.337 --> 01:55.119 +so it's a verb, which would be jump, + +01:55.119 --> 01:56.874 +and then a type, + +01:56.874 --> 02:00.799 +which would be if statement. 
+ +02:00.799 --> 02:03.346 +So now I'll show off + +02:03.346 --> 02:06.144 +the syntax tree modification in action. + +02:06.144 --> 02:08.000 +So if I delete this deleteme node, + +02:08.000 --> 02:10.112 +we can see the node is deleted, + +02:10.112 --> 02:12.049 +and also the comma is removed + +02:12.049 --> 02:13.920 +since it's no longer needed. + +02:13.920 --> 02:16.720 +We can add some nodes back in. + +02:16.720 --> 02:18.160 +Here we just have a placeholder node + +02:18.160 --> 02:20.391 +called tree, which we can swap out + +02:20.391 --> 02:21.875 +with whatever we like. + +02:21.875 --> 02:24.560 +So if we want to put in, for example, + +02:24.560 --> 02:29.280 +a plus or minus operator, + +02:29.280 --> 02:30.879 +it'll put these two TREE things here + +02:30.879 --> 02:32.634 +since there needs to be something there, + +02:32.634 --> 02:37.360 +but we can go fill them out as we like. + +02:37.360 --> 02:38.595 +So that's what that is. + +02:38.595 --> 02:41.920 +Then I'll delete these again. + +02:41.920 --> 02:43.709 +Next we can see raising. + +02:43.709 --> 02:45.280 +So if I raise reader, + +02:45.280 --> 02:46.160 +then it will replace + +02:46.160 --> 02:47.342 +the outer function call + +02:47.342 --> 02:48.583 +with the node itself. + +02:48.583 --> 02:50.948 +I could raise it again. + +02:50.948 --> 02:53.363 +The opposite operation to that + +02:53.363 --> 02:57.200 +is wrapping. So I can wrap reader + +02:57.200 --> 02:59.519 +back into function call, + +02:59.519 --> 03:03.009 +and I could wrap this again + +03:03.009 --> 03:08.480 +if I wanted to. So that is wrapping. + +03:08.480 --> 03:12.640 +We can also do it on a statement level, + +03:12.640 --> 03:13.760 +so if I want to wrap this + +03:13.760 --> 03:14.480 +in an if statement, + +03:14.480 --> 03:17.034 +I can wrap the statement, + +03:17.034 --> 03:18.400 +and there we go. + +03:18.400 --> 03:21.280 +And let's just raise it back up, + +03:21.280 --> 03:23.200 +raise it again. 
+ +03:23.200 --> 03:25.760 +There we go. Finally, I'll show off + +03:26.959 --> 03:28.720 +slurping and barfing, + +03:28.720 --> 03:32.256 +which... a little bit gross words, + +03:32.256 --> 03:34.879 +but I think it accurately describes + +03:34.879 --> 03:37.519 +the action, so let me just add + +03:37.519 --> 03:41.120 +a couple breaks here. + +03:41.120 --> 03:44.748 +So let's say we want + +03:44.748 --> 03:46.779 +this if statement and a couple of breaks + +03:46.779 --> 03:48.319 +to be inside of the while, + +03:48.319 --> 03:50.959 +so we can just slurp this up, + +03:50.959 --> 03:52.433 +and if we don't actually want them, + +03:52.433 --> 03:54.528 +we can barf them back out. + +03:54.528 --> 03:56.736 +So that's where those words + +03:56.736 --> 03:57.840 +have come from. + +03:57.840 --> 04:01.120 +And we can just... delete as we please. + +04:01.120 --> 04:03.826 +So yeah, that's a quick overview + +04:03.826 --> 04:07.360 +of the tree editing plugin in action. + +04:07.360 --> 04:08.900 +So now I want to talk a little bit + +04:08.900 --> 04:12.080 +about the implementation of tree-edit. + +04:12.080 --> 04:14.400 +Tree-edit uses the tree-sitter parser + +04:14.400 --> 04:17.919 +to convert text into a syntax tree. + +04:17.919 --> 04:21.501 +Tree-sitter is used by GitHub + +04:21.501 --> 04:22.752 +for its syntax highlighting, + +04:22.752 --> 04:25.280 +and it's available in a bunch of editors, + +04:25.280 --> 04:27.120 +including Emacs, so it's + +04:27.120 --> 04:28.960 +a fairly standard tool. + +04:28.960 --> 04:30.960 +However, the unique part about tree-edit + +04:30.960 --> 04:32.479 +is how it performs + +04:32.479 --> 04:34.479 +correct editing operations + +04:34.479 --> 04:35.919 +on the syntax tree + +04:35.919 --> 04:38.320 +and then converts that back into text. 
+ +04:38.320 --> 04:41.759 +So to do that, we use miniKanren, + +04:41.759 --> 04:43.759 +and miniKanren is an embedded + +04:43.759 --> 04:45.120 +domain-specific language + +04:45.120 --> 04:47.440 +for logic programming. + +04:47.440 --> 04:50.080 +So what exactly does that mean? + +04:50.080 --> 04:51.280 +In our case, it's just + +04:51.280 --> 04:54.240 +an Emacs Lisp library called reazon, + +04:54.240 --> 04:56.720 +which exposes a set of macros + +04:56.720 --> 04:58.320 +which enables us to program + +04:58.320 --> 05:01.360 +in this logic programming style. + +05:01.360 --> 05:03.280 +I'm not going to get into the details + +05:03.280 --> 05:05.520 +of how logic programming works. + +05:05.520 --> 05:07.520 +However, one of the most unique aspects + +05:07.520 --> 05:09.919 +about it is that we can define + +05:09.919 --> 05:13.600 +a predicate and then figure out + +05:13.600 --> 05:15.280 +all the inputs to the predicate + +05:15.280 --> 05:17.759 +that would hold to be true. + +05:17.759 --> 05:19.360 +So in this case, + +05:19.360 --> 05:21.520 +we have our query variable q, + +05:21.520 --> 05:24.479 +which will be what the output is, + +05:24.479 --> 05:29.120 +and we are asking for all the values of q + +05:29.120 --> 05:32.080 +that pass this predicate of + +05:32.080 --> 05:34.479 +being set-equal to 1 2 3 4. + +05:34.479 --> 05:36.880 +So if we execute this, + +05:36.880 --> 05:40.080 +it will take a little time... + +05:40.080 --> 05:41.520 +It shouldn't be taking this long. + +05:41.520 --> 05:43.280 +Oh, there it goes. + +05:43.280 --> 05:45.919 +We can see that it's generated + +05:45.919 --> 05:47.520 +a bunch of different answers + +05:47.520 --> 05:51.199 +that are all set-equal to 1 2 3 4. + +05:51.199 --> 05:52.880 +So it's just a bunch of + +05:52.880 --> 05:57.280 +different permutations of that. + +05:57.280 --> 05:59.120 +We can extend this notion + +05:59.120 --> 06:03.600 +to a parser. 
In tree-edit, we've defined + +06:03.600 --> 06:05.360 +a parser in reazon, + +06:05.360 --> 06:10.800 +and we can use that parser to figure out + +06:10.800 --> 06:15.919 +any tokens that match the type of node + +06:15.919 --> 06:16.880 +that we're trying to generate. + +06:16.880 --> 06:19.600 +If I execute this, we can see + +06:19.600 --> 06:21.199 +that reazon has generated + +06:21.199 --> 06:23.440 +these five answers that match + +06:23.440 --> 06:26.960 +what a try statement is in Java. + +06:26.960 --> 06:29.680 +Here we can see we can have + +06:29.680 --> 06:31.919 +an infinite amount of catches + +06:31.919 --> 06:34.720 +optionally ending with a finally, + +06:34.720 --> 06:36.160 +and we always have to start + +06:36.160 --> 06:39.039 +with a try and a block. + +06:39.039 --> 06:40.000 +We can see this again + +06:40.000 --> 06:42.400 +with an argument list. + +06:42.400 --> 06:43.520 +We have the opening and closing + +06:43.520 --> 06:45.759 +parentheses, and expressions + +06:45.759 --> 06:49.120 +which are comma delimited. + +06:49.120 --> 06:51.759 +Now, for a more complex example, and + +06:51.759 --> 06:53.680 +something that is along the lines + +06:53.680 --> 06:55.199 +of what's in tree-edit, + +06:55.199 --> 06:57.919 +is if we have this x here + +06:57.919 --> 07:01.599 +and we want to insert another expression, + +07:01.599 --> 07:05.759 +so x, y. 
We can assert + +07:05.759 --> 07:07.680 +that there's some new tokens, + +07:07.680 --> 07:10.160 +and we want an expression + +07:10.160 --> 07:11.840 +to be in those new tokens, + +07:11.840 --> 07:13.280 +and we can essentially state + +07:13.280 --> 07:15.039 +where we want these new tokens to go + +07:15.039 --> 07:19.759 +within the old list of tokens, + +07:19.759 --> 07:21.599 +so replacing it + +07:21.599 --> 07:23.360 +after the previous expression, + +07:23.360 --> 07:26.000 +before the closed parentheses, + +07:26.000 --> 07:26.880 +and then we can state + +07:26.880 --> 07:28.560 +that the whole thing parses. + +07:28.560 --> 07:30.080 +If we run that, we can see that + +07:30.080 --> 07:32.479 +as we wanted earlier, + +07:32.479 --> 07:37.120 +which was a comma and then expression, + +07:37.120 --> 07:39.120 +we have that here as well. + +07:39.120 --> 07:41.759 +We can see this again. + +07:41.759 --> 07:42.720 +Here, the only change is that + +07:42.720 --> 07:45.280 +we've moved the tokens to be + +07:45.280 --> 07:46.240 +before the expression. + +07:46.240 --> 07:48.800 +So we want to put an expression + +07:48.800 --> 07:50.560 +before this x, so we want something + +07:50.560 --> 07:52.560 +like y, x, + +07:52.560 --> 07:54.240 +and if we execute that, + +07:54.240 --> 07:57.919 +we can see that it is correctly asserted + +07:57.919 --> 07:59.039 +that it would be an expression + +07:59.039 --> 08:01.520 +and then a comma afterwards. + +08:01.520 --> 08:02.960 +One last example is + +08:02.960 --> 08:04.400 +if we have an if statement + +08:04.400 --> 08:07.759 +and we want to add an extra block, + +08:07.759 --> 08:11.599 +we can see that it correctly figures out + +08:11.599 --> 08:12.400 +that we need an else + +08:12.400 --> 08:13.840 +in order to have another statement + +08:13.840 --> 08:16.720 +in an if statement. + +08:16.720 --> 08:19.759 +So, next steps for tree-edit. 
+ +08:19.759 --> 08:21.039 +The core of tree-edit is in place + +08:21.039 --> 08:23.120 +but there's a lot of usability features + +08:23.120 --> 08:25.360 +to add, and a lot of testing + +08:25.360 --> 08:26.400 +that needs to be done + +08:26.400 --> 08:29.599 +in order to iron out any bugs that exist. + +08:29.599 --> 08:30.960 +I'd like to add support + +08:30.960 --> 08:35.200 +for as many languages as is possible. + +08:35.200 --> 08:36.240 +I think my next step + +08:36.240 --> 08:38.490 +will probably be Python. + +08:38.490 --> 08:41.279 +There's some performance improvements + +08:41.279 --> 08:44.080 +that need to be made, since using this + +08:44.080 --> 08:45.519 +logic programming language + +08:45.519 --> 08:47.600 +is fairly intensive. + +08:47.600 --> 08:48.800 +There's some optimizations + +08:48.800 --> 08:50.560 +both on the library side + +08:50.560 --> 08:51.519 +and on tree-edit side + +08:51.519 --> 08:53.360 +that can be made. + +08:53.360 --> 08:55.519 +Contributors are of course welcome, + +08:55.519 --> 09:00.000 +as tree-edit is an open source project. + +09:00.000 --> 09:03.360 +For future work, I think the prospect + +09:03.360 --> 09:04.480 +of voice controlled development + +09:04.480 --> 09:06.240 +with tree-edit is actually something + +09:06.240 --> 09:07.920 +that's really exciting, + +09:07.920 --> 09:11.120 +since syntax can be very cumbersome + +09:11.120 --> 09:12.320 +when you're working with + +09:12.320 --> 09:14.240 +voice control software. + +09:14.240 --> 09:16.320 +I can envision something like + +09:16.320 --> 09:19.440 +saying, "Jump to identifier, + +09:19.440 --> 09:26.640 +add plus operator, jump to if statement, + +09:26.640 --> 09:30.480 +wrap if statement in while." + +09:30.480 --> 09:31.519 +So that's something + +09:31.519 --> 09:33.519 +I'd like to investigate. 
+ +09:33.519 --> 09:35.040 +I also would just like to + +09:35.040 --> 09:37.279 +provide the core functionality + +09:37.279 --> 09:39.120 +of [tree-edit] as something + +09:39.120 --> 09:40.399 +that can be used as a library + +09:40.399 --> 09:41.920 +for other projects, + +09:41.920 --> 09:43.839 +such as refactoring packages, + +09:43.839 --> 09:46.240 +or other non-Vim-style approaches, + +09:46.240 --> 09:49.200 +and just making the syntax generation + +09:49.200 --> 09:52.080 +available for reuse. + +09:52.080 --> 09:53.760 +Finally, I'd like to thank + +09:53.760 --> 09:56.399 +the authors of reazon + +09:56.399 --> 09:58.399 +and elisp-tree-sitter, + +09:58.399 --> 10:00.185 +which in turn packages + +10:00.185 --> 10:02.079 +tree-sitter itself, + +10:02.079 --> 10:05.440 +since tree-edit relies very heavily + +10:05.440 --> 10:07.680 +on these two packages. + +10:07.680 --> 10:08.959 +I'd also like to thank + +10:08.959 --> 10:10.480 +the author of lispy, + +10:10.480 --> 10:12.720 +since a lot of the design decisions + +10:12.720 --> 10:14.800 +when it comes to the editing operations + +10:14.800 --> 10:18.560 +are based very heavily on lispy. + +10:18.560 --> 10:20.320 +So that's the end of my talk. + +10:20.320 --> 10:22.959 +Thank you for watching. + +10:22.959 --> 10:23.959 +[captions by sachac] |
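The argument-list example in the captions (roughly 06:49 to 08:01) asks which new tokens can be placed at a given position so that the whole token list still parses. The talk does this with a relational parser written in reazon; the sketch below is not that implementation. It is a brute-force stand-in in Python, and the grammar, the `VOCAB` list, and the names `parses` and `find_insertion` are all hypothetical, chosen only to mirror the `x` to `x, y` and `x` to `y, x` cases described in the talk.

```python
import re
from itertools import product

# Assumed token-level grammar for an argument list (not tree-edit's own):
#   "(" [ expression ( "," expression )* ] ")"
ARGUMENT_LIST = re.compile(r"\((?: expression(?: , expression)*)? \)")

# Assumed token vocabulary the search may draw from.
VOCAB = ["(", ")", ",", "expression"]


def parses(tokens):
    """Return True if the token list is a valid argument list."""
    return ARGUMENT_LIST.fullmatch(" ".join(tokens)) is not None


def find_insertion(tokens, index, max_len=3):
    """Find the shortest token run containing an expression that can be
    inserted at `index` while keeping the whole list parseable."""
    for length in range(1, max_len + 1):
        for new in product(VOCAB, repeat=length):
            candidate = tokens[:index] + list(new) + tokens[index:]
            if "expression" in new and parses(candidate):
                return list(new)
    return None


# Insert after the existing expression, before the closing parenthesis.
after = find_insertion(["(", "expression", ")"], 2)
# Insert before the existing expression.
before = find_insertion(["(", "expression", ")"], 1)
print(after, before)
```

A logic-programming formulation states the same constraints (the new tokens contain an expression, they sit at a fixed position, the result parses) and lets the solver search for them, rather than enumerating candidates by hand as this sketch does.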