WEBVTT

00:00.080 --> 00:01.360
Hi. My name is Ethan,

00:01.360 --> 00:02.320
and today I'm going to be speaking 

00:02.320 --> 00:04.240
about tree-edit, which is a package

00:04.240 --> 00:06.160
which aims to bring structural editing

00:06.160 --> 00:08.320
to everyday languages.

00:08.320 --> 00:10.559
So what is structural editing?

00:10.559 --> 00:11.657
The way that we typically 

00:11.657 --> 00:12.578
write code today

00:12.578 --> 00:14.480
is working with characters, words,

00:14.480 --> 00:16.206
lines, paragraphs, and so on, 

00:16.206 --> 00:18.600
and these objects have no real relation 

00:18.600 --> 00:21.520
to the structure of programming languages.

00:21.520 --> 00:24.667
In contrast, tree-edit's editing operations

00:24.667 --> 00:26.897
map exactly to the structure 

00:26.897 --> 00:28.411
of the programming language,

00:28.411 --> 00:30.303
which is typically in a tree form

00:30.303 --> 00:32.053
with different types of nodes

00:32.053 --> 00:33.920
such as identifiers, expressions,

00:33.920 --> 00:35.957
and statements. Using this structure 

00:35.957 --> 00:37.548
can enable much more powerful 

00:37.548 --> 00:39.200
editing operations,

00:39.200 --> 00:40.769
and crucially editing operations 

00:40.769 --> 00:42.081
that map much more closely

00:42.081 --> 00:44.960
to the way that we think about code.

00:44.960 --> 00:46.140
tree-edit was inspired by 

00:46.140 --> 00:47.386
paredit and lispy, 

00:47.386 --> 00:48.320
which are two great 

00:48.320 --> 00:50.271
Lisp structural editors.

00:50.271 --> 00:52.383
However, what makes tree-edit unique

00:52.383 --> 00:54.480
is that it can work with many languages,

00:54.480 --> 00:55.759
such as some of the 

00:55.759 --> 00:59.826
more mainstream languages like C, Java,

00:59.826 --> 01:01.600
Python, and so on.

01:01.600 --> 01:03.273
So now I'm going to show off tree-edit 

01:03.273 --> 01:05.705
in action, working with a Java program.

01:05.705 --> 01:07.237
So we can see on the left,

01:07.237 --> 01:09.119
we have a syntax tree,

01:09.119 --> 01:11.560
and the node in bold is what I call

01:11.560 --> 01:13.780
the current node. So instead of

01:13.780 --> 01:15.100
the concept of a cursor,

01:15.100 --> 01:17.600
where we have a point in 2D space,

01:17.600 --> 01:20.285
we instead work with a current node

01:20.285 --> 01:22.729
which all our editing operations 

01:22.729 --> 01:23.840
take place upon.

01:23.840 --> 01:26.479
So we can move up and down,

01:26.479 --> 01:28.720
or rather side to side,

01:28.720 --> 01:31.160
move inwards down to the children

01:31.160 --> 01:33.920
of the tree, back up to the parents.

01:33.920 --> 01:36.799
We can also jump to a node by its type. 

01:36.799 --> 01:38.768
So we're going to jump to

01:38.768 --> 01:40.880
a variable declaration.

01:40.880 --> 01:44.399
We can jump to an if statement.

01:44.399 --> 01:46.880
And as you might have noticed,

01:46.880 --> 01:48.360
tree-edit by default 

01:48.360 --> 01:51.337
uses a vim-style mode of editing,

01:51.337 --> 01:55.119
so it's a verb, which would be jump,

01:55.119 --> 01:56.874
and then a type, 

01:56.874 --> 02:00.799
which would be if statement.

02:00.799 --> 02:03.346
So now I'll show off 

02:03.346 --> 02:06.144
the syntax tree modification in action.

02:06.144 --> 02:08.000
So if I delete this deleteme node,

02:08.000 --> 02:10.112
we can see the node is deleted,

02:10.112 --> 02:12.049
and also the comma is removed

02:12.049 --> 02:13.920
since it's no longer needed.

02:13.920 --> 02:16.720
We can add some nodes back in.

02:16.720 --> 02:18.160
Here we just have a placeholder node

02:18.160 --> 02:20.391
called tree, which we can swap out

02:20.391 --> 02:21.875
with whatever we like. 

02:21.875 --> 02:24.560
So if we want to put in, for example,

02:24.560 --> 02:29.280
a plus or minus operator,

02:29.280 --> 02:30.879
it'll put these two TREE things here

02:30.879 --> 02:32.634
since there needs to be something there,

02:32.634 --> 02:37.360
but we can go fill them out as we like.

02:37.360 --> 02:38.595
So that's what that is. 

02:38.595 --> 02:41.920
Then I'll delete these again.

02:41.920 --> 02:43.709
Next we can see raising. 

02:43.709 --> 02:45.280
So if I raise reader, 

02:45.280 --> 02:46.160
then it will replace

02:46.160 --> 02:47.342
the outer function call

02:47.342 --> 02:48.583
with the node itself. 

02:48.583 --> 02:50.948
I could raise it again.

02:50.948 --> 02:53.363
The opposite operation to that 

02:53.363 --> 02:57.200
is wrapping. So I can wrap reader

02:57.200 --> 02:59.519
back into function call,

02:59.519 --> 03:03.009
and I could wrap this again 

03:03.009 --> 03:08.480
if I wanted to. So that is wrapping.

03:08.480 --> 03:12.640
We can also do it on a statement level,

03:12.640 --> 03:13.760
so if I want to wrap this

03:13.760 --> 03:14.480
in an if statement,

03:14.480 --> 03:17.034
I can wrap the statement,

03:17.034 --> 03:18.400
and there we go.

03:18.400 --> 03:21.280
And let's just raise it back up,

03:21.280 --> 03:23.200
raise it again.

03:23.200 --> 03:25.760
There we go. Finally, I'll show off

03:26.959 --> 03:28.720
slurping and barfing,

03:28.720 --> 03:32.256
which... a little bit gross words, 

03:32.256 --> 03:34.879
but I think it accurately describes

03:34.879 --> 03:37.519
the action, so let me just add

03:37.519 --> 03:41.120
a couple breaks here.

03:41.120 --> 03:44.748
So let's say we want 

03:44.748 --> 03:46.779
this if statement and a couple of breaks 

03:46.779 --> 03:48.319
to be inside of the while,

03:48.319 --> 03:50.959
so we can just slurp this up,

03:50.959 --> 03:52.433
and if we don't actually want them,

03:52.433 --> 03:54.528
we can barf them back out. 

03:54.528 --> 03:56.736
So that's where those words 

03:56.736 --> 03:57.840
have come from.

03:57.840 --> 04:01.120
And we can just... delete as we please.

04:01.120 --> 04:03.826
So yeah, that's a quick overview 

04:03.826 --> 04:07.360
of the tree editing plugin in action.

04:07.360 --> 04:08.900
So now I want to talk a little bit 

04:08.900 --> 04:12.080
about the implementation of tree-edit.

04:12.080 --> 04:14.400
Tree-edit uses the tree-sitter parser

04:14.400 --> 04:17.919
to convert text into a syntax tree.

04:17.919 --> 04:21.501
Tree-sitter is used by GitHub

04:21.501 --> 04:22.752
for its syntax highlighting, 

04:22.752 --> 04:25.280
and it's available in a bunch of editors,

04:25.280 --> 04:27.120
including Emacs, so it's 

04:27.120 --> 04:28.960
a fairly standard tool.

04:28.960 --> 04:30.960
However, the unique part about tree-edit 

04:30.960 --> 04:32.479
is how it performs 

04:32.479 --> 04:34.479
correct editing operations 

04:34.479 --> 04:35.919
on the syntax tree

04:35.919 --> 04:38.320
and then converts that back into text.

04:38.320 --> 04:41.759
So to do that, we use miniKanren,

04:41.759 --> 04:43.759
and miniKanren is an embedded

04:43.759 --> 04:45.120
domain-specific language 

04:45.120 --> 04:47.440
for logic programming. 

04:47.440 --> 04:50.080
So what exactly does that mean?

04:50.080 --> 04:51.280
In our case, it's just 

04:51.280 --> 04:54.240
an Emacs Lisp library called reazon,

04:54.240 --> 04:56.720
which exposes a set of macros 

04:56.720 --> 04:58.320
which enables us to program

04:58.320 --> 05:01.360
in this logic programming style.

05:01.360 --> 05:03.280
I'm not going to get into the details

05:03.280 --> 05:05.520
of how logic programming works.

05:05.520 --> 05:07.520
However, one of the most unique aspects

05:07.520 --> 05:09.919
about it is that we can define 

05:09.919 --> 05:13.600
a predicate and then figure out 

05:13.600 --> 05:15.280
all the inputs to the predicate 

05:15.280 --> 05:17.759
that would hold to be true.

05:17.759 --> 05:19.360
So in this case,

05:19.360 --> 05:21.520
we have our query variable q,

05:21.520 --> 05:24.479
which will be what the output is,

05:24.479 --> 05:29.120
and we are asking for all the values of q

05:29.120 --> 05:32.080
that pass this predicate of

05:32.080 --> 05:34.479
being set-equal to 1 2 3 4. 

05:34.479 --> 05:36.880
So if we execute this,

05:36.880 --> 05:40.080
it will take a little time...

05:40.080 --> 05:41.520
It shouldn't be taking this long. 

05:41.520 --> 05:43.280
Oh, there it goes.

05:43.280 --> 05:45.919
We can see that it's generated 

05:45.919 --> 05:47.520
a bunch of different answers 

05:47.520 --> 05:51.199
that are all set-equal to 1 2 3 4.

05:51.199 --> 05:52.880
So it's just a bunch of 

05:52.880 --> 05:57.280
different permutations of that.

05:57.280 --> 05:59.120
We can extend this notion 

05:59.120 --> 06:03.600
to a parser. In tree-edit, we've defined

06:03.600 --> 06:05.360
a parser in reazon,

06:05.360 --> 06:10.800
and we can use that parser to figure out

06:10.800 --> 06:15.919
any tokens that match the type of node 

06:15.919 --> 06:16.880
that we're trying to generate.

06:16.880 --> 06:19.600
If I execute this, we can see

06:19.600 --> 06:21.199
that reazon has generated 

06:21.199 --> 06:23.440
these five answers that match

06:23.440 --> 06:26.960
what a try statement is in Java.

06:26.960 --> 06:29.680
Here we can see we can have 

06:29.680 --> 06:31.919
an infinite amount of catches

06:31.919 --> 06:34.720
optionally ending with a finally,

06:34.720 --> 06:36.160
and we always have to start 

06:36.160 --> 06:39.039
with a try and a block.

06:39.039 --> 06:40.000
We can see this again 

06:40.000 --> 06:42.400
with an argument list.

06:42.400 --> 06:43.520
We have the opening and closing

06:43.520 --> 06:45.759
parentheses, and expressions 

06:45.759 --> 06:49.120
which are comma delimited.

06:49.120 --> 06:51.759
Now, for a more complex example, and

06:51.759 --> 06:53.680
something that is along the lines 

06:53.680 --> 06:55.199
of what's in tree-edit,

06:55.199 --> 06:57.919
is if we have this x here

06:57.919 --> 07:01.599
and we want to insert another expression,

07:01.599 --> 07:05.759
so x, y. We can assert 

07:05.759 --> 07:07.680
that there's some new tokens,

07:07.680 --> 07:10.160
and we want an expression 

07:10.160 --> 07:11.840
to be in those new tokens,

07:11.840 --> 07:13.280
and we can essentially state 

07:13.280 --> 07:15.039
where we want these new tokens to go

07:15.039 --> 07:19.759
within the old list of tokens,

07:19.759 --> 07:21.599
so replacing it 

07:21.599 --> 07:23.360
after the previous expression, 

07:23.360 --> 07:26.000
before the closed parentheses,

07:26.000 --> 07:26.880
and then we can state 

07:26.880 --> 07:28.560
that the whole thing parses. 

07:28.560 --> 07:30.080
If we run that, we can see that

07:30.080 --> 07:32.479
as we wanted earlier,

07:32.479 --> 07:37.120
which was a comma and then expression,

07:37.120 --> 07:39.120
we have that here as well.

07:39.120 --> 07:41.759
We can see this again.

07:41.759 --> 07:42.720
Here, the only change is that 

07:42.720 --> 07:45.280
we've moved the tokens to be

07:45.280 --> 07:46.240
before the expression.

07:46.240 --> 07:48.800
So we want to put an expression 

07:48.800 --> 07:50.560
before this x, so we want something

07:50.560 --> 07:52.560
like y, x,

07:52.560 --> 07:54.240
and if we execute that,

07:54.240 --> 07:57.919
we can see that it is correctly asserted

07:57.919 --> 07:59.039
that it would be an expression

07:59.039 --> 08:01.520
and then a comma afterwards.

08:01.520 --> 08:02.960
One last example is 

08:02.960 --> 08:04.400
if we have an if statement 

08:04.400 --> 08:07.759
and we want to add an extra block,

08:07.759 --> 08:11.599
we can see that it correctly figures out

08:11.599 --> 08:12.400
that we need an else

08:12.400 --> 08:13.840
in order to have another statement 

08:13.840 --> 08:16.720
in an if statement.

08:16.720 --> 08:19.759
So, next steps for tree-edit.

08:19.759 --> 08:21.039
The core of tree-edit is in place 

08:21.039 --> 08:23.120
but there's a lot of usability features 

08:23.120 --> 08:25.360
to add, and a lot of testing 

08:25.360 --> 08:26.400
that needs to be done

08:26.400 --> 08:29.599
in order to iron out any bugs that exist.

08:29.599 --> 08:30.960
I'd like to add support 

08:30.960 --> 08:35.200
for as many languages as is possible.

08:35.200 --> 08:36.240
I think my next step 

08:36.240 --> 08:38.490
will probably be Python.

08:38.490 --> 08:41.279
There's some performance improvements 

08:41.279 --> 08:44.080
that need to be made, since using this 

08:44.080 --> 08:45.519
logic programming language 

08:45.519 --> 08:47.600
is fairly intensive. 

08:47.600 --> 08:48.800
There's some optimizations 

08:48.800 --> 08:50.560
both on the library side 

08:50.560 --> 08:51.519
and on tree-edit side 

08:51.519 --> 08:53.360
that can be made. 

08:53.360 --> 08:55.519
Contributors are of course welcome,

08:55.519 --> 09:00.000
as tree-edit is an open source project.

09:00.000 --> 09:03.360
For future work, I think the prospect

09:03.360 --> 09:04.480
of voice controlled development

09:04.480 --> 09:06.240
with tree-edit is actually something

09:06.240 --> 09:07.920
that's really exciting,

09:07.920 --> 09:11.120
since syntax can be very cumbersome

09:11.120 --> 09:12.320
when you're working with

09:12.320 --> 09:14.240
voice control software.

09:14.240 --> 09:16.320
I can envision something like

09:16.320 --> 09:19.440
saying, "Jump to identifier,

09:19.440 --> 09:26.640
add plus operator, jump to if statement,

09:26.640 --> 09:30.480
wrap if statement in while."

09:30.480 --> 09:31.519
So that's something 

09:31.519 --> 09:33.519
I'd like to investigate.

09:33.519 --> 09:35.040
I also would just like to 

09:35.040 --> 09:37.279
provide the core functionality 

09:37.279 --> 09:39.120
of [tree-edit] as something

09:39.120 --> 09:40.399
that can be used as a library

09:40.399 --> 09:41.920
for other projects,

09:41.920 --> 09:43.839
such as refactoring packages,

09:43.839 --> 09:46.240
or other non-Vim-style approaches,

09:46.240 --> 09:49.200
and just making the syntax generation

09:49.200 --> 09:52.080
available for reuse.

09:52.080 --> 09:53.760
Finally, I'd like to thank 

09:53.760 --> 09:56.399
the authors of reazon 

09:56.399 --> 09:58.399
and elisp-tree-sitter, 

09:58.399 --> 10:00.185
which in turn packages 

10:00.185 --> 10:02.079
tree-sitter itself,

10:02.079 --> 10:05.440
since tree-edit relies very heavily

10:05.440 --> 10:07.680
on these two packages.

10:07.680 --> 10:08.959
I'd also like to thank 

10:08.959 --> 10:10.480
the author of lispy,

10:10.480 --> 10:12.720
since a lot of the design decisions 

10:12.720 --> 10:14.800
when it comes to the editing operations 

10:14.800 --> 10:18.560
are based very heavily on lispy.

10:18.560 --> 10:20.320
So that's the end of my talk.

10:20.320 --> 10:22.959
Thank you for watching.

10:22.959 --> 10:23.959
[captions by sachac]