summaryrefslogtreecommitdiffstats
path: root/2024/captions
diff options
context:
space:
mode:
authorEmacsConf <emacsconf-org@gnu.org>2024-12-07 10:40:14 -0500
committerEmacsConf <emacsconf-org@gnu.org>2024-12-07 10:40:14 -0500
commit0ae4048e5db76c5521e6d1d2477cef2a269f5fb0 (patch)
tree82a4e6c22890afdb4f4551dcf1287956632ca833 /2024/captions
parent9c5b377334848a57fd412da72acf7e3a1fa7c63e (diff)
downloademacsconf-wiki-0ae4048e5db76c5521e6d1d2477cef2a269f5fb0.tar.xz
emacsconf-wiki-0ae4048e5db76c5521e6d1d2477cef2a269f5fb0.zip
Automated commit
Diffstat (limited to '2024/captions')
-rw-r--r--2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main--chapters.vtt92
-rw-r--r--2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main.vtt1351
2 files changed, 1443 insertions, 0 deletions
diff --git a/2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main--chapters.vtt b/2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main--chapters.vtt
new file mode 100644
index 00000000..ef14c4ae
--- /dev/null
+++ b/2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main--chapters.vtt
@@ -0,0 +1,92 @@
+WEBVTT
+
+
+00:00:00.000 --> 00:00:17.081
+Rune
+
+00:00:17.082 --> 00:00:57.167
+The Emacs core
+
+00:00:57.168 --> 00:01:55.864
+Why create this?
+
+00:01:55.865 --> 00:03:01.314
+How does this compare to other projects?
+
+00:03:01.315 --> 00:03:32.440
+Multi-threading
+
+00:03:32.441 --> 00:03:47.647
+Multi-threading elisp
+
+00:03:47.648 --> 00:04:32.637
+No-GIL method
+
+00:04:32.638 --> 00:04:51.251
+Actors
+
+00:04:51.252 --> 00:05:34.679
+Multi-threading elisp (functions)
+
+00:05:34.680 --> 00:05:57.089
+Caveats
+
+00:05:57.090 --> 00:06:38.248
+Multi-threading elisp (data)
+
+00:06:38.249 --> 00:06:57.883
+Copy values to other threads on demands
+
+00:06:57.884 --> 00:08:11.902
+Multi-threading elisp (buffers)
+
+00:08:11.903 --> 00:08:46.918
+Would this actually be useful?
+
+00:08:46.919 --> 00:09:16.536
+Precise garbage collection
+
+00:09:16.537 --> 00:10:38.712
+How Emacs used to deal with roots
+
+00:10:38.713 --> 00:11:00.156
+Conservative stack scanning
+
+00:11:00.157 --> 00:12:38.828
+Movable objects
+
+00:12:38.829 --> 00:14:13.226
+How Rust makes precise GC easy
+
+00:14:13.227 --> 00:15:14.559
+Other Rust niceties: proc macro
+
+00:15:14.560 --> 00:16:01.040
+sum types
+
+00:16:01.041 --> 00:16:16.051
+Regex
+
+00:16:16.052 --> 00:16:27.209
+Parsers
+
+00:16:27.210 --> 00:16:58.918
+Other changes: GUI first, terminal second
+
+00:16:58.919 --> 00:17:16.304
+Off-screen cursor
+
+00:17:16.305 --> 00:17:24.439
+Image flow
+
+00:17:24.440 --> 00:18:36.344
+Testing
+
+00:18:36.345 --> 00:19:07.246
+Status
+
+00:19:07.247 --> 00:19:22.738
+Next directions
+
+00:19:22.739 --> 00:20:06.000
+How to get involved
diff --git a/2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main.vtt b/2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main.vtt
new file mode 100644
index 00000000..05826fb8
--- /dev/null
+++ b/2024/captions/emacsconf-2024-rust--an-experimental-emacs-core-in-rust--troy-hinckley--main.vtt
@@ -0,0 +1,1351 @@
+WEBVTT captioned by sachac, checked by bhavin
+
+NOTE Rune
+
+00:00:00.000 --> 00:00:05.119
+Hello, EmacsConf. My name is Troy Hinckley, and this is my
+
+00:00:05.120 --> 00:00:08.759
+talk on Rune, a Rust implementation in Emacs. We strive to be
+
+00:00:08.760 --> 00:00:11.839
+bug compatible with Emacs, so you can use the same Elisp.
+
+00:00:11.840 --> 00:00:14.879
+It's still a fairly early stage experimental project, and
+
+00:00:14.880 --> 00:00:17.081
+we have some basic things implemented.
+
+NOTE The Emacs core
+
+00:00:17.082 --> 00:00:19.946
+Before I get started, I want to talk a bit more
+
+00:00:19.947 --> 00:00:21.847
+about what the core is.
+
+00:00:21.848 --> 00:00:24.559
+So the Emacs core, it includes the runtime, the interpreter,
+
+00:00:24.560 --> 00:00:26.439
+garbage collector, everything used to run the code.
+
+00:00:26.440 --> 00:00:29.799
+It includes the GUI. It includes all the data structures.
+
+00:00:29.800 --> 00:00:31.919
+If you look underneath all the Elisp data structures,
+
+00:00:31.920 --> 00:00:33.599
+there's C code underneath there,
+
+00:00:33.600 --> 00:00:35.559
+as well as the auxiliary functions
+
+00:00:35.560 --> 00:00:39.239
+of which there's about 1500. In making this talk, I don't
+
+00:00:39.240 --> 00:00:40.919
+want to give the impression that I'm saying the core is
+
+00:00:40.920 --> 00:00:42.879
+outdated or that needs to be replaced or that it can't be
+
+00:00:42.880 --> 00:00:45.519
+evolved on its own, because clearly it has continued to
+
+00:00:45.520 --> 00:00:48.319
+evolve. If we look in just the last few years, we can see that
+
+00:00:48.320 --> 00:00:50.439
+we've added native compilation, we've added tree-sitter
+
+00:00:50.440 --> 00:00:52.759
+support, we've added color emoji, and there's work right
+
+00:00:52.760 --> 00:00:57.167
+now to add a new garbage collector to Emacs as well.
+
+NOTE Why create this?
+
+00:00:57.168 --> 00:01:01.071
+Why create this project? Emacs has a long history.
+
+00:01:01.072 --> 00:01:04.535
+It has a lot of users. It needs to support a big community.
+
+00:01:04.536 --> 00:01:06.837
+Because of that, it has to be very conservative
+
+00:01:06.838 --> 00:01:10.321
+about what things it can allow into the project.
+
+00:01:10.322 --> 00:01:11.639
+Forks like this create an
+
+00:01:11.640 --> 00:01:15.586
+opportunity to experiment and try new approaches.
+
+00:01:15.587 --> 00:01:18.799
+This is particularly a good use case for Rust because the C core,
+
+00:01:18.800 --> 00:01:20.849
+it's pretty well tested. It's been around for a long time.
+
+00:01:20.850 --> 00:01:22.959
+A lot of the bugs have been ironed out, but when you're doing a
+
+00:01:22.960 --> 00:01:26.439
+new greenfield project, it's very easy to introduce new
+
+00:01:26.440 --> 00:01:28.774
+undefined behavior and memory unsafety
+
+00:01:28.775 --> 00:01:32.376
+and stuff like that. Rust protects us from most of that,
+
+00:01:32.377 --> 00:01:34.937
+but it also gives us the ability to be fast
+
+00:01:34.938 --> 00:01:37.883
+and has a strong ecosystem behind it.
+
+00:01:37.884 --> 00:01:40.399
+Rust is also really good at multi-threading.
+
+00:01:40.400 --> 00:01:43.399
+Their phrase in the community is fearless concurrency.
+
+00:01:43.400 --> 00:01:45.559
+They should be able to write concurrent programs without
+
+00:01:45.560 --> 00:01:49.319
+having to worry about data races. It's also really high
+
+00:01:49.320 --> 00:01:51.839
+performance. It has a really good regex engine. It's known
+
+00:01:51.840 --> 00:01:55.864
+for its non-copy I/O as well.
+
+NOTE How does this compare to other projects?
+
+00:01:55.865 --> 00:01:57.479
+How does this compare to other
+
+00:01:57.480 --> 00:01:59.919
+Rust and Emacs projects, whether there's been a couple? The
+
+00:01:59.920 --> 00:02:02.799
+first is Remacs. This project was the first. It took an
+
+00:02:02.800 --> 00:02:05.519
+outside-in approach. Basically you could take a C
+
+00:02:05.520 --> 00:02:09.319
+function and replace it with a Rust function and build it
+
+00:02:09.320 --> 00:02:11.799
+together as one executable. This is pretty easy to do
+
+00:02:11.800 --> 00:02:14.639
+because they could both talk over the C ABI. You could
+
+00:02:14.640 --> 00:02:16.479
+swap out functions once at a time. They made really good
+
+00:02:16.480 --> 00:02:20.279
+progress at first, but eventually they ran into the problem
+
+00:02:20.280 --> 00:02:23.079
+that as you get down to the really core parts of it, you can't
+
+00:02:23.080 --> 00:02:25.919
+just replace one function at a time anymore, because some of
+
+00:02:25.920 --> 00:02:28.159
+that functionality is connected to other things. Like for
+
+00:02:28.160 --> 00:02:30.359
+example, you can't replace the garbage collector without
+
+00:02:30.360 --> 00:02:32.759
+replacing the entire garbage collection system. So the
+
+00:02:32.760 --> 00:02:36.279
+progress really kind of slowed down. Another issue with it
+
+00:02:36.280 --> 00:02:38.839
+was, is that they were doing a one-to-one rewrite, so they
+
+00:02:38.840 --> 00:02:41.079
+weren't adding any new features or functionality, just
+
+00:02:41.080 --> 00:02:43.879
+taking the same code and replacing it in Rust, which doesn't
+
+00:02:43.880 --> 00:02:46.801
+add any advantages in and of itself.
+
+00:02:46.802 --> 00:02:50.399
+This spawned Emacs-NG, which was kind of the spiritual successor to
+
+00:02:50.400 --> 00:02:52.746
+Remacs, where they decided to add new functionality,
+
+00:02:52.747 --> 00:02:55.808
+the biggest one being a JavaScript runtime,
+
+00:02:55.809 --> 00:02:58.230
+as well as some new renderers to Emacs.
+
+00:02:58.231 --> 00:03:01.314
+This is no longer actively developed though.
+
+NOTE Multi-threading
+
+00:03:01.315 --> 00:03:04.079
+In this project, one of the big focuses we have is
+
+00:03:04.080 --> 00:03:07.559
+on multi-threading. The C core itself is, everything is
+
+00:03:07.560 --> 00:03:09.959
+designed around being single-threaded, all the data
+
+00:03:09.960 --> 00:03:13.039
+structures and everything like that. Rust has a great
+
+00:03:13.040 --> 00:03:15.719
+concurrency story. In Rust, everything is intended to be
+
+00:03:15.720 --> 00:03:18.199
+multi-threaded. That doesn't mean that everything has to
+
+00:03:18.200 --> 00:03:20.719
+run on multiple threads, but you can't write something that
+
+00:03:20.720 --> 00:03:22.719
+is limited to only running in a single-threaded
+
+00:03:22.720 --> 00:03:25.799
+environment. So this makes it really easy to use all the
+
+00:03:25.800 --> 00:03:28.039
+existing packages and build something that is
+
+00:03:28.040 --> 00:03:30.480
+concurrency safe. which is what we've done here,
+
+00:03:30.481 --> 00:03:32.440
+and that was relatively easy to do.
+
+NOTE Multi-threading elisp
+
+00:03:32.441 --> 00:03:34.781
+But adding it to Elisp is the hard part,
+
+00:03:34.782 --> 00:03:36.502
+because we've got to come up with a good model
+
+00:03:36.503 --> 00:03:39.624
+for Lisp, and Elisp is just a giant ball
+
+00:03:39.625 --> 00:03:41.479
+of mutable state. We need to find some
+
+00:03:41.480 --> 00:03:44.566
+way to tame that so we can make workable concurrency
+
+00:03:44.567 --> 00:03:47.647
+out of it. There's really two ways you can do this.
+
+NOTE No-GIL method
+
+00:03:47.648 --> 00:03:49.268
+One is what I call the no-GIL method.
+
+00:03:49.269 --> 00:03:51.399
+This is what Python is doing, where
+
+00:03:51.400 --> 00:03:53.919
+you take all of your data structures, you make them
+
+00:03:53.920 --> 00:03:56.439
+concurrency safe, and then you just leave it up to the
+
+00:03:56.440 --> 00:03:58.119
+programmer to decide what they're going to do with it.
+
+00:03:58.120 --> 00:04:00.467
+They've got to build safe abstractions on top of that.
+
+00:04:00.468 --> 00:04:03.086
+One of the big downsides with this is that
+
+00:04:03.087 --> 00:04:05.247
+it comes with a pretty high cost.
+
+00:04:05.248 --> 00:04:07.799
+The last benchmarks I've seen is that by making
+
+00:04:07.800 --> 00:04:10.879
+everything concurrency safe in Python, single-threaded
+
+00:04:10.880 --> 00:04:15.799
+code is about 20% slower in some benchmarks.
+
+00:04:15.800 --> 00:04:19.079
+Since most code is single-threaded, this has a big
+
+00:04:19.080 --> 00:04:21.039
+performance impact for most code that isn't taking
+
+00:04:21.040 --> 00:04:23.719
+advantage of the multi-threading. The other thing is this
+
+00:04:23.720 --> 00:04:26.279
+introduces a lot of nasty concurrency bugs because you can
+
+00:04:26.280 --> 00:04:29.039
+have anything mutating any part of the data from any thread,
+
+00:04:29.040 --> 00:04:32.637
+even if you can't have memory unsafety per se.
+
+NOTE Actors
+
+00:04:32.638 --> 00:04:34.738
+The other option is actors,
+
+00:04:34.739 --> 00:04:36.639
+which are a really known way to approach this,
+
+00:04:36.640 --> 00:04:39.079
+where you trade some of that flexibility that you get
+
+00:04:39.080 --> 00:04:43.719
+with fully concurrent for more control and. Code and
+
+00:04:43.720 --> 00:04:45.839
+functions are shared between all the different threads,
+
+00:04:45.840 --> 00:04:50.599
+but data has to be passed along channels between different
+
+00:04:50.600 --> 00:04:51.251
+actors.
+
+NOTE Multi-threading elisp (functions)
+
+00:04:51.252 --> 00:04:52.919
+We want the functions to be shared, and this
+
+00:04:52.920 --> 00:04:55.159
+should be pretty easy because we don't mutate functions
+
+00:04:55.160 --> 00:05:00.119
+like we do data, except when we do. In Lisp, functions are
+
+00:05:00.120 --> 00:05:03.239
+just lists like anything else. So you can mutate them
+
+00:05:03.240 --> 00:05:06.279
+just like lists. Even if you're not talking about
+
+00:05:06.280 --> 00:05:09.159
+interpreted code, like if you have a native compiled
+
+00:05:09.160 --> 00:05:11.959
+function, you can still mutate the constants inside the
+
+00:05:11.960 --> 00:05:14.839
+function. For example, here we have a function returns a
+
+00:05:14.840 --> 00:05:17.679
+string. We take that string out, we mutate that string, and
+
+00:05:17.680 --> 00:05:23.079
+now the function returns a different string. In Rune, we
+
+00:05:23.080 --> 00:05:24.999
+enforce that all functions, their constants are
+
+00:05:25.000 --> 00:05:27.199
+immutable. You can't mutate the insides of a function. You
+
+00:05:27.200 --> 00:05:29.239
+can still swap out functions and redefine them, but you
+
+00:05:29.240 --> 00:05:32.239
+can't mutate the inside of a function. This enables them
+
+00:05:32.240 --> 00:05:34.679
+to be safely shared across threads.
+
+NOTE Caveats
+
+00:05:34.680 --> 00:05:36.400
+However, there are some caveats to this.
+
+00:05:36.401 --> 00:05:38.159
+For example, some functions actually do
+
+00:05:38.160 --> 00:05:41.439
+need to mutate their own data. The example that we run into is
+
+00:05:41.440 --> 00:05:44.839
+cl-generic. It uses a method cache. So it has to be able to
+
+00:05:44.840 --> 00:05:47.639
+update that cache. In this case, we just made a special
+
+00:05:47.640 --> 00:05:50.799
+case for this particular situation, but we don't know what
+
+00:05:50.800 --> 00:05:53.159
+more of these we're gonna run into the future where this is
+
+00:05:53.160 --> 00:05:57.089
+needed behavior to be able to mutate a function.
+
+NOTE Multi-threading elisp (data)
+
+00:05:57.090 --> 00:05:59.810
+Okay, so functions are pretty easy.
+
+00:05:59.811 --> 00:06:00.919
+They just can be shared between
+
+00:06:00.920 --> 00:06:05.159
+threads, but data can't be immutable, at least not into the
+
+00:06:05.160 --> 00:06:08.759
+model that Emacs currently has. We have two different
+
+00:06:08.760 --> 00:06:12.039
+ways to handle this. One is we require whenever you're
+
+00:06:12.040 --> 00:06:14.399
+calling some other code in a different thread, you have to
+
+00:06:14.400 --> 00:06:17.039
+send all the variables that it's going to need over to that
+
+00:06:17.040 --> 00:06:19.159
+thread. This is how you traditionally do inside actors.
+
+00:06:19.160 --> 00:06:21.919
+Any data that needs to go to a different actor needs to be sent
+
+00:06:21.920 --> 00:06:25.519
+over a channel. It's relatively easy implementation, but
+
+00:06:25.520 --> 00:06:28.159
+this is difficult in the Emacs case because everything is
+
+00:06:28.160 --> 00:06:30.799
+going to be accessing different variables. That means
+
+00:06:30.800 --> 00:06:33.119
+when you call something, you have to know ahead of time, all
+
+00:06:33.120 --> 00:06:34.879
+the different variables that are gonna be accessed inside
+
+00:06:34.880 --> 00:06:38.248
+that other thread and put those in when you call it.
+
+NOTE Copy values to other threads on demands
+
+00:06:38.249 --> 00:06:40.959
+The other option we're using is we're copying values to the
+
+00:06:40.960 --> 00:06:43.439
+other threads on demand. If you're running a thread, it
+
+00:06:43.440 --> 00:06:45.759
+tries to look up a variable. It doesn't have any value for
+
+00:06:45.760 --> 00:06:48.759
+that variable. It will go back and ask the main thread and it
+
+00:06:48.760 --> 00:06:50.959
+will copy that value into that thread and it can continue
+
+00:06:50.960 --> 00:06:53.399
+execution. This is nice because you can just launch some
+
+00:06:53.400 --> 00:06:55.639
+code and it'll take care of handling all the data transfer
+
+00:06:55.640 --> 00:06:57.883
+for you.
+
+NOTE Multi-threading elisp (buffers)
+
+00:06:57.884 --> 00:07:00.359
+But we don't want to be copying around is buffers,
+
+00:07:00.360 --> 00:07:04.199
+because they can be really large. In this case, we have a
+
+00:07:04.200 --> 00:07:07.599
+mutex. Each thread could only have one current buffer that
+
+00:07:07.600 --> 00:07:12.279
+it has an exclusive lock to. This comes with some
+
+00:07:12.280 --> 00:07:16.079
+trade-offs, big one being that if the user tries to access
+
+00:07:16.080 --> 00:07:18.359
+some buffer, they want to type something, and a background
+
+00:07:18.360 --> 00:07:20.239
+thread is holding onto that buffer, what do we do in that
+
+00:07:20.240 --> 00:07:24.959
+situation? And we still need to hold an exclusive lock, even
+
+00:07:24.960 --> 00:07:26.359
+if we're only going to read a buffer. If you have multiple
+
+00:07:26.360 --> 00:07:29.159
+readers, they each still need to take turns because we can't
+
+00:07:29.160 --> 00:07:30.999
+determine if at some point a thread is going to try and mutate
+
+00:07:31.000 --> 00:07:33.879
+the buffer. It has to be an exclusive lock. The other issue
+
+00:07:33.880 --> 00:07:37.799
+is buffer-locals. This is less of a implementation issue
+
+00:07:37.800 --> 00:07:40.519
+as much as it is a technical issue. Because you think about
+
+00:07:40.520 --> 00:07:42.759
+when we switch to a buffer, it has some buffer-local data and
+
+00:07:42.760 --> 00:07:45.399
+we have some thread-local data. As we go through, we're
+
+00:07:45.400 --> 00:07:47.599
+mutating everything. Those can get intertwined and
+
+00:07:47.600 --> 00:07:49.719
+pointing to each other. Then we switch away from that
+
+00:07:49.720 --> 00:07:51.679
+buffer. We need some quick way to be able to separate those
+
+00:07:51.680 --> 00:07:54.279
+out. The buffer-locals can go with the buffer-locals and
+
+00:07:54.280 --> 00:07:56.439
+the thread data can stay with thread data and make copies of
+
+00:07:56.440 --> 00:07:58.719
+anything that was pointing to the other side. But we don't
+
+00:07:58.720 --> 00:08:02.839
+have a good method to determine how to separate those two,
+
+00:08:02.840 --> 00:08:05.359
+like what data belongs to this and what data belongs to this,
+
+00:08:05.360 --> 00:08:08.199
+so that we can do that quickly. We haven't found a good
+
+00:08:08.200 --> 00:08:09.599
+solution to that yet, but it's something we're still
+
+00:08:09.600 --> 00:08:11.902
+working on.
+
+NOTE Would this actually be useful?
+
+00:08:11.903 --> 00:08:13.079
+The question is, would this actually be
+
+00:08:13.080 --> 00:08:15.959
+useful for doing real work inside Emacs? I would say,
+
+00:08:15.960 --> 00:08:17.959
+yes, there's a lot of things you can do with this. You could
+
+00:08:17.960 --> 00:08:20.239
+handle process output in the background. You can do syntax
+
+00:08:20.240 --> 00:08:23.479
+highlighting. You can do buffer search in parallel. You can
+
+00:08:23.480 --> 00:08:26.679
+do LSP. You can do fetching your mail in the background. You
+
+00:08:26.680 --> 00:08:29.639
+can have a window manager that doesn't block your window
+
+00:08:29.640 --> 00:08:32.319
+manager when Emacs is blocked. You could do
+
+00:08:32.320 --> 00:08:34.479
+something like a file system watcher that keeps up on files
+
+00:08:34.480 --> 00:08:37.559
+without blocking Emacs. This wouldn't be so great for
+
+00:08:37.560 --> 00:08:39.159
+building concurrent data structures or operating on
+
+00:08:39.160 --> 00:08:42.199
+shared data or building your own abstractions, because of the
+
+00:08:42.200 --> 00:08:46.039
+trade-offs that we've made here. Okay. That's talking
+
+00:08:46.040 --> 00:08:46.918
+about multi-threading.
+
+NOTE Precise garbage collection
+
+00:08:46.919 --> 00:08:47.599
+The other thing we're going to talk
+
+00:08:47.600 --> 00:08:51.319
+about is precise garbage collection. In Rune, we have a
+
+00:08:51.320 --> 00:08:54.439
+safe, precise garbage collection because of the Rust type
+
+00:08:54.440 --> 00:08:58.119
+system. Let's look at what the problem is with garbage
+
+00:08:58.120 --> 00:09:00.479
+collection in the first place. Really, the tricky part
+
+00:09:00.480 --> 00:09:03.719
+about garbage collection is rooting. How do we find out what
+
+00:09:03.720 --> 00:09:06.159
+the roots are? These are all the values that are on the
+
+00:09:06.160 --> 00:09:08.679
+stack or inside the registers. In this example here, we
+
+00:09:08.680 --> 00:09:11.919
+allocate an object. We call garbage_collect, that object's
+
+00:09:11.920 --> 00:09:13.536
+collected, and then we try and return it.
+
+00:09:13.537 --> 00:09:16.536
+It's no longer valid.
+
+NOTE How Emacs used to deal with roots
+
+00:09:16.537 --> 00:09:19.039
+Let's look at how Emacs used to deal with this
+
+00:09:19.040 --> 00:09:22.559
+problem way back in the day. There was a system called gcpro
+
+00:09:22.560 --> 00:09:26.319
+or GC Protect, which is basically designed that every time a
+
+00:09:26.320 --> 00:09:28.919
+value needed to survive past a garbage collection point,
+
+00:09:28.920 --> 00:09:32.359
+you had to try and protect it. In order to do this, you had
+
+00:09:32.360 --> 00:09:35.439
+to declare a struct, you had to put a macro around it to root
+
+00:09:35.440 --> 00:09:37.999
+the object, and then you had to unroot it when you were done--
+
+00:09:38.000 --> 00:09:41.559
+past the garbage collection. Now the value is safe. You
+
+00:09:41.560 --> 00:09:44.039
+can see down here, I pulled these eight rules out from a
+
+00:09:44.040 --> 00:09:46.919
+really old version of the Emacs manual about all the things
+
+00:09:46.920 --> 00:09:49.279
+you had to keep track of when you were trying to use this
+
+00:09:49.280 --> 00:09:52.319
+system. All right, so there was a special handling for
+
+00:09:52.320 --> 00:09:54.639
+nested GC protects. You had to make sure the memory was
+
+00:09:54.640 --> 00:09:58.239
+initialized. You had to make sure that traps couldn't occur
+
+00:09:58.240 --> 00:10:00.839
+between allocating and when GC protect would happen. It
+
+00:10:00.840 --> 00:10:03.319
+can be tricky because you don't always know when a function
+
+00:10:03.320 --> 00:10:06.879
+that's getting called could potentially call garbage
+
+00:10:06.880 --> 00:10:10.719
+collection. So if you got something wrong, you also
+
+00:10:10.720 --> 00:10:12.719
+might not catch it for a long time because garbage
+
+00:10:12.720 --> 00:10:15.719
+collection may only get called one out of 99 times. The other
+
+00:10:15.720 --> 00:10:18.999
+99 times is just fine. That one time it happens and you
+
+00:10:19.000 --> 00:10:22.559
+can't reproduce the issue. When you do get this wrong and
+
+00:10:22.560 --> 00:10:24.439
+some, something doesn't get rooted and it gets
+
+00:10:24.440 --> 00:10:26.319
+overwritten, it generally doesn't show up right where the
+
+00:10:26.320 --> 00:10:28.799
+problem is. It gets showed up way later when you actually try
+
+00:10:28.800 --> 00:10:31.479
+and access the value and the value is invalid. You've got
+
+00:10:31.480 --> 00:10:33.639
+to track it back to where that thing did not get properly
+
+00:10:33.640 --> 00:10:37.359
+rooted. It's a huge source of bugs and very hard to
+
+00:10:37.360 --> 00:10:38.712
+maintain.
+
+NOTE Conservative stack scanning
+
+00:10:38.713 --> 00:10:40.119
+Emacs decided to go with a different path,
+
+00:10:40.120 --> 00:10:42.399
+which we call conservative stack scanning. Basically,
+
+00:10:42.400 --> 00:10:45.239
+the garbage collector just looks at the stack and all the
+
+00:10:45.240 --> 00:10:47.959
+registers and any data inside there that looks like it could
+
+00:10:47.960 --> 00:10:52.279
+be a pointer, it treats it as a pointer. This is nice because
+
+00:10:52.280 --> 00:10:54.711
+you get really easy root tracking,
+
+00:10:54.712 --> 00:10:56.113
+but it also comes with some trade-offs,
+
+00:10:56.114 --> 00:11:00.156
+mostly that your objects are no longer movable.
+
+NOTE Movable objects
+
+00:11:00.157 --> 00:11:03.079
+Why would we want movable objects in Emacs?
+
+00:11:03.080 --> 00:11:05.839
+There's a couple of different reasons. One is compaction.
+
+00:11:05.840 --> 00:11:08.199
+You can take all your heap, you can pack that on down because
+
+00:11:08.200 --> 00:11:11.239
+you can coalesce all your objects together. Another is that
+
+00:11:11.240 --> 00:11:13.239
+it's easy to implement generational garbage collection.
+
+00:11:13.240 --> 00:11:16.039
+You can just copy everything out of your minor heap into your
+
+00:11:16.040 --> 00:11:21.839
+older heap. Really, Emacs is kind of uniquely ideal for
+
+00:11:21.840 --> 00:11:24.279
+generational collection, because the typical way we
+
+00:11:24.280 --> 00:11:27.799
+interact with Emacs is as a series of commands. You execute
+
+00:11:27.800 --> 00:11:29.959
+some command, you'd execute the next command, you execute
+
+00:11:29.960 --> 00:11:33.199
+a command. It could be happening every key press, it could be
+
+00:11:33.200 --> 00:11:36.759
+happening with M-x. However long that command is, that is
+
+00:11:36.760 --> 00:11:40.959
+the ideal length for the minor collection generation, the
+
+00:11:40.960 --> 00:11:43.399
+first generation. Because once you're done with that
+
+00:11:43.400 --> 00:11:45.879
+generation, anything that's still existing is going to be
+
+00:11:45.880 --> 00:11:49.079
+around for a very long time. So that works out really well
+
+00:11:49.080 --> 00:11:52.279
+for Emacs. We want to make this a generational collector.
+
+00:11:52.280 --> 00:11:56.199
+The other thing is with object layout. We use a lot of lists
+
+00:11:56.200 --> 00:12:00.559
+inside Emacs Lisp. Every time you go to the cdr, you've
+
+00:12:00.560 --> 00:12:03.039
+got to be chasing a pointer around the heap and following
+
+00:12:03.040 --> 00:12:05.439
+that. That can potentially result in cache misses and
+
+00:12:05.440 --> 00:12:08.239
+all sorts of other things like that. So it can take a long
+
+00:12:08.240 --> 00:12:12.159
+time. It can be quite slow. But if you have the ability to move
+
+00:12:12.160 --> 00:12:16.559
+objects, you can just relocate an entire list and lay it out
+
+00:12:16.560 --> 00:12:19.168
+in an array right next to each other inside memory.
+
+00:12:19.169 --> 00:12:22.479
+So iterating over it is just as fast as iterating over an array.
+
+00:12:22.480 --> 00:12:25.421
+But you can only do that if you have movable objects.
+
+00:12:25.422 --> 00:12:28.399
+I'll point out here too, that with conservative stack scanning,
+
+00:12:28.400 --> 00:12:31.599
+it's not that all objects are immovable. It's only ones that
+
+00:12:31.600 --> 00:12:35.519
+are pointed to from the stack or from the registers that have
+
+00:12:35.520 --> 00:12:38.828
+to become immovable.
+
+NOTE How Rust makes precise GC easy
+
+00:12:38.829 --> 00:12:41.039
+Let's look at how Rust makes precise
+
+00:12:41.040 --> 00:12:44.439
+garbage collection easy. Here I have some Rust code to
+
+00:12:44.440 --> 00:12:47.279
+show kind of how the lifetime system works and what we call
+
+00:12:47.280 --> 00:12:49.879
+XOR mutability, where we can only have one mutable
+
+00:12:49.880 --> 00:12:52.879
+reference or multiple immutable references to the same
+
+00:12:52.880 --> 00:12:56.199
+thing. Here we declare a vector, we take a reference to the
+
+00:12:56.200 --> 00:12:59.199
+first element of the vector, and then we mutate the vector.
+
+00:12:59.200 --> 00:13:02.239
+Now this could potentially resize the vector and move it to a
+
+00:13:02.240 --> 00:13:04.919
+different location in memory, so that reference is no
+
+00:13:04.920 --> 00:13:07.759
+longer valid. The nice thing is, Rust catches this for
+
+00:13:07.760 --> 00:13:10.479
+us. It says, hey, this is no longer valid. This reference
+
+00:13:10.480 --> 00:13:14.519
+can't survive past when you mutated it. Okay? That's
+
+00:13:14.520 --> 00:13:17.559
+exactly what we want for a garbage collector. You can see
+
+00:13:17.560 --> 00:13:19.879
+here, we take this in a garbage collection context, we
+
+00:13:19.880 --> 00:13:23.359
+create a new context object, we add an object, we call
+
+00:13:23.360 --> 00:13:26.759
+garbage_collect, then we try and access that object. It's no
+
+00:13:26.760 --> 00:13:29.199
+longer accessible, and Rust will prevent us from trying to
+
+00:13:29.200 --> 00:13:34.839
+access that variable. So, how do we solve this? We have a
+
+00:13:34.840 --> 00:13:39.759
+root macro. We declared this root macro, it lets us take the
+
+00:13:39.760 --> 00:13:41.759
+object and let it live past garbage collection, and
+
+00:13:41.760 --> 00:13:45.319
+everything works out. The nice thing is, this root macro
+
+00:13:45.320 --> 00:13:47.799
+will get dropped when it's out of scope, so we don't have to
+
+00:13:47.800 --> 00:13:51.519
+worry about the un-gc-protect step of this. Statically,
+
+00:13:51.520 --> 00:13:55.799
+Rust will verify and tell us any object that needs to be
+
+00:13:55.800 --> 00:13:58.279
+rooted. If we try and access it, it'll tell us it's invalid.
+
+00:13:58.280 --> 00:14:00.999
+We have this root macro and then we can access it. So in
+
+00:14:01.000 --> 00:14:03.759
+that way, we have safe, precise garbage collection without
+
+00:14:03.760 --> 00:14:07.479
+any chance of introducing undefined behavior, which is
+
+00:14:07.480 --> 00:14:09.999
+really, really powerful. It's really easy because the
+
+00:14:10.000 --> 00:14:13.226
+type system will catch it all for us.
+
+NOTE Other Rust niceties: proc macro
+
+00:14:13.227 --> 00:14:15.147
+There's some other Rust niceties I want to kind of
+
+00:14:15.148 --> 00:14:16.799
+talk through that are useful, but
+
+00:14:16.800 --> 00:14:21.079
+are not, you know, star features. One is proc macros. You
+
+00:14:21.080 --> 00:14:23.679
+can see up on the top, you can see how you declare a function
+
+00:14:23.680 --> 00:14:27.359
+inside the C core. All right. You have to use the macro. You
+
+00:14:27.360 --> 00:14:29.141
+have to put the list type, the function type,
+
+00:14:29.142 --> 00:14:30.963
+the struct type, the different types of arguments
+
+00:14:30.964 --> 00:14:33.225
+or different number of arguments, the doc string,
+
+00:14:33.226 --> 00:14:36.023
+and then you can put your argument listing down inside there.
+
+00:14:36.024 --> 00:14:37.984
+On the Rust side, we just write this like we would
+
+00:14:37.985 --> 00:14:40.044
+any other Rust function. And then we put
+
+00:14:40.045 --> 00:14:41.285
+the defun proc macro on there
+
+00:14:41.286 --> 00:14:44.186
+and it takes care of everything for us behind the scenes.
+
+00:14:44.187 --> 00:14:46.407
+A couple of cool additional things we can do with this
+
+00:14:46.408 --> 00:14:48.727
+is that we don't have to make everything just an object.
+
+00:14:48.728 --> 00:14:49.759
+We can actually make things
+
+00:14:49.760 --> 00:14:54.239
+more specific types. Here we have symbols. As well as
+
+00:14:54.240 --> 00:14:56.279
+you can see subfeature, it's an optional parameter, and we
+
+00:14:56.280 --> 00:15:00.919
+just make it an option inside Rust and it automatically make
+
+00:15:00.920 --> 00:15:03.599
+it an optional inside Elisp.
+
+00:15:03.600 --> 00:15:05.181
+This makes them really easy to write.
+
+00:15:05.182 --> 00:15:06.439
+I can't take credit for this is because this is
+
+00:15:06.440 --> 00:15:09.119
+something that I saw inside Remacs and I stole from them, but
+
+00:15:09.120 --> 00:15:11.439
+it makes the functions really easy to call from each other
+
+00:15:11.440 --> 00:15:14.559
+and really easy to write as well.
+
+NOTE sum types
+
+00:15:14.560 --> 00:15:18.523
+Another thing that's really nice is sum types.
+
+00:15:18.524 --> 00:15:21.039
+In the C core, if I wanted to get a
+
+00:15:21.040 --> 00:15:23.759
+string out of an object, I would first need to check that it's
+
+00:15:23.760 --> 00:15:28.319
+a string and then dereference it as a string. But if it's not a
+
+00:15:28.320 --> 00:15:30.679
+string, I may introduce undefined behavior. So in
+
+00:15:30.680 --> 00:15:32.799
+complicated code, I have to make sure that I have always
+
+00:15:32.800 --> 00:15:34.959
+checked what type it is before I try and dereference that
+
+00:15:34.960 --> 00:15:37.879
+type. We don't have to worry about any of that inside Rust
+
+00:15:37.880 --> 00:15:41.319
+because we can untag a value and we can use their some types,
+
+00:15:41.320 --> 00:15:44.399
+basically create an enum and we can match on what the
+
+00:15:44.400 --> 00:15:47.639
+different values can be. Then we only get out the types
+
+00:15:47.640 --> 00:15:50.359
+that are viable or are actually there. So we never
+
+00:15:50.360 --> 00:15:52.159
+accidentally get something out of an object that we didn't
+
+00:15:52.160 --> 00:15:54.239
+mean to, or dereference it as something that doesn't
+
+00:15:54.240 --> 00:15:56.879
+really exist. We can just match on it and we can get out the
+
+00:15:56.880 --> 00:16:01.040
+values that we need, which is really, really powerful.
+
+NOTE Regex
+
+00:16:01.041 --> 00:16:03.639
+So there's some other Rust niceties as well working with here.
+
+00:16:03.640 --> 00:16:07.799
+One is the regex engine inside Rust is really fast, high
+
+00:16:07.800 --> 00:16:10.959
+performance. We are using that for the Elixir regex
+
+00:16:10.960 --> 00:16:14.879
+engine to give it high performance and worst-case
+
+00:16:14.880 --> 00:16:16.051
+guarantees.
+
+NOTE Parsers
+
+00:16:16.052 --> 00:16:18.599
+The other is that Rust has a lot of really good
+
+00:16:18.600 --> 00:16:21.559
+parsers for things like JSON that are no copy parsers that
+
+00:16:21.560 --> 00:16:24.719
+are high performance. We can use those inside Rune as
+
+00:16:24.720 --> 00:16:27.209
+well.
+
+NOTE Other changes: GUI first, terminal second
+
+00:16:27.210 --> 00:16:29.439
+There's a handful of other changes we're working on
+
+00:16:29.440 --> 00:16:33.119
+that are not Rust-specific, but we'd like to see. The first is
+
+00:16:33.120 --> 00:16:36.759
+being GUI first, terminal second. This means two things.
+
+00:16:36.760 --> 00:16:40.039
+First is that we have all of our key bindings. Right now
+
+00:16:40.040 --> 00:16:43.279
+inside Emacs, C-i and TAB are bound to the same key
+
+00:16:43.280 --> 00:16:45.039
+binding by default, because that's how it works inside the
+
+00:16:45.040 --> 00:16:48.119
+terminal. In the GUI, you shouldn't have that limitation.
+
+00:16:48.120 --> 00:16:52.559
+The second is that the GUI should not block when Lisp is
+
+00:16:52.560 --> 00:16:55.199
+blocked. It should be independent of that. Your GUI can
+
+00:16:55.200 --> 00:16:58.918
+still continue to operate when Lisp is running.
+
+NOTE Off-screen cursor
+
+00:16:58.919 --> 00:17:01.279
+The other is the ability to have an off-screen cursor
+
+00:17:01.280 --> 00:17:02.699
+so that you can be typing on something,
+
+00:17:02.700 --> 00:17:04.319
+you can scroll up and down and the point
+
+00:17:04.320 --> 00:17:06.719
+doesn't have to follow you where you lose your place where
+
+00:17:06.720 --> 00:17:09.399
+you were before. You don't have to intentionally set a mark.
+
+00:17:09.400 --> 00:17:11.199
+You can just scroll and then start typing and it'll go back up
+
+00:17:11.200 --> 00:17:13.879
+to where it was before, like it works in most applications.
+
+00:17:13.880 --> 00:17:16.304
+And this can be optional.
+
+NOTE Image flow
+
+00:17:16.305 --> 00:17:18.079
+The other is image flow. We want it
+
+00:17:18.080 --> 00:17:20.879
+so that you can easily flow images and you can have large
+
+00:17:20.880 --> 00:17:23.159
+images and scroll past them without jumping and you can flow
+
+00:17:23.160 --> 00:17:24.439
+text around images.
+
+NOTE Testing
+
+00:17:24.440 --> 00:17:29.799
+How are we testing this project? Because there's a lot of
+
+00:17:29.800 --> 00:17:33.159
+things that you could get wrong here. One thing we're doing
+
+00:17:33.160 --> 00:17:38.039
+is we're using ERT. Emacs ships with over 7,000 built-in
+
+00:17:38.040 --> 00:17:42.879
+tests--Elisp tests. We are using this test suite to test
+
+00:17:42.880 --> 00:17:45.079
+our project as well. We can kind of use this as a dashboard
+
+00:17:45.080 --> 00:17:47.679
+of saying how close are we to getting to parity with GNU
+
+00:17:47.680 --> 00:17:52.319
+Emacs. The other thing that we have is a tool called elprop,
+
+00:17:52.320 --> 00:17:55.279
+which is an external utility that basically tests for
+
+00:17:55.280 --> 00:17:58.719
+correctness. Because really, the correctness of Rune is
+
+00:17:58.720 --> 00:18:00.999
+whatever Emacs does, because there's no official spec on
+
+00:18:01.000 --> 00:18:04.079
+how things should behave. To do this, we can go look at
+
+00:18:04.080 --> 00:18:07.159
+the Rust function signature. We know what the arguments
+
+00:18:07.160 --> 00:18:09.319
+are, we know how many they are, and we know what types they
+
+00:18:09.320 --> 00:18:11.679
+should be. Given that information, we can generate a
+
+00:18:11.680 --> 00:18:15.279
+whole bunch of random functions feeding those types in. And
+
+00:18:15.280 --> 00:18:18.959
+then we send a copy over to Emacs, we send a copy over to Rune.
+
+00:18:18.960 --> 00:18:21.679
+They each evaluate it and they return the result and we make
+
+00:18:21.680 --> 00:18:23.519
+sure the results are the same. Then you do that for
+
+00:18:23.520 --> 00:18:26.199
+thousands of different implementations of the function.
+
+00:18:26.200 --> 00:18:29.039
+And it helps us find corner cases really easy without having
+
+00:18:29.040 --> 00:18:31.639
+to handwrite a whole bunch of different cases to test things
+
+00:18:31.640 --> 00:18:36.344
+and say, where are these two functions different?
+
+NOTE Status
+
+00:18:36.345 --> 00:18:39.359
+So the current status: we already have a multi-threaded Elixir
+
+00:18:39.360 --> 00:18:42.999
+interpreter and bytecode engine inside there. There's no
+
+00:18:43.000 --> 00:18:45.679
+actual text editor in there yet, but the primitives are
+
+00:18:45.680 --> 00:18:48.679
+there. Like you can insert text, move point around,
+
+00:18:48.680 --> 00:18:52.039
+delete text, do different things like that. But we don't
+
+00:18:52.040 --> 00:18:53.679
+have a GUI hooked up to different key bindings to actually
+
+00:18:53.680 --> 00:18:58.159
+type on. There's just a REPL to operate in. We have about
+
+00:18:58.160 --> 00:19:01.279
+250 of the 1500 built-in functions already implemented
+
+00:19:01.280 --> 00:19:04.119
+inside there. There's a lot of low-hanging fruit inside this
+
+00:19:04.120 --> 00:19:07.246
+area to still be implemented.
+
+NOTE Next directions
+
+00:19:07.247 --> 00:19:07.719
+The next directions we're
+
+00:19:07.720 --> 00:19:11.959
+working on is we're optimizing the GC. We want to make it
+
+00:19:11.960 --> 00:19:13.839
+generational. Like I said, right now, it's just a simple
+
+00:19:13.840 --> 00:19:17.359
+semi-spaced copying GC. We want to add a proper GUI. We need
+
+00:19:17.360 --> 00:19:19.599
+to implement text properties, overlays, process and job
+
+00:19:19.600 --> 00:19:22.738
+control, all that goodness right there.
+
+NOTE How to get involved
+
+00:19:22.739 --> 00:19:25.378
+How can you get involved? This is hosted on GitHub.
+
+00:19:25.379 --> 00:19:26.424
+You can come on over.
+
+00:19:26.425 --> 00:19:28.639
+If you have any ideas about how to implement something or
+
+00:19:28.640 --> 00:19:30.639
+something you'd like to see done, go ahead and just open an
+
+00:19:30.640 --> 00:19:32.799
+issue so we can have a discussion about it. We've had lots of
+
+00:19:32.800 --> 00:19:34.599
+interesting discussions with different people coming in
+
+00:19:34.600 --> 00:19:37.639
+to the GitHub repo. If you're interested in contributing,
+
+00:19:37.640 --> 00:19:40.439
+the easiest way is probably to run elprop, pick some
+
+00:19:40.440 --> 00:19:43.279
+function, run elprop on it. I promise it won't take long to
+
+00:19:43.280 --> 00:19:45.639
+find some issues, some discrepancy between Emacs and Rune,
+
+00:19:45.640 --> 00:19:48.959
+and that lets you dive into the Rust code and figure out, and
+
+00:19:48.960 --> 00:19:50.879
+the C code, and figure out what the difference is between the
+
+00:19:50.880 --> 00:19:53.119
+two. or come along and help implement your favorite
+
+00:19:53.120 --> 00:19:55.679
+functionality. This has been a really interesting project
+
+00:19:55.680 --> 00:19:58.359
+so far, and we've had a handful of different contributors on
+
+00:19:58.360 --> 00:20:01.799
+it who just kind of want to learn Rust or get more into
+
+00:20:01.800 --> 00:20:06.000
+systems-level programming. Thank you.