3 files changed, 1656 insertions, 3 deletions
diff --git a/2025/captions/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.vtt b/2025/captions/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.vtt new file mode 100644 index 00000000..46820e94 --- /dev/null +++ b/2025/captions/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.vtt @@ -0,0 +1,1238 @@ +WEBVTT captioned by kana + + +00:00:01.200 --> 00:00:02.803 +Hello! This is Kana! + +00:00:02.903 --> 00:00:04.367 +And today I'll be talking about + +00:00:04.368 --> 00:00:06.067 +<b>J</b>ust-<b>I</b>n-<b>T</b>ime compilation, or JIT, + +00:00:06.068 --> 00:00:07.363 +for Emacs Lisp, + +00:00:07.463 --> 00:00:11.163 +based on my work-in-progress Emacs clone, Juicemacs. + +00:00:11.263 --> 00:00:13.533 +Juicemacs aims to explore a few things + +00:00:13.534 --> 00:00:15.843 +that I've been wondering about for a while. + +00:00:15.943 --> 00:00:18.567 +For example, what if we had better or even + +00:00:18.568 --> 00:00:21.223 +transparent concurrency in ELisp? + +00:00:21.323 --> 00:00:23.243 +Or, can we have a concurrent GUI? + +00:00:23.343 --> 00:00:26.783 +One that does not block, or is blocked by Lisp code? + +00:00:26.883 --> 00:00:31.067 +And finally what can JIT compilation do for ELisp? + +00:00:31.068 --> 00:00:34.083 +Will it provide better performance? + +00:00:34.183 --> 00:00:37.400 +However, a main problem with explorations + +00:00:37.401 --> 00:00:38.623 +in Emacs clones is that, + +00:00:38.723 --> 00:00:40.863 +Emacs is a whole universe. + +00:00:40.963 --> 00:00:43.600 +And that means, to make these explorations + +00:00:43.601 --> 00:00:45.383 +meaningful for Emacs users, + +00:00:45.483 --> 00:00:47.967 +we need to cover a lot of Emacs features, + +00:00:47.968 --> 00:00:50.543 +before we can ever begin. 
+ +00:00:50.643 --> 00:00:53.923 +For example, one of the features of Emacs is that, + +00:00:54.023 --> 00:00:56.003 +it supports a lot of encodings. + +00:00:56.103 --> 00:00:59.267 +Let's look at this string: it can be encoded + +00:00:59.268 --> 00:01:03.643 +in both Unicode and Shift-JIS, a Japanese encoding system. + +00:01:03.743 --> 00:01:07.067 +But currently, Unicode does not have + +00:01:07.068 --> 00:01:09.803 +an official mapping for this "ki" (﨑) character. + +00:01:09.903 --> 00:01:12.767 +So when we map from Shift-JIS to Unicode, + +00:01:12.768 --> 00:01:14.423 +in most programming languages, + +00:01:14.523 --> 00:01:16.533 +you end up with something like this: + +00:01:16.534 --> 00:01:19.143 +it's a replacement character. + +00:01:19.243 --> 00:01:22.067 +But in Emacs, it actually extends + +00:01:22.068 --> 00:01:23.883 +the Unicode range by threefold, + +00:01:23.983 --> 00:01:26.833 +and uses the extra range to losslessly + +00:01:26.834 --> 00:01:29.483 +support characters like this. + +00:01:29.583 --> 00:01:31.923 +So if you want to support this feature, + +00:01:32.023 --> 00:01:34.033 +that basically rules out all string + +00:01:34.034 --> 00:01:37.243 +libraries with Unicode assumptions. + +00:01:37.843 --> 00:01:40.067 +For another, you need to support + +00:01:40.068 --> 00:01:41.883 +the regular expressions in Emacs, + +00:01:41.983 --> 00:01:45.023 +which are, really irregular. + +00:01:45.123 --> 00:01:46.900 +For example, it supports asserting + +00:01:46.901 --> 00:01:49.403 +about the user cursor position. + +00:01:49.503 --> 00:01:52.033 +And it also uses some character tables, + +00:01:52.034 --> 00:01:53.883 +that can be modified from Lisp code, + +00:01:53.983 --> 00:01:56.163 +to determine case mappings. + +00:01:56.263 --> 00:01:59.567 +And all that makes it really hard, or even + +00:01:59.568 --> 00:02:05.123 +impossible to use any existing regexp libraries. 
+ +00:02:05.223 --> 00:02:07.883 +Also, you need a functional garbage collector. + +00:02:07.983 --> 00:02:09.867 +You need threading primitives, because + +00:02:09.868 --> 00:02:12.323 +Emacs has already had some threading support. + +00:02:12.423 --> 00:02:14.533 +And you might want the performance of your clone + +00:02:14.534 --> 00:02:18.963 +to match Emacs, even with its native compilation enabled. + +00:02:19.063 --> 00:02:21.500 +Not to mention you also need a GUI for an editor. + +00:02:21.501 --> 00:02:23.543 +And so on. + +00:02:23.643 --> 00:02:25.633 +For Juicemacs, building on Java and + +00:02:25.634 --> 00:02:27.563 +a compiler framework called Truffle, + +00:02:27.663 --> 00:02:30.503 +helps in getting better performance; + +00:02:30.603 --> 00:02:32.933 +and by choosing a language with a good GC, + +00:02:32.934 --> 00:02:38.063 +we can actually focus more on the challenges above. + +00:02:38.163 --> 00:02:41.433 +Currently, Juicemacs has implemented three out of, + +00:02:41.434 --> 00:02:43.983 +at least four of the interpreters in Emacs. + +00:02:44.083 --> 00:02:46.363 +One for lisp code, one for bytecode, + +00:02:46.463 --> 00:02:48.567 +and one for regular expressions, + +00:02:48.568 --> 00:02:50.903 +all of them JIT-capable. + +00:02:51.003 --> 00:02:53.667 +Other than these, Emacs also has around + +00:02:53.668 --> 00:02:56.083 +two thousand built-in functions in C code. + +00:02:56.183 --> 00:02:57.333 +And Juicemacs has around + +00:02:57.334 --> 00:02:59.763 +four hundred of them implemented. + +00:02:59.863 --> 00:03:03.603 +It's not that many, but it is surprisingly enough + +00:03:03.703 --> 00:03:05.200 +to bootstrap Emacs and run + +00:03:05.201 --> 00:03:08.483 +the portable dumper, or pdump, in short. + +00:03:08.583 --> 00:03:11.243 +Let's have a try. + +00:03:11.343 --> 00:03:11.703 + + +00:03:11.803 --> 00:03:14.923 +So this is the binary produced by Java native image. 
+ +00:03:15.023 --> 00:03:17.167 +And it's loading all the files + +00:03:17.168 --> 00:03:18.763 +needed for bootstrapping. + +00:03:18.863 --> 00:03:22.233 +Then it dumps the memory to a file to + +00:03:22.234 --> 00:03:24.923 +be loaded later, giving us fast startup. + +00:03:25.023 --> 00:03:28.723 +As we can see here, it throws some frame errors + +00:03:28.823 --> 00:03:31.400 +because Juicemacs doesn't have an editor UI + +00:03:31.401 --> 00:03:33.283 +or functional frames yet. + +00:03:33.383 --> 00:03:35.367 +But otherwise, it can already run + +00:03:35.368 --> 00:03:36.643 +quite some lisp code. + +00:03:36.743 --> 00:03:40.400 +For example, this code uses the benchmark library + +00:03:40.401 --> 00:03:44.403 +to measure the performance of this Fibonacci function. + +00:03:44.503 --> 00:03:47.067 +And we can see here, the JIT engine is + +00:03:47.068 --> 00:03:51.163 +already kicking in and makes the execution faster. + +00:03:51.263 --> 00:03:53.483 +In addition to that, with a bit of workaround, + +00:03:53.583 --> 00:03:56.467 +Juicemacs can also run some of the ERT, + +00:03:56.468 --> 00:04:01.043 +or, <b>E</b>macs <b>R</b>egression <b>T</b>est suite, that comes with Emacs. + +00:04:01.143 --> 00:04:05.823 +So... Yes, there are a bunch of test failures, + +00:04:05.923 --> 00:04:07.933 +which means we are not that compatible + +00:04:07.934 --> 00:04:09.523 +with Emacs and need more work. + +00:04:09.623 --> 00:04:12.803 +But the whole testing procedure runs fine, + +00:04:12.903 --> 00:04:14.767 +and it has proper stack traces, + +00:04:14.768 --> 00:04:17.803 +which is quite useful for debugging Juicemacs. + +00:04:17.903 --> 00:04:21.033 +So with that, a rather functional JIT runtime, + +00:04:21.034 --> 00:04:25.983 +let's now try to look into today's topic, JIT compilation for ELisp. 
+ +00:04:26.083 --> 00:04:28.533 +So, you probably know that Emacs has supported + +00:04:28.534 --> 00:04:32.083 +native-compilation, or nativecomp in short, for some time now. + +00:04:32.183 --> 00:04:35.033 +It mainly uses GCC to compile Lisp code + +00:04:35.034 --> 00:04:37.363 +into native code, ahead of time. + +00:04:37.463 --> 00:04:41.433 +And during runtime, Emacs loads those compiled files, + +00:04:41.434 --> 00:04:44.523 +and gets the performance of native code. + +00:04:44.623 --> 00:04:47.643 +However, for example, for installed packages, + +00:04:47.743 --> 00:04:49.059 +we might want to compile them when we + +00:04:49.060 --> 00:04:51.823 +actually use them instead of ahead of time. + +00:04:51.923 --> 00:04:53.733 +And Emacs supports this through + +00:04:53.734 --> 00:04:55.683 +this <i>native-comp-jit-compilation</i> flag. + +00:04:55.783 --> 00:04:59.767 +What it does is, during runtime, Emacs sends + +00:04:59.768 --> 00:05:03.203 +loaded files to external Emacs worker processes, + +00:05:03.303 --> 00:05:06.903 +which will then compile those files asynchronously. + +00:05:07.003 --> 00:05:09.043 +And when the compilation is done, + +00:05:09.143 --> 00:05:11.967 +the current Emacs session will load the compiled code back + +00:05:11.968 --> 00:05:16.323 +and improves its performance, on the fly. + +00:05:16.423 --> 00:05:18.643 +When you look at this procedure, however, it is, + +00:05:18.743 --> 00:05:21.563 +ahead-of-time compilation, done at runtime. + +00:05:21.663 --> 00:05:25.123 +And it is what current Emacs calls JIT compilation. + +00:05:25.223 --> 00:05:27.867 +But if you look at some other JIT engines, + +00:05:27.868 --> 00:05:31.803 +you'll see much more complex architectures. 
+ +00:05:31.903 --> 00:05:34.233 +So, take LuaJIT for an example, + +00:05:34.234 --> 00:05:36.163 +in addition to this red line here, + +00:05:36.263 --> 00:05:38.767 +which leads us from an interpreted state + +00:05:38.768 --> 00:05:40.643 +to a compiled native state, + +00:05:40.743 --> 00:05:42.163 +which is also what Emacs does, + +00:05:42.263 --> 00:05:44.333 +LuaJIT also supports going from + +00:05:44.334 --> 00:05:47.523 +a compiled state back to its interpreter. + +00:05:47.623 --> 00:05:51.483 +And this process is called "deoptimization". + +00:05:51.583 --> 00:05:55.300 +In contrast to its name, deoptimization here actually + +00:05:55.301 --> 00:05:58.563 +enables a huge category of JIT optimizations. + +00:05:58.663 --> 00:06:00.163 +They are called speculation. + +00:06:01.463 --> 00:06:04.600 +Basically, with speculation, the compiler + +00:06:04.601 --> 00:06:07.683 +can use runtime statistics to speculate, + +00:06:07.783 --> 00:06:11.443 +to make bolder assumptions in the compiled code. + +00:06:11.543 --> 00:06:13.983 +And when the assumptions are invalidated, + +00:06:14.083 --> 00:06:18.323 +the runtime deoptimizes the code, updates statistics, + +00:06:18.423 --> 00:06:21.133 +and then recompiles the code based on new assumptions, + +00:06:21.134 --> 00:06:24.443 +and that will make the code more performant. + +00:06:24.543 --> 00:06:26.763 +Let's look at an example. + +00:06:28.463 --> 00:06:30.967 +So, here is a really simple function, + +00:06:30.968 --> 00:06:33.083 +that adds one to the input number. + +00:06:33.183 --> 00:06:36.167 +But in Emacs, it is not that simple, + +00:06:36.168 --> 00:06:38.203 +because Emacs has three categories of numbers, + +00:06:38.303 --> 00:06:42.700 +that is, fix numbers, or machine-word-sized integers, + +00:06:42.701 --> 00:06:45.603 +floating numbers, and big integers. + +00:06:45.703 --> 00:06:47.600 +And when we compile this, we need + +00:06:47.601 --> 00:06:49.363 +to handle all three cases. 
+ +00:06:49.463 --> 00:06:52.600 +And if we analyze the code produced by Emacs, + +00:06:52.601 --> 00:06:54.683 +as is shown by this gray graph here, + +00:06:54.783 --> 00:06:58.083 +we can see that it has, two paths: + +00:06:58.183 --> 00:07:01.403 +One fast path, that does fast fix number addition; + +00:07:01.503 --> 00:07:03.967 +and one for slow paths, that calls out + +00:07:03.968 --> 00:07:06.523 +to an external plus-one function, + +00:07:06.623 --> 00:07:09.683 +to handle floating number and big integers. + +00:07:09.783 --> 00:07:13.167 +Now, if we pass integers into this function, + +00:07:13.168 --> 00:07:16.283 +it's pretty fast because it's on the fast path. + +00:07:16.383 --> 00:07:19.767 +However, if we pass in a floating number, + +00:07:19.768 --> 00:07:21.843 +then it has to go through the slow path, + +00:07:21.943 --> 00:07:25.563 +doing an extra function call, which is slow. + +00:07:25.663 --> 00:07:28.733 +What speculation might help here is that, + +00:07:28.734 --> 00:07:31.443 +it can have flexible fast paths. + +00:07:31.543 --> 00:07:34.563 +When we pass a floating number into this function, + +00:07:34.663 --> 00:07:37.400 +which currently has only fixnumbers on the fast path, + +00:07:37.401 --> 00:07:40.723 +it also has to go through the slow path. + +00:07:40.823 --> 00:07:44.567 +But the difference is that, a speculative runtime can + +00:07:44.568 --> 00:07:47.763 +deoptimize and recompile the code to adapt to this. + +00:07:47.863 --> 00:07:50.367 +And when it recompiles, it might add + +00:07:50.368 --> 00:07:52.643 +floating number onto the fast path, + +00:07:52.743 --> 00:07:55.003 +and now floating number operations are also fast. + +00:07:55.103 --> 00:07:58.567 +And this kind of speculation is why + +00:07:58.568 --> 00:08:03.603 +speculative runtime can be really fast. + +00:08:03.703 --> 00:08:05.723 +Let's take a look at some benchmarks. 
+ +00:08:05.823 --> 00:08:09.423 +They're obtained with the <i>elisp-benchmarks</i> library on ELPA. + +00:08:09.523 --> 00:08:12.600 +The blue line here is for nativecomp, + +00:08:12.601 --> 00:08:16.043 +and these blue areas mean that nativecomp is slower. + +00:08:16.143 --> 00:08:19.133 +And, likewise, green areas mean that + +00:08:19.134 --> 00:08:20.523 +Juicemacs is slower. + +00:08:20.623 --> 00:08:22.867 +At a glance, the two (or four) + +00:08:22.868 --> 00:08:25.143 +actually seem somewhat on par, to me. + +00:08:25.243 --> 00:08:30.383 +But, let's take a closer look at some of them. + +00:08:30.483 --> 00:08:32.667 +So, the first few benchmarks are the classic, + +00:08:32.668 --> 00:08:33.983 +Fibonacci benchmarks. + +00:08:34.083 --> 00:08:36.933 +We know that, the series is formed by + +00:08:36.934 --> 00:08:39.203 +adding the previous two numbers in the series. + +00:08:39.303 --> 00:08:41.700 +And looking at this expression here, + +00:08:41.701 --> 00:08:44.043 +Fibonacci benchmarks are quite intensive + +00:08:44.143 --> 00:08:46.800 +in number additions, subtractions, + +00:08:46.801 --> 00:08:49.103 +and function calls, if you use recursions. + +00:08:49.203 --> 00:08:51.000 +And it is exactly why + +00:08:51.001 --> 00:08:54.323 +Fibonacci series is a good benchmark. + +00:08:54.423 --> 00:08:57.243 +And looking at the results here... wow. + +00:08:57.343 --> 00:08:59.843 +Emacs nativecomp executes instantaneously. + +00:08:59.943 --> 00:09:04.523 +It's a total defeat for Juicemacs, seemingly. + +00:09:04.623 --> 00:09:08.043 +Now, if you're into benchmarks, you know something is wrong here: + +00:09:08.143 --> 00:09:11.683 +we are comparing different things. + +00:09:11.783 --> 00:09:14.200 +So let's look under the hood + +00:09:14.201 --> 00:09:15.483 +and disassemble the function + +00:09:15.583 --> 00:09:17.567 +with this convenient Emacs command + +00:09:17.568 --> 00:09:19.063 +called <i>disassemble</i>... 
+ +00:09:19.163 --> 00:09:23.043 +And these two lines of code are what we got. + +00:09:23.143 --> 00:09:24.700 +So, we already can see + +00:09:24.701 --> 00:09:26.123 +what's going on here: + +00:09:26.223 --> 00:09:29.963 +GCC sees Fibonacci is a pure function, + +00:09:30.063 --> 00:09:31.867 +because it returns the same value + +00:09:31.868 --> 00:09:33.243 +for the same arguments, + +00:09:33.343 --> 00:09:35.700 +so GCC chooses to do the computation + +00:09:35.701 --> 00:09:36.723 +at compile time + +00:09:36.823 --> 00:09:39.133 +and inserts the final number directly + +00:09:39.134 --> 00:09:40.323 +into the compiled code. + +00:09:41.823 --> 00:09:43.603 +It is actually great! + +00:09:43.703 --> 00:09:45.400 +Because it shows that nativecomp + +00:09:45.401 --> 00:09:47.283 +knows about pure functions, + +00:09:47.383 --> 00:09:48.700 +and can do all kinds of things + +00:09:48.701 --> 00:09:51.203 +like removing or constant-folding them. + +00:09:51.303 --> 00:09:54.403 +And Juicemacs just does not do that. + +00:09:54.503 --> 00:09:57.367 +However, we are also concerned about + +00:09:57.368 --> 00:09:59.003 +the things we mentioned earlier: + +00:09:59.103 --> 00:10:00.900 +the performance of number additions, + +00:10:00.901 --> 00:10:02.983 +or function calls. + +00:10:03.083 --> 00:10:05.633 +So, in order to let the benchmarks + +00:10:05.634 --> 00:10:06.863 +show some extra things, + +00:10:06.963 --> 00:10:08.367 +we need to modify it a bit... + +00:10:08.368 --> 00:10:11.323 +by simply making things non-constant. + +00:10:11.423 --> 00:10:15.203 +With that, Emacs gets much slower now. + +00:10:15.303 --> 00:10:17.133 +And again, let's look at what's + +00:10:17.134 --> 00:10:21.083 +happening behind these numbers. + +00:10:21.183 --> 00:10:23.500 +Similarly, with the <i>disassemble</i> command, + +00:10:23.501 --> 00:10:25.643 +we can look into the assembly. 
+ +00:10:25.743 --> 00:10:28.019 +And again, we can already see + +00:10:28.020 --> 00:10:29.303 +what's happening here. + +00:10:29.403 --> 00:10:32.083 +So, Juicemacs, due to its speculative nature, + +00:10:32.183 --> 00:10:35.443 +supports fast paths for all three kinds of numbers. + +00:10:35.543 --> 00:10:39.233 +However, currently, Emacs nativecomp + +00:10:39.234 --> 00:10:41.243 +does not have any fast path + +00:10:41.343 --> 00:10:43.433 +for the operations here like additions, + +00:10:43.434 --> 00:10:45.803 +or subtractions, or comparisons, + +00:10:45.903 --> 00:10:48.067 +which is exactly what + +00:10:48.068 --> 00:10:50.963 +Fibonacci benchmarks are measuring. + +00:10:51.063 --> 00:10:53.800 +Emacs, at this time, has to call some generic, + +00:10:53.801 --> 00:10:57.963 +external functions for them, and this is slow. + +00:11:00.063 --> 00:11:03.203 +But is nativecomp really that slow? + +00:11:03.303 --> 00:11:04.967 +So, I also ran the same benchmark + +00:11:04.968 --> 00:11:07.083 +in Common Lisp, with SBCL. + +00:11:07.183 --> 00:11:09.000 +And nativecomp is already fast, + +00:11:09.001 --> 00:11:11.003 +compared to untyped SBCL. + +00:11:11.103 --> 00:11:15.500 +It's because SBCL also emits call instructions + +00:11:15.501 --> 00:11:18.483 +when it comes to no type info. + +00:11:18.583 --> 00:11:21.700 +However, once we declare the types, + +00:11:21.701 --> 00:11:25.283 +SBCL is able to compile a fast path for fix numbers, + +00:11:25.383 --> 00:11:27.467 +which makes its performance on par + +00:11:27.468 --> 00:11:30.683 +with speculative JIT engines (that is, Juicemacs), + +00:11:30.783 --> 00:11:34.763 +because, both of us are now on fast paths. 
+ +00:11:36.063 --> 00:11:38.400 +Additionally, if we are bold enough + +00:11:38.401 --> 00:11:41.203 +to pass this safety zero flag to SBCL, + +00:11:41.303 --> 00:11:43.700 +it will remove all the slow paths + +00:11:43.701 --> 00:11:44.963 +and type checks, + +00:11:45.063 --> 00:11:46.367 +and its performance is close + +00:11:46.368 --> 00:11:48.643 +to what you get with C. + +00:11:48.743 --> 00:11:51.299 +Well, probably we don't want safety zero + +00:11:51.300 --> 00:11:52.063 +most of the time. + +00:11:52.163 --> 00:11:55.133 +But even then, if nativecomp were to + +00:11:55.134 --> 00:11:57.763 +get fast paths for more constructs, + +00:11:57.863 --> 00:11:59.867 +there certainly is quite + +00:11:59.868 --> 00:12:03.563 +some room for performance improvement. + +00:12:04.063 --> 00:12:06.803 +Let's look at some more benchmarks. + +00:12:06.903 --> 00:12:08.933 +For example, for this inclist, + +00:12:08.934 --> 00:12:10.923 +or increment-list, benchmark, + +00:12:11.023 --> 00:12:14.333 +Juicemacs is really slow here. Partly, + +00:12:14.334 --> 00:12:17.603 +it comes from the cost of Java boxing integers. + +00:12:17.703 --> 00:12:20.300 +On the other hand, for Emacs nativecomp, + +00:12:20.301 --> 00:12:22.043 +for this particular benchmark, + +00:12:22.143 --> 00:12:23.667 +it actually has fast paths + +00:12:23.668 --> 00:12:25.523 +for all of the operations. + +00:12:25.623 --> 00:12:27.723 +And that's why it can be so fast, + +00:12:27.823 --> 00:12:30.667 +and that also proves the nativecomp + +00:12:30.668 --> 00:12:33.843 +has a lot of potential for improvement. + +00:12:33.943 --> 00:12:35.833 +There is another benchmark here + +00:12:35.834 --> 00:12:37.963 +that uses advices. 
+ +00:12:38.063 --> 00:12:40.500 +So Emacs Lisp supports using + +00:12:40.501 --> 00:12:42.203 +advices to override functions + +00:12:42.303 --> 00:12:44.833 +by wrapping the original function, and an advice + +00:12:44.834 --> 00:12:47.443 +function, two of them, inside a glue function. + +00:12:47.543 --> 00:12:51.467 +And in this benchmark, we advise the Fibonacci function + +00:12:51.468 --> 00:12:54.523 +to cache the first ten entries to speed up computation, + +00:12:54.623 --> 00:13:00.003 +as can be seen in the speed-up in the Juicemacs results. + +00:13:00.103 --> 00:13:02.900 +However, it seems that nativecomp does not yet + +00:13:02.901 --> 00:13:08.523 +compile glue functions, and that makes advices slower. + +00:13:08.623 --> 00:13:12.043 +With these benchmarks, let's discuss this big question: + +00:13:12.143 --> 00:13:16.563 +Should GNU Emacs adopt speculative JIT compilation? + +00:13:16.663 --> 00:13:18.967 +Well, the hidden question is actually, + +00:13:18.968 --> 00:13:21.223 +is it worth it? + +00:13:21.323 --> 00:13:24.163 +And, my personal answer is, maybe not. + +00:13:24.263 --> 00:13:28.133 +The first reason is that, slow paths, like, floating numbers, + +00:13:28.134 --> 00:13:31.043 +are actually not that frequent in Emacs. + +00:13:31.143 --> 00:13:34.100 +And optimizing for fast paths like fix numbers + +00:13:34.101 --> 00:13:37.983 +can already get us very good performance. + +00:13:38.083 --> 00:13:40.333 +And the second or main reason is that, + +00:13:40.334 --> 00:13:43.163 +speculative JIT is very hard. + +00:13:43.263 --> 00:13:46.843 +LuaJIT, for example, took a genius to build. + +00:13:46.943 --> 00:13:50.967 +Even with the help of GCC, we need to hand-write + +00:13:50.968 --> 00:13:54.283 +all those fast path or slow path or switching logic. + +00:13:54.383 --> 00:13:58.133 +We need to find a way to deoptimize, which requires + +00:13:58.134 --> 00:14:01.803 +mapping machine registers back to interpreter stack. 
+ +00:14:01.903 --> 00:14:04.067 +And also, speculation needs runtime info, + +00:14:04.068 --> 00:14:07.323 +which also costs us extra memory. + +00:14:07.423 --> 00:14:10.763 +Moreover, as is shown by some benchmarks above, + +00:14:10.863 --> 00:14:13.333 +there are some low-hanging fruits in nativecomp that + +00:14:13.334 --> 00:14:17.343 +might get us better performance with relatively lower effort. + +00:14:17.443 --> 00:14:22.163 +Compared to this, a JIT engine is a huge, huge undertaking. + +00:14:22.263 --> 00:14:26.123 +But, for Juicemacs, the JIT engine comes a lot cheaper, + +00:14:26.223 --> 00:14:29.067 +because, we are cheating by building on + +00:14:29.068 --> 00:14:33.443 +an existing compiler framework called Truffle. + +00:14:33.543 --> 00:14:35.883 +Truffle is a meta-compiler framework, + +00:14:35.983 --> 00:14:37.633 +which means that it lets you write + +00:14:37.634 --> 00:14:40.103 +an interpreter, add required annotations, + +00:14:40.203 --> 00:14:42.500 +and it will automatically turn the + +00:14:42.501 --> 00:14:45.643 +interpreter into a JIT runtime. + +00:14:45.743 --> 00:14:49.083 +So for example, here is a typical bytecode interpreter. + +00:14:49.183 --> 00:14:51.233 +After you add the required annotations, + +00:14:51.234 --> 00:14:52.523 +Truffle will know that, + +00:14:52.623 --> 00:14:55.533 +the bytecode here is constant, and it should + +00:14:55.534 --> 00:14:59.123 +unroll this loop here, to inline all those bytecodes. + +00:14:59.223 --> 00:15:00.467 +And then, when Truffle + +00:15:00.468 --> 00:15:02.243 +compiles the code, it knows that: + +00:15:02.343 --> 00:15:05.233 +the first loop here does: x plus one, + +00:15:05.234 --> 00:15:07.723 +and the second does: return. 
+ +00:15:07.823 --> 00:15:09.533 +And then it will compile all that into, + +00:15:09.534 --> 00:15:11.363 +return x plus 1, + +00:15:11.463 --> 00:15:14.067 +which is exactly what we would expect + +00:15:14.068 --> 00:15:17.683 +when compiling this pseudo code. + +00:15:17.783 --> 00:15:21.083 +Building on that, we can also easily implement speculation, + +00:15:21.183 --> 00:15:24.867 +by using this <i>transferToInterpreterAndInvalidate</i> function + +00:15:24.868 --> 00:15:26.123 +provided by Truffle. + +00:15:26.223 --> 00:15:28.533 +And Truffle will automatically turn that + +00:15:28.534 --> 00:15:30.683 +into deoptimization. + +00:15:30.783 --> 00:15:32.700 +Now, for example, when this add function + +00:15:32.701 --> 00:15:35.723 +is supplied with, two floating numbers. + +00:15:35.823 --> 00:15:38.243 +It will go through the slow path here, + +00:15:38.343 --> 00:15:40.960 +which might lead to a compiled slow path, + +00:15:40.961 --> 00:15:43.203 +or deoptimization. + +00:15:43.303 --> 00:15:45.733 +And going this deoptimization way, + +00:15:45.734 --> 00:15:48.223 +it can then update the runtime stats. + +00:15:48.323 --> 00:15:50.400 +And now, when the code is compiled again, + +00:15:50.401 --> 00:15:51.603 +Truffle will know, + +00:15:51.703 --> 00:15:54.100 +that these compilation stats suggest that, + +00:15:54.101 --> 00:15:55.563 +we have floating numbers. + +00:15:55.663 --> 00:15:58.733 +And this floating point addition branch will + +00:15:58.734 --> 00:16:02.603 +then be incorporated into the fast path. + +00:16:02.703 --> 00:16:06.003 +To put it into Java code... + +00:16:06.103 --> 00:16:08.723 +Most operations are just as simple as this. + +00:16:08.823 --> 00:16:11.033 +And it supports fast paths for integers, + +00:16:11.034 --> 00:16:13.963 +floating numbers, and big integers. 
+ +00:16:14.063 --> 00:16:17.133 +And the simplicity of this not only saves us work, + +00:16:17.134 --> 00:16:22.243 +but also enables Juicemacs to explore more things more rapidly. + +00:16:22.343 --> 00:16:26.483 +And actually, I have done some silly explorations. + +00:16:26.583 --> 00:16:30.203 +For example, I tried to constant-fold more things. + +00:16:30.303 --> 00:16:32.767 +Many of us have an Emacs config that stays + +00:16:32.768 --> 00:16:36.683 +largely unchanged, at least during one Emacs session. + +00:16:36.783 --> 00:16:39.667 +And that means many of the global variables + +00:16:39.668 --> 00:16:42.323 +in ELisp are constant. + +00:16:42.423 --> 00:16:44.600 +And with speculation, we can + +00:16:44.601 --> 00:16:46.683 +speculate about the stable ones, + +00:16:46.783 --> 00:16:49.563 +and try to inline them as constants. + +00:16:49.663 --> 00:16:51.733 +And this might improve performance, + +00:16:51.734 --> 00:16:53.083 +or maybe not? + +00:16:53.183 --> 00:16:55.367 +Because, we will need a full editor + +00:16:55.368 --> 00:16:58.123 +to get real world data. + +00:16:58.223 --> 00:17:01.733 +I also tried changing cons lists to be backed + +00:17:01.734 --> 00:17:05.243 +by some arrays, because, maybe arrays are faster, I guess? + +00:17:05.343 --> 00:17:09.033 +But in the end, <i>setcdr</i> requires some kind of indirection, + +00:17:09.034 --> 00:17:12.883 +and that actually makes the performance worse. + +00:17:12.983 --> 00:17:14.733 +And for regular expressions, + +00:17:14.734 --> 00:17:17.923 +I also tried borrowing techniques from PCRE JIT, + +00:17:18.023 --> 00:17:20.667 +which is quite fast in itself, but it is + +00:17:20.668 --> 00:17:24.163 +unfortunately unsupported by Java Truffle runtime. + +00:17:24.263 --> 00:17:27.333 +So, looking at these, well, + +00:17:27.334 --> 00:17:30.243 +explorations can fail, certainly. 
+ +00:17:30.343 --> 00:17:32.800 +But, with Truffle and Java, these, + +00:17:32.801 --> 00:17:34.883 +for now, are not that hard to implement, + +00:17:34.983 --> 00:17:37.667 +and also very often, they teach us something + +00:17:37.668 --> 00:17:42.363 +in return, whether or not they fail. + +00:17:42.463 --> 00:17:45.333 +Finally, let's talk about some explorations + +00:17:45.334 --> 00:17:47.883 +that we might get into in the future. + +00:17:47.983 --> 00:17:49.683 +For the JIT engine, for example, + +00:17:49.783 --> 00:17:52.633 +currently I'm looking into the implementation of + +00:17:52.634 --> 00:17:56.883 +nativecomp to maybe reuse some of its optimizations. + +00:17:56.983 --> 00:18:01.323 +For the GUI, I'm very very slowly working on one. + +00:18:01.423 --> 00:18:03.733 +If it ever completes, I have one thing + +00:18:03.734 --> 00:18:06.603 +I'm really looking forward to implementing. + +00:18:06.703 --> 00:18:08.900 +That is, inlining widgets, or even + +00:18:08.901 --> 00:18:11.763 +other buffers, directly into a buffer. + +00:18:11.863 --> 00:18:13.967 +Well, it's because, people sometimes complain + +00:18:13.968 --> 00:18:16.003 +about Emacs's GUI capabilities, + +00:18:16.103 --> 00:18:19.767 +But I personally think that supporting inlining, + +00:18:19.768 --> 00:18:23.043 +like a whole buffer inside another buffer as a rectangle, + +00:18:23.143 --> 00:18:26.883 +could get us very far in layout abilities. + +00:18:26.983 --> 00:18:28.567 +And this approach should also + +00:18:28.568 --> 00:18:30.843 +be compatible with terminals. + +00:18:30.943 --> 00:18:32.933 +And I really want to see how this idea + +00:18:32.934 --> 00:18:36.003 +plays out with Juicemacs. + +00:18:36.103 --> 00:18:38.963 +And of course, there's Lisp concurrency. + +00:18:39.063 --> 00:18:42.167 +And currently I'm thinking of a JavaScript-like, + +00:18:42.168 --> 00:18:46.283 +transparent, single-thread model, using Java's virtual threads. 
+ +00:18:46.383 --> 00:18:49.967 +But anyway, if you are interested in JIT compilation, + +00:18:49.968 --> 00:18:51.663 +Truffle, or anything above, + +00:18:51.763 --> 00:18:53.867 +or maybe you have your own ideas, + +00:18:53.868 --> 00:18:56.283 +you are very welcome to reach out! + +00:18:56.383 --> 00:19:00.033 +Juicemacs does need to implement many more built-in functions, + +00:19:00.034 --> 00:19:03.063 +and any help would be very appreciated. + +00:19:03.163 --> 00:19:05.800 +And I promise, it can be a very fun playground + +00:19:05.801 --> 00:19:08.343 +to learn about Emacs and do crazy things. + +00:19:08.443 --> 00:19:10.902 +Thank you! diff --git a/2025/info/juicemacs-after.md b/2025/info/juicemacs-after.md index e6f0ac4d..ca649a57 100644 --- a/2025/info/juicemacs-after.md +++ b/2025/info/juicemacs-after.md @@ -1,7 +1,422 @@ <!-- Automatically generated by emacsconf-publish-after-page --> -Questions or comments? Please e-mail [kana@iroiro.party](mailto:kana@iroiro.party?subject=Comment%20for%20EmacsConf%202023%20juicemacs%3A%20Juicemacs%3A%20exploring%20speculative%20JIT%20compilation%20for%20ELisp%20in%20Java) +<div class="transcript transcript-mainVideo"><a name="juicemacs-mainVideo-transcript"></a><h1>Transcript</h1> + +[[!template text="""Hello! 
This is Kana!""" start="00:00:01.200" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And today I'll be talking about""" start="00:00:02.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""<b>J</b>ust-<b>I</b>n-<b>T</b>ime compilation, or JIT,""" start="00:00:04.368" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""for Emacs Lisp,""" start="00:00:06.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""based on my work-in-progress Emacs clone, Juicemacs.""" start="00:00:07.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Juicemacs aims to explore a few things""" start="00:00:11.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that I've been wondering about for a while.""" start="00:00:13.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For example, what if we had better or even""" start="00:00:15.943" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""transparent concurrency in ELisp?""" start="00:00:18.568" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Or, can we have a concurrent GUI?""" start="00:00:21.323" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""One that does not block, or is blocked by Lisp code?""" start="00:00:23.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And finally what can JIT compilation do for ELisp?""" start="00:00:26.883" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Will it provide better performance?""" start="00:00:31.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, a main problem with explorations""" start="00:00:34.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in Emacs clones is that,""" start="00:00:37.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Emacs is a whole universe.""" start="00:00:38.723" video="mainVideo-juicemacs" 
id="subtitle"]] +[[!template text="""And that means, to make these explorations""" start="00:00:40.963" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""meaningful for Emacs users,""" start="00:00:43.601" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we need to cover a lot of Emacs features,""" start="00:00:45.483" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""before we can ever begin.""" start="00:00:47.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For example, one of the features of Emacs is that,""" start="00:00:50.643" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it supports a lot of encodings.""" start="00:00:54.023" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Let's look at this string: it can be encoded""" start="00:00:56.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in both Unicode and Shift-JIS, a Japanese encoding system.""" start="00:00:59.268" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But currently, Unicode does not have""" start="00:01:03.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""an official mapping for this "ki" (﨑) character.""" start="00:01:07.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So when we map from Shift-JIS to Unicode,""" start="00:01:09.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in most programming languages,""" start="00:01:12.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""you end up with something like this:""" start="00:01:14.523" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it's a replacement character.""" start="00:01:16.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But in Emacs, it actually extends""" start="00:01:19.243" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the Unicode range by threefold,""" 
start="00:01:22.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and uses the extra range to losslessly""" start="00:01:23.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""support characters like this.""" start="00:01:26.834" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So if you want to support this feature,""" start="00:01:29.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that basically rules out all string""" start="00:01:32.023" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""libraries with Unicode assumptions.""" start="00:01:34.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For another, you need to support""" start="00:01:37.843" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the regular expressions in Emacs,""" start="00:01:40.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which are, really irregular.""" start="00:01:41.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For example, it supports asserting""" start="00:01:45.123" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""about the user cursor position.""" start="00:01:46.901" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And it also uses some character tables,""" start="00:01:49.503" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that can be modified from Lisp code,""" start="00:01:52.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to determine case mappings.""" start="00:01:53.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And all that makes it really hard, or even""" start="00:01:56.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""impossible to use any existing regexp libraries.""" start="00:01:59.568" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Also, you need a functional 
garbage collector.""" start="00:02:05.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""You need threading primitives, because""" start="00:02:07.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Emacs has already had some threading support.""" start="00:02:09.868" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And you might want the performance of your clone""" start="00:02:12.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to match Emacs, even with its native compilation enabled.""" start="00:02:14.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Not to mention you also need a GUI for an editor.""" start="00:02:19.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And so on.""" start="00:02:21.501" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For Juicemacs, building on Java and""" start="00:02:23.643" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""a compiler framework called Truffle,""" start="00:02:25.634" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""helps in getting better performance;""" start="00:02:27.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and by choosing a language with a good GC,""" start="00:02:30.603" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we can actually focus more on the challenges above.""" start="00:02:32.934" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Currently, Juicemacs has implemented three out of,""" start="00:02:38.163" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""at least four of the interpreters in Emacs.""" start="00:02:41.434" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""One for lisp code, one for bytecode,""" start="00:02:44.083" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and one for regular expressions,""" 
start="00:02:46.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""all of them JIT-capable.""" start="00:02:48.568" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Other than these, Emacs also has around""" start="00:02:51.003" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""two thousand built-in functions in C code.""" start="00:02:53.668" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And Juicemacs has around""" start="00:02:56.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""four hundred of them implemented.""" start="00:02:57.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""It's not that many, but it is surprisingly enough""" start="00:02:59.863" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to bootstrap Emacs and run""" start="00:03:03.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the portable dumper, or pdump, in short.""" start="00:03:05.201" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Let's have a try.""" start="00:03:08.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""""" start="00:03:11.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So this is the binary produced by Java native image.""" start="00:03:11.803" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And it's loading all the files""" start="00:03:15.023" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""needed for bootstrapping.""" start="00:03:17.168" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Then it dumps the memory to a file to""" start="00:03:18.863" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""be loaded later, giving us fast startup.""" start="00:03:22.234" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""As we can see here, it throws some frame errors""" start="00:03:25.023" 
video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""because Juicemacs doesn't have an editor UI""" start="00:03:28.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or functional frames yet.""" start="00:03:31.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But otherwise, it can already run""" start="00:03:33.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""quite some lisp code.""" start="00:03:35.368" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For example, this code uses the benchmark library""" start="00:03:36.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to measure the performance of this Fibonacci function.""" start="00:03:40.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And we can see here, the JIT engine is""" start="00:03:44.503" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""already kicking in and makes the execution faster.""" start="00:03:47.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""In addition to that, with a bit of workaround,""" start="00:03:51.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Juicemacs can also run some of the ERT,""" start="00:03:53.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or, <b>E</b>macs <b>R</b>egression <b>T</b>est suite, that comes with Emacs.""" start="00:03:56.468" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So... 
Yes, there are a bunch of test failures,""" start="00:04:01.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which means we are not that compatible""" start="00:04:05.923" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""with Emacs and need more work.""" start="00:04:07.934" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But the whole testing procedure runs fine,""" start="00:04:09.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and it has proper stack traces,""" start="00:04:12.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which is quite useful for debugging Juicemacs.""" start="00:04:14.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So with that, a rather functional JIT runtime,""" start="00:04:17.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""let's now try to look into today's topic, JIT compilation for ELisp.""" start="00:04:21.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, you probably know that Emacs has supported""" start="00:04:26.083" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""native-compilation, or nativecomp in short, for some time now.""" start="00:04:28.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""It mainly uses GCC to compile Lisp code""" start="00:04:32.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""into native code, ahead of time.""" start="00:04:35.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And during runtime, Emacs loads those compiled files,""" start="00:04:37.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and gets the performance of native code.""" start="00:04:41.434" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, for example, for installed packages,""" start="00:04:44.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template 
text="""we might want to compile them when we""" start="00:04:47.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""actually use them instead of ahead of time.""" start="00:04:49.060" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And Emacs supports this through""" start="00:04:51.923" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""this <i>native-comp-jit-compilation</i> flag.""" start="00:04:53.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""What it does is, during runtime, Emacs sends""" start="00:04:55.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""loaded files to external Emacs worker processes,""" start="00:04:59.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which will then compile those files asynchronously.""" start="00:05:03.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And when the compilation is done,""" start="00:05:07.003" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the current Emacs session will load the compiled code back""" start="00:05:09.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and improves its performance, on the fly.""" start="00:05:11.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""When you look at this procedure, however, it is,""" start="00:05:16.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""ahead-of-time compilation, done at runtime.""" start="00:05:18.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And it is what current Emacs calls JIT compilation.""" start="00:05:21.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But if you look at some other JIT engines,""" start="00:05:25.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""you'll see much more complex architectures.""" start="00:05:27.868" video="mainVideo-juicemacs" id="subtitle"]] 
+[[!template text="""So, take LuaJIT for an example,""" start="00:05:31.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in addition to this red line here,""" start="00:05:34.234" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which leads us from an interpreted state""" start="00:05:36.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to a compiled native state,""" start="00:05:38.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which is also what Emacs does,""" start="00:05:40.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""LuaJIT also supports going from""" start="00:05:42.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""a compiled state back to its interpreter.""" start="00:05:44.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And this process is called "deoptimization".""" start="00:05:47.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""In contrast to its name, deoptimization here actually""" start="00:05:51.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""enables a huge category of JIT optimizations.""" start="00:05:55.301" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""They are called speculation.""" start="00:05:58.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Basically, with speculation, the compiler""" start="00:06:01.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""can use runtime statistics to speculate,""" start="00:06:04.601" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to make bolder assumptions in the compiled code.""" start="00:06:07.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And when the assumptions are invalidated,""" start="00:06:11.543" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the runtime deoptimizes the code, updates 
statistics,""" start="00:06:14.083" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and then recompile the code based on new assumptions,""" start="00:06:18.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and that will make the code more performant.""" start="00:06:21.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Let's look at an example.""" start="00:06:24.543" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, here is a really simple function,""" start="00:06:28.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that adds one to the input number.""" start="00:06:30.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But in Emacs, it is not that simple,""" start="00:06:33.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""because Emacs has three categories of numbers,""" start="00:06:36.168" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that is, fix numbers, or machine-word-sized integers,""" start="00:06:38.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""floating numbers, and big integers.""" start="00:06:42.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And when we compile this, we need""" start="00:06:45.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to handle all three cases.""" start="00:06:47.601" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And if we analyze the code produced by Emacs,""" start="00:06:49.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""as is shown by this gray graph here,""" start="00:06:52.601" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we can see that it has, two paths:""" start="00:06:54.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""One fast path, that does fast fix number addition;""" start="00:06:58.183" 
video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and one for slow paths, that calls out""" start="00:07:01.503" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to an external plus-one function,""" start="00:07:03.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to handle floating number and big integers.""" start="00:07:06.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Now, if we pass integers into this function,""" start="00:07:09.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it's pretty fast because it's on the fast path.""" start="00:07:13.168" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, if we pass in a floating number,""" start="00:07:16.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""then it has to go through the slow path,""" start="00:07:19.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""doing an extra function call, which is slow.""" start="00:07:21.943" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""What speculation might help here is that,""" start="00:07:25.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it can have flexible fast paths.""" start="00:07:28.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""When we pass a floating number into this function,""" start="00:07:31.543" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which currently has only fixnumbers on the fast path,""" start="00:07:34.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it also has to go through the slow path.""" start="00:07:37.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But the difference is that, a speculative runtime can""" start="00:07:40.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""deoptimize and recompile the code to adapt to this.""" start="00:07:44.568" 
video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And when it recompiles, it might add""" start="00:07:47.863" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""floating number onto the fast path,""" start="00:07:50.368" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and now floating number operations are also fast.""" start="00:07:52.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And this kind of speculation is why""" start="00:07:55.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""speculative runtime can be really fast.""" start="00:07:58.568" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Let's take a look at some benchmarks.""" start="00:08:03.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""They're obtained with the <i>elisp-benchmarks</i> library on ELPA.""" start="00:08:05.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""The blue line here is for nativecomp,""" start="00:08:09.523" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and these blue areas mean that nativecomp is slower.""" start="00:08:12.601" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And, likewise, green areas mean that""" start="00:08:16.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Juicemacs is slower.""" start="00:08:19.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""At a glance, the two (or four)""" start="00:08:20.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""actually seems somehow on par, to me.""" start="00:08:22.868" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But, let's take a closer look at some of them.""" start="00:08:25.243" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, the first few benchmarks are the classic,""" start="00:08:30.483" video="mainVideo-juicemacs" id="subtitle"]] 
+[[!template text="""Fibonacci benchmarks.""" start="00:08:32.668" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""We know that, the series is formed by""" start="00:08:34.083" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""adding the previous two numbers in the series.""" start="00:08:36.934" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And looking at this expression here,""" start="00:08:39.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Fibonacci benchmarks are quite intensive""" start="00:08:41.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in number additions, subtractions,""" start="00:08:44.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and function calls, if you use recursions.""" start="00:08:46.801" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And it is exactly why""" start="00:08:49.203" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Fibonacci series is a good benchmark.""" start="00:08:51.001" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And looking at the results here... 
wow.""" start="00:08:54.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Emacs nativecomp executes instantaneously.""" start="00:08:57.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""It's a total defeat for Juicemacs, seemingly.""" start="00:08:59.943" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Now, if you're into benchmarks, you know something is wrong here:""" start="00:09:04.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we are comparing the different things.""" start="00:09:08.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So let's look under the hood""" start="00:09:11.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and disassemble the function""" start="00:09:14.201" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""with this convenient Emacs command""" start="00:09:15.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""called <i>disassemble</i>...""" start="00:09:17.568" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And these two lines of code is what we got.""" start="00:09:19.163" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, we already can see""" start="00:09:23.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""what's going on here:""" start="00:09:24.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""GCC sees Fibonacci is a pure function,""" start="00:09:26.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""because it returns the same value""" start="00:09:30.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""for the same arguments,""" start="00:09:31.868" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""so GCC chooses to do the computation""" start="00:09:33.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""at compile time""" 
start="00:09:35.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and inserts the final number directly""" start="00:09:36.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""into the compiled code.""" start="00:09:39.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""It is actually great!""" start="00:09:41.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Because it shows that nativecomp""" start="00:09:43.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""knows about pure functions,""" start="00:09:45.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and can do all kinds of things""" start="00:09:47.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""like removing or constant-folding them.""" start="00:09:48.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And Juicemacs just does not do that.""" start="00:09:51.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, we are also concerned about""" start="00:09:54.503" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the things we mentioned earlier:""" start="00:09:57.368" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the performance of number additions,""" start="00:09:59.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or function calls.""" start="00:10:00.901" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, in order to let the benchmarks""" start="00:10:03.083" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""show some extra things,""" start="00:10:05.634" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we need to modify it a bit...""" start="00:10:06.963" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""by simply making things non-constant.""" start="00:10:08.368" video="mainVideo-juicemacs" id="subtitle"]] 
+[[!template text="""With that, Emacs gets much slower now.""" start="00:10:11.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And again, let's look what's""" start="00:10:15.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""happening behind these numbers.""" start="00:10:17.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Similarly, with the <i>disassemble</i> command,""" start="00:10:21.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we can look into the assembly.""" start="00:10:23.501" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And again, we can already see""" start="00:10:25.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""what's happening here.""" start="00:10:28.020" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, Juicemacs, due to its speculation nature,""" start="00:10:29.403" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""supports fast paths for all three kind of numbers.""" start="00:10:32.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, currently, Emacs nativecomp""" start="00:10:35.543" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""does not have any fast path""" start="00:10:39.234" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""for the operations here like additions,""" start="00:10:41.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or subtractions, or comparisons,""" start="00:10:43.434" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which is exactly what""" start="00:10:45.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Fibonacci benchmarks are measuring.""" start="00:10:48.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Emacs, at this time, has to call some generic,""" start="00:10:51.063" video="mainVideo-juicemacs" id="subtitle"]] 
+[[!template text="""external functions for them, and this is slow.""" start="00:10:53.801" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But is nativecomp really that slow?""" start="00:11:00.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So, I also ran the same benchmark""" start="00:11:03.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in Common Lisp, with SBCL.""" start="00:11:04.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And nativecomp is already fast,""" start="00:11:07.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""compared to untyped SBCL.""" start="00:11:09.001" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""It's because SBCL also emits call instructions""" start="00:11:11.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""when it comes to no type info.""" start="00:11:15.501" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, once we declare the types,""" start="00:11:18.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""SBCL is able to compile a fast path for fix numbers,""" start="00:11:21.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which makes its performance on par""" start="00:11:25.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""with speculative JIT engines (that is, Juicemacs),""" start="00:11:27.468" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""because now both of us are on fast paths.""" start="00:11:30.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Additionally, if we are bold enough""" start="00:11:36.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to pass this safety zero flag to SBCL,""" start="00:11:38.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it will remove all the slow paths""" start="00:11:41.303" 
video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and type checks,""" start="00:11:43.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and its performance is close""" start="00:11:45.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to what you get with C.""" start="00:11:46.368" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Well, probably we don't want safety zero""" start="00:11:48.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""most of the time.""" start="00:11:51.300" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But even then, if nativecomp were to""" start="00:11:52.163" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""get fast paths for more constructs,""" start="00:11:55.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""there certainly is quite""" start="00:11:57.863" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""some room for performance improvement.""" start="00:11:59.868" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Let's look at some more benchmarks.""" start="00:12:04.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For example, for this inclist,""" start="00:12:06.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or increment-list, benchmark,""" start="00:12:08.934" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Juicemacs is really slow here. 
Partly,""" start="00:12:11.023" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it comes from the cost of Java boxing integers.""" start="00:12:14.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""On the other hand, for Emacs nativecomp,""" start="00:12:17.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""for this particular benchmark,""" start="00:12:20.301" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it actually has fast paths""" start="00:12:22.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""for all of the operations.""" start="00:12:23.668" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And that's why it can be so fast,""" start="00:12:25.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and that also proves that nativecomp""" start="00:12:27.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""has a lot of potential for improvement.""" start="00:12:30.668" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""There is another benchmark here""" start="00:12:33.943" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that uses advice.""" start="00:12:35.834" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So Emacs Lisp supports using""" start="00:12:38.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""advice to override functions""" start="00:12:40.501" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""by wrapping the original function, and an advice""" start="00:12:42.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""function, two of them, inside a glue function.""" start="00:12:44.834" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And in this benchmark, we advise the Fibonacci function""" start="00:12:47.543" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to cache the first ten 
entries to speed up computation,""" start="00:12:51.468" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""as can be seen in the speed-up in the Juicemacs results.""" start="00:12:54.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""However, it seems that nativecomp does not yet""" start="00:13:00.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""compile glue functions, and that makes advice slower.""" start="00:13:02.901" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""With these benchmarks, let's discuss this big question:""" start="00:13:08.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Should GNU Emacs adopt speculative JIT compilation?""" start="00:13:12.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Well, the hidden question is actually,""" start="00:13:16.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""is it worth it?""" start="00:13:18.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And, my personal answer is, maybe not.""" start="00:13:21.323" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""The first reason is that slow paths, like floating-point numbers,""" start="00:13:24.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""are actually not that frequent in Emacs.""" start="00:13:28.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And optimizing for fast paths like fixnums""" start="00:13:31.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""can already get us very good performance.""" start="00:13:34.101" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And the second or main reason is that,""" start="00:13:38.083" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""speculative JIT is very hard.""" start="00:13:40.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template 
text="""LuaJIT, for example, took a genius to build.""" start="00:13:43.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Even with the help of GCC, we need to hand-write""" start="00:13:46.943" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""all the fast-path, slow-path, and switching logic.""" start="00:13:50.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""We need to find a way to deoptimize, which requires""" start="00:13:54.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""mapping machine registers back to the interpreter stack.""" start="00:13:58.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And also, speculation needs runtime info,""" start="00:14:01.903" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which also costs us extra memory.""" start="00:14:04.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Moreover, as is shown by some benchmarks above,""" start="00:14:07.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""there is some low-hanging fruit in nativecomp that""" start="00:14:10.863" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""might get us better performance with relatively lower effort.""" start="00:14:13.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Compared to this, a JIT engine is a huge, huge undertaking.""" start="00:14:17.443" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But, for Juicemacs, the JIT engine comes a lot cheaper,""" start="00:14:22.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""because, we are cheating by building on""" start="00:14:26.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""an existing compiler framework called Truffle.""" start="00:14:29.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Truffle is a meta-compiler framework,""" 
start="00:14:33.543" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which means that it lets you write""" start="00:14:35.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""an interpreter, add required annotations,""" start="00:14:37.634" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and it will automatically turn the""" start="00:14:40.203" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""interpreter into a JIT runtime.""" start="00:14:42.501" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""So for example, here is a typical bytecode interpreter.""" start="00:14:45.743" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""After you add the required annotations,""" start="00:14:49.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Truffle will know that,""" start="00:14:51.234" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the bytecode here is constant, and it should""" start="00:14:52.623" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""unroll this loop here, to inline all those bytecodes.""" start="00:14:55.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And then, when Truffle""" start="00:14:59.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""compiles the code, it knows that:""" start="00:15:00.468" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""the first loop here does: x plus one,""" start="00:15:02.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and the second does: return.""" start="00:15:05.234" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And then it will compile all that into,""" start="00:15:07.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""return x plus 1,""" start="00:15:09.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which is exactly what we would 
expect""" start="00:15:11.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""when compiling this pseudo code.""" start="00:15:14.068" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Building on that, we can also easily implement speculation,""" start="00:15:17.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""by using this <i>transferToInterpreterAndInvalidate</i> function""" start="00:15:21.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""provided by Truffle.""" start="00:15:24.868" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And Truffle will automatically turn that""" start="00:15:26.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""into deoptimization.""" start="00:15:28.534" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Now, for example, when this add function""" start="00:15:30.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""is supplied with two floating-point numbers,""" start="00:15:32.701" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it will go through the slow path here,""" start="00:15:35.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which might lead to a compiled slow path,""" start="00:15:38.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or deoptimization.""" start="00:15:40.961" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And going this deoptimization way,""" start="00:15:43.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""it can then update the runtime stats.""" start="00:15:45.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And now, when the code is compiled again,""" start="00:15:48.323" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Truffle will know,""" start="00:15:50.401" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that these 
compilation stats suggest that,""" start="00:15:51.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""we have floating-point numbers.""" start="00:15:54.101" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And this floating point addition branch will""" start="00:15:55.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""then be incorporated into the fast path.""" start="00:15:58.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""To put it into Java code...""" start="00:16:02.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Most operations are just as simple as this.""" start="00:16:06.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And it supports fast paths for integers,""" start="00:16:08.823" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""floating-point numbers, and big integers.""" start="00:16:11.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And the simplicity of this not only saves us work,""" start="00:16:14.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""but also enables Juicemacs to explore more things more rapidly.""" start="00:16:17.134" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And actually, I have done some silly explorations.""" start="00:16:22.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For example, I tried to constant-fold more things.""" start="00:16:26.583" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Many of us have an Emacs config that stays""" start="00:16:30.303" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""largely unchanged, at least during one Emacs session.""" start="00:16:32.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And that means many of the global variables""" start="00:16:36.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in ELisp 
are constant.""" start="00:16:39.668" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And with speculation, we can""" start="00:16:42.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""speculate about the stable ones,""" start="00:16:44.601" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and try to inline them as constants.""" start="00:16:46.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And this might improve performance,""" start="00:16:49.663" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or maybe not?""" start="00:16:51.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Because, we will need a full editor""" start="00:16:53.183" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to get real world data.""" start="00:16:55.368" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""I also tried changing cons lists to be backed""" start="00:16:58.223" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""by some arrays, because, maybe arrays are faster, I guess?""" start="00:17:01.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But in the end, <i>setcdr</i> requires some kind of indirection,""" start="00:17:05.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and that actually makes the performance worse.""" start="00:17:09.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And for regular expressions,""" start="00:17:12.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""I also tried borrowing techniques from PCRE JIT,""" start="00:17:14.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""which is quite fast in itself, but it is""" start="00:17:18.023" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""unfortunately unsupported by Java Truffle runtime.""" start="00:17:20.668" video="mainVideo-juicemacs" 
id="subtitle"]] +[[!template text="""So, looking at these, well,""" start="00:17:24.263" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""explorations can fail, certainly.""" start="00:17:27.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But, with Truffle and Java, these,""" start="00:17:30.343" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""for now, are not that hard to implement,""" start="00:17:32.801" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and also very often, they teach us something""" start="00:17:34.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""in return, whether or not they fail.""" start="00:17:37.668" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Finally, let's talk about some explorations""" start="00:17:42.463" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""that we might get into in the future.""" start="00:17:45.334" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For the JIT engine, for example,""" start="00:17:47.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""currently I'm looking into the implementation of""" start="00:17:49.783" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""nativecomp to maybe reuse some of its optimizations.""" start="00:17:52.634" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""For the GUI, I'm very very slowly working on one.""" start="00:17:56.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""If it ever completes, I have one thing""" start="00:18:01.423" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""I'm really looking forward to implementing.""" start="00:18:03.734" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""That is, inlining widgets, or even""" start="00:18:06.703" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""other buffers, 
directly into a buffer.""" start="00:18:08.901" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Well, it's because, people sometimes complain""" start="00:18:11.863" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""about Emacs's GUI capabilities,""" start="00:18:13.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But I personally think that supporting inlining,""" start="00:18:16.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""like a whole buffer inside another buffer as a rectangle,""" start="00:18:19.768" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""could get us very far in layout abilities.""" start="00:18:23.143" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And this approach should also""" start="00:18:26.983" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""be compatible with terminals.""" start="00:18:28.568" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And I really want to see how this idea""" start="00:18:30.943" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""plays out with Juicemacs.""" start="00:18:32.934" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And of course, there's Lisp concurrency.""" start="00:18:36.103" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And currently i'm thinking of a JavaScript-like,""" start="00:18:39.063" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""transparent, single-thread model, using Java's virtual threads.""" start="00:18:42.168" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""But anyway, if you are interested in JIT compilation,""" start="00:18:46.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Truffle, or anything above,""" start="00:18:49.968" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""or maybe you have your own ideas,""" 
start="00:18:51.763" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""you are very welcome to reach out!""" start="00:18:53.868" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Juicemacs does need to implement many more built-in functions,""" start="00:18:56.383" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""and any help would be much appreciated.""" start="00:19:00.034" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""And I promise, it can be a very fun playground""" start="00:19:03.163" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""to learn about Emacs and do crazy things.""" start="00:19:05.801" video="mainVideo-juicemacs" id="subtitle"]] +[[!template text="""Thank you!""" start="00:19:08.443" video="mainVideo-juicemacs" id="subtitle"]] + +</div>Questions or comments? Please e-mail [kana@iroiro.party](mailto:kana@iroiro.party?subject=Comment%20for%20EmacsConf%202023%20juicemacs%3A%20Juicemacs%3A%20exploring%20speculative%20JIT%20compilation%20for%20ELisp%20in%20Java) <!-- End of emacsconf-publish-after-page --> diff --git a/2025/info/juicemacs-before.md b/2025/info/juicemacs-before.md index 5ea71d1c..d41ff742 100644 --- a/2025/info/juicemacs-before.md +++ b/2025/info/juicemacs-before.md @@ -8,12 +8,12 @@ The following image shows where the talk is in the schedule for Sat 2025-12-06. 
Format: 20-min talk ; Q&A: Etherpad <https://pad.emacsconf.org/2025-juicemacs> Etherpad: <https://pad.emacsconf.org/2025-juicemacs> Discuss on IRC: [#emacsconf-dev](https://chat.emacsconf.org/?join=emacsconf,emacsconf-dev) -Status: Ready to stream +Status: Now playing on the conference livestream <div>Times in different time zones:</div><div class="times" start="2025-12-06T15:15:00Z" end="2025-12-06T15:35:00Z"><div class="conf-time">Saturday, Dec 6 2025, ~10:15 AM - 10:35 AM EST (US/Eastern)</div><div class="others"><div>which is the same as:</div>Saturday, Dec 6 2025, ~9:15 AM - 9:35 AM CST (US/Central)<br />Saturday, Dec 6 2025, ~8:15 AM - 8:35 AM MST (US/Mountain)<br />Saturday, Dec 6 2025, ~7:15 AM - 7:35 AM PST (US/Pacific)<br />Saturday, Dec 6 2025, ~3:15 PM - 3:35 PM UTC <br />Saturday, Dec 6 2025, ~4:15 PM - 4:35 PM CET (Europe/Paris)<br />Saturday, Dec 6 2025, ~5:15 PM - 5:35 PM EET (Europe/Athens)<br />Saturday, Dec 6 2025, ~8:45 PM - 9:05 PM IST (Asia/Kolkata)<br />Saturday, Dec 6 2025, ~11:15 PM - 11:35 PM +08 (Asia/Singapore)<br />Sunday, Dec 7 2025, ~12:15 AM - 12:35 AM JST (Asia/Tokyo)</div></div><div><strong><a href="/2025/watch/dev/">Find out how to watch and participate</a></strong></div> - +<div class="vid mainVideo"><video controls preload="none" id="mainVideo-juicemacs"><source src="https://media.emacsconf.org/2025/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.webm" />captions="""<track label="English" kind="captions" srclang="en" src="/2025/captions/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.vtt" default />"""<p><em>Your browser does not support the video tag. 
Please download the video instead.</em></p></video><div></div>Duration: 19:10 minutes<div class="files resources"><ul><li><a href="https://pad.emacsconf.org/2025-juicemacs">Open Etherpad</a></li><li><a href="https://pad.emacsconf.org/2025-juicemacs">Open public Q&A</a></li><li><a href="https://media.emacsconf.org/2025/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--intro.vtt">Download --intro.vtt</a></li><li><a href="https://media.emacsconf.org/2025/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--intro.webm">Download --intro.webm</a></li><li><a href="https://media.emacsconf.org/2025/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.opus">Download --main.opus (17MB)</a></li><li><a href="https://media.emacsconf.org/2025/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.vtt">Download --main.vtt</a></li><li><a href="https://media.emacsconf.org/2025/emacsconf-2025-juicemacs--juicemacs-exploring-speculative-jit-compilation-for-elisp-in-java--kana--main.webm">Download --main.webm (38MB)</a></li><li><a href="https://youtu.be/Lm-a7eZO5jk">View on Youtube</a></li></ul></div></div> # Description <!-- End of emacsconf-publish-before-page -->
\ No newline at end of file |
