[[!meta title="Optimizing Emacs Lisp Code"]]
[[!meta copyright="Copyright © 2021 Dmitry Gutov"]]
[[!inline pages="internal(2021/info/faster-nav)" raw="yes"]]

# Optimizing Emacs Lisp Code
Dmitry Gutov

[[!inline pages="internal(2021/info/faster-schedule)" raw="yes"]]

[[!table header="no" class="speaker-details" data="""
Name pronunciation: | d-MEET-ri GOO-tov
Pronouns: | he/his
Homepage: |
Preferred contact info: |
"""]]

- Before optimizing, benchmark first.
- Different benchmarking approaches.
- Live evaluation, step-debugging, measuring from a debugger breakpoint.
- How to determine whether a function is expensive. How to pick one from competing alternatives (cl-lib, seq, dash, lean core).
- Print-benchmarking.
- Byte-compiled code can give a very different picture, changing where the bottleneck is. How to quickly load a byte-compiled version.
- Steps taken to speed up the Xref package recently.

# Discussion

IRC nick: dgutov

Pad:

- Q1: Why are overlays slow compared to text properties? I believe I understand that it is (perhaps only in part) due to "get element n in a vector vs. a list". If so, then why don't we change that? There could be a text property called "overlays", so that lookup would also be vector-like. What benefits does the data structure currently used for overlays have that make that undesirable? Would a mixed approach make sense, i.e. allow non-modifying lookups to use the "cached" overlays stored in the "overlay" text property, while text-inserting and overlay-moving actions store in the currently used data structure as well as in the indirect text-property->overlay cache?
  - A: "There is a pending patch to represent the set of a buffer's overlays as an AA tree or some such."
    - Sounds promising :)
  - For more details, check out these threads:
    -
    -
    -
- Q2: As a non-programmer, would these sorts of optimizations be helpful to do on a personal init.el file?
  - A: Probably not.
    - Though too much mode-line customisation may slow things down.
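The benchmarking approaches listed in the outline above can be tried straight from a scratch buffer. Here is a minimal sketch using the built-in `benchmark-run` and `benchmark-progn` from benchmark.el; the forms being timed and the iteration counts are arbitrary placeholders:

```elisp
;; Compare timing approaches on a toy workload.
;; `benchmark-run' returns (ELAPSED-SECONDS GC-RUNS GC-SECONDS).
(require 'benchmark)

(benchmark-run 1000
  (mapcar (lambda (x) (* x x)) (number-sequence 1 100)))

;; `benchmark-progn' runs its body once and reports the elapsed
;; time as a message -- handy for print-benchmarking a larger
;; section of code in place.
(benchmark-progn
  (dotimes (_ 1000)
    (mapcar (lambda (x) (* x x)) (number-sequence 1 100))))
```

Since byte-compiled code can shift the bottleneck (as the outline notes), the same forms are worth re-timing after byte-compiling the functions involved.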
- Q3: What's a good approach for benchmarking destructive operations? If you delete elements from a list in-place, all subsequent runs will be artificially fast.
  - A: There is an example of a comparison between operations from different libraries in the example file provided with the talk. In particular, check the benchmarks for the delete and remove operations (destructive and non-destructive, respectively).
- Q4: cl-lib constructors, getters, and setters usually expand into multiple levels of let-bindings. AFAIU, every let-binding is an extra memory allocation. Do you recommend avoiding cl-defstruct in favour of "pure" lists/vectors?
  - A: Basically, no. If defstruct allows you to organise better, go ahead.
- Q5: Is it possible to optimize some Emacs packages by making use of code compiled from other languages (like C or Common Lisp), i.e. in the same way Python is able to import C code?
  - A: Yes, Emacs modules allow you to run C or Rust. Transitioning between Emacs proper and the module (passing the data) might slow things down, because of the copying that's necessary to avoid issues with GC.
- Q6: You mentioned that overlays are much slower compared to text properties. What about text properties vs. buffer-local variables to store a position cache?
  - A: I haven't measured it, but offhand I'm going to guess that buffer-local variables will be faster.
    - It also depends on the structure you're going to use for the cache: is it a single cons, or a list, or a tree, etc.

BBB:

- AVL tree
- defstruct accessors should expand with compiler macros to aref calls, which are very fast
  - They have an extra `if`, though
  - Oh, you mean for testing whether the value is such a struct?
  - Yes, there is that test, but I wouldn't expect that to make it 3x slower, AFAIK

IRC:

- If somebody wants to do a remote session with me: I have processes such as updating column view dynamic blocks that take maybe 40 minutes. So far, I avoid executing those functions when I'm around the computer myself. However, there may be room for improvement, and I really can't tell whether it is in my personal setup or not, because it's not always that easy to re-create a use-case with a plain Emacs config.
- Thanks for doing this talk. FYI, you might find this bench-multi-lexical macro useful: https://alphapapa.github.io/emacs-package-dev-handbook/#outline-container-Optimization
  - dgutov: I can't seem to find the exact macro you are referring to. But if it covers a use case benchmark-progn does not, consider contributing it to benchmark.el in the core.
  - Sorry, try this link directly to that macro: https://github.com/alphapapa/emacs-package-dev-handbook#bench-multi-lexical The purpose of the macro is to compare different forms and show how they perform relative to each other.
  - dgutov: Ah yeah, that looks pretty cool. Less sure about the Org format, but it must be nice for presentations.
  - The Org format is good for documentation too. But it just uses the output of benchmark-run, so it could easily be left in Lisp form. :)
  - dgutov: These things are really handy to have available in 'emacs -Q', though. When you're working on shaving off some extra handful of percent.
  - Yes, a few lines of code could be added to run the compiled code in a separate Emacs process.
- https://github.com/alphapapa/emacs-package-dev-handbook compares some common ways to do common things in Elisp so you can see which is generally faster, e.g. https://github.com/alphapapa/emacs-package-dev-handbook#inserting-strings
- PSA: buffer-local-value is generally much faster than with-current-buffer if all you need to do is get the value of a variable in a buffer
- For more info about the performance of the overlays vs. text properties data structures, there's an Emacs TODO about it: C-h C-t and search for "Move overlays to intervals.c".
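The buffer-local-value PSA above is easy to check for yourself. A minimal sketch (variable choice and iteration count are arbitrary): both forms read the same buffer-local variable, but `with-current-buffer` has to switch the current buffer on every iteration, while `buffer-local-value` is a plain lookup:

```elisp
;; Sketch: read one buffer-local variable 100k times each way
;; and return both (ELAPSED GC-RUNS GC-TIME) result lists.
(require 'benchmark)

(with-temp-buffer
  (setq-local fill-column 100)
  (let ((buf (current-buffer)))
    (list
     ;; Switches the current buffer on every iteration.
     (benchmark-run 100000
       (with-current-buffer buf fill-column))
     ;; Plain lookup, no buffer switching.
     (benchmark-run 100000
       (buffer-local-value 'fill-column buf)))))
```

As elsewhere, the difference is best measured in `emacs -Q` with the surrounding code byte-compiled, since interpretation overhead can mask it.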
- cl-defstruct getters/setters have compiler macros that expand into simple aref calls on vectors; they are very efficient

Links:

- You might find this bench-multi-lexical macro useful: or
-
-
- "Use hash tables, kids!"
- PSA: buffer-local-value is generally much faster than with-current-buffer if all you need to do is get the value of a variable in a buffer
- EIEIO's object construction is slow because it goes through `make-instance`, which is a generic function that itself calls various other generic functions, so there's a lot of cl-generic dispatch overhead; and then there's the fact that the (keyword) arguments are laboriously parsed at run-time, so that is slow as well.
- There is a pending patch to represent the set of a buffer's overlays as an AA tree or some such.
-

[[!inline pages="internal(2021/captions/faster)" raw="yes"]]

[[!inline pages="internal(2021/info/faster-nav)" raw="yes"]]