[[!meta title="Optimizing Emacs Lisp Code"]]
[[!meta copyright="Copyright © 2021 Dmitry Gutov"]]
[[!inline pages="internal(2021/info/faster-nav)" raw="yes"]]
<!-- You can manually edit this file to update the abstract, add links, etc. --->
# Optimizing Emacs Lisp Code
Dmitry Gutov
[[!inline pages="internal(2021/info/faster-schedule)" raw="yes"]]
[[!table header="no" class="speaker-details" data="""
Name pronunciation: | d-MEET-ri GOO-tov
Pronouns: | he/his
Homepage: | <https://github.com/dgutov/>
Preferred contact info | <dgutov@yandex.ru>
"""]]
- Before optimizing, benchmark first.
- Different benchmarking approaches.
- Live evaluation, step-debugging, measuring from a debugger breakpoint.
- How to determine if a function is expensive. How to pick one from
competing alternatives (cl-lib, seq, dash, lean core).
- Print-benchmarking.
- Byte-compiled code can give a very different picture, changing where
the bottleneck is. How to quickly load a byte-compiled version.
- Steps taken to speed up the Xref package recently.
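As a rough illustration of the benchmarking approaches listed above, here is a minimal sketch using the built-in `benchmark-run` and `benchmark-progn` macros; the timed forms and the file name are arbitrary placeholders, not code from the talk:

```elisp
;; benchmark-run returns (elapsed-seconds gc-runs gc-elapsed).
(benchmark-run 1000
  (mapcar #'1+ (number-sequence 1 100)))

;; Print-benchmarking: benchmark-progn evaluates its body and
;; messages the elapsed time, which is handy when timing a form
;; buried inside an existing code path.
(benchmark-progn
  (dotimes (_ 1000)
    (mapcar #'1+ (number-sequence 1 100))))

;; Byte-compiled code can behave very differently; compile and
;; load the .elc to measure that version (hypothetical file name):
;; (byte-compile-file "my-package.el")
;; (load "my-package.elc")
```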
# Discussion
IRC nick: dgutov
Pad:
- Q1: Why are overlays slow compared to text properties? I believe (in
  part only?) it is due to "get element n in a vector vs. a list". If
  so, then why don't we change that? There could be a text property
  called "overlays", so that lookup would also be vector-like. What
  benefits does the data structure currently used for overlays have
  that make that undesirable? Would a mixed approach make sense,
  i.e. allow non-modifying lookups to use the "cached" overlays stored
  in the "overlay" text property, and have text-inserting and
  overlay-moving actions store into the currently used data structure
  as well as into the indirect text-property=>overlay cache?
  - A: "There is a pending patch to represent the set of a buffer's
    overlays as an AA tree or some such."
- Sounds promising :)
- For more details, check out these threads:
- <https://lists.gnu.org/archive/html/emacs-devel/2014-09/msg00616.html>
- <https://lists.gnu.org/archive/html/emacs-devel/2016-11/msg00475.html>
- <https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00565.html>
- <https://lists.gnu.org/archive/html/emacs-devel/2019-12/msg00115.html>
- Q2: As a non-programmer, would these sorts of optimizations be
  helpful to do on a personal init.el file?
  - A: Probably not.
    - Though too much mode-line customisation may slow things down.
- Q3: What's a good approach for benchmarking destructive operations?
  If you delete elements from a list in-place, all subsequent runs
  will be artificially fast.
  - A: There is an example of a comparison between operations from
    different libraries in the example file provided with the talk.
    In particular, check the benchmarks for the delete and remove
    operations (destructive and non-destructive, respectively).
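One common workaround (a sketch, not taken from the talk's example file) is to rebuild the input inside the timed loop, so each iteration of the destructive operation starts from fresh data; the cost of the copying can be measured separately and subtracted:

```elisp
;; Time delq (destructive) against fresh copies of the same list.
(let ((template (number-sequence 1 1000)))
  ;; Each iteration mutates its own copy, so all runs stay comparable.
  (benchmark-run 1000
    (delq 500 (copy-sequence template)))
  ;; Baseline: the copying alone, to subtract from the result above.
  (benchmark-run 1000
    (copy-sequence template)))
```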
- Q4: cl-lib constructors, getters, and setters usually expand into
  multiple levels of let-bindings. AFAIU, every let-binding is an
  extra memory allocation. Do you recommend avoiding cl-defstruct in
  favour of "pure" lists/vectors?
  - A: Basically, no. If cl-defstruct allows you to organise your
    code better, go ahead.
- Q5: Is it possible to optimize some Emacs packages by making use of
  code compiled from other languages (like C or Common Lisp), i.e. in
  the same way Python is able to import C code?
  - A: Yes, Emacs modules allow you to run C or Rust code, though
    transitioning between Emacs proper and the module (passing the
    data) might slow things down, because of the copying that's
    necessary to avoid issues with GC.
- Q6: You mentioned that overlays are much slower compared to text
  properties. What about text properties vs. buffer-local variables
  to store a position cache?
  - A: I haven't measured it, but offhand I'm going to guess that
    buffer-local variables will be faster.
    - It also depends on the structure you're going to use for the
      cache: is it a single cons, or a list, or a tree, etc.
BBB:
- AVL tree
- defstruct accessors should expand with compiler macros to aref
  calls, which are very fast
  - They have an extra `if`, though
  - Oh, you mean for testing whether the value is such a struct?
  - Yes, there is that test, but I wouldn't expect it to make things
    3x slower, AFAIK
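A short sketch of what this exchange describes; the struct and slot names here are made up for illustration, and the expansion shown in the comment is only approximate:

```elisp
(require 'cl-lib)

;; Hypothetical struct for illustration.
(cl-defstruct point x y)

;; Under byte compilation, (point-x p) expands via a compiler macro
;; into roughly a type check (the extra `if' mentioned above) plus a
;; constant-index aref into the underlying record, something like:
;;   (progn (or (point-p p) (signal 'wrong-type-argument ...))
;;          (aref p 1))
(let ((p (make-point :x 1 :y 2)))
  (point-x p)) ; => 1
```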
IRC:
- If somebody wants to do a remote session with me: I have processes,
  such as updating column view dynamic blocks, that take maybe 40
  minutes. So far, I avoid executing those functions when I'm around
  the computer myself. However, there may be room for improvement,
  and I really can't tell whether it is in my personal setup or not,
  because it's not always that easy to re-create a use-case with a
  plain Emacs config.
- Thanks for doing this talk. FYI, you might find this
  bench-multi-lexical macro useful:
  https://alphapapa.github.io/emacs-package-dev-handbook/#outline-container-Optimization
- dgutov: I can't seem to find the exact macro you are referring to. But if it covers a use case benchmark-progn does not, consider contributing it to benchmark.el in the core.
- Sorry, try this link directly to that macro: https://github.com/alphapapa/emacs-package-dev-handbook#bench-multi-lexical The purpose of the macro is to compare different forms and show how they perform relative to each other
- dgutov: Ah yeah, that looks pretty cool. Less sure about the org format, but it must be nice for presentations.
- The Org format is good for documentation too. But it just uses the output of benchmark-run, so it could easily be left in Lisp form. :)
- dgutov: These things are really handy to have available in 'emacs -Q', though. When you're working on shaving some extra handful of percents.
- Yes, a few lines of code could be added to run the compiled code in a separate Emacs process.
- https://github.com/alphapapa/emacs-package-dev-handbook compares some common ways to do common things in Elisp so you can see which is generally faster, e.g. https://github.com/alphapapa/emacs-package-dev-handbook#inserting-strings
- PSA: buffer-local-value is generally much faster than with-current-buffer if all you need to do is get the value of a variable in a buffer
- For more info about the performance of overlays vs text properties data structure, there's an Emacs TODO about it. C-h C-t and search for "Move overlays to intervals.c".
- cl-defstruct getters/setters have compiler macros that expand into simple aref calls on vectors, they are very efficient
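The buffer-local-value point above can be illustrated with a short sketch (the buffer name is just an example):

```elisp
;; Fast: read one buffer-local variable without switching buffers.
(buffer-local-value 'major-mode (get-buffer "*scratch*"))

;; Slower for this purpose: with-current-buffer makes the buffer
;; current (and restores the previous one afterwards), which costs
;; more when all you need is a single variable's value.
(with-current-buffer "*scratch*"
  major-mode)
```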
Links:
- you might find the this bench-multi-lexical macro useful:
<https://alphapapa.github.io/emacs-package-dev-handbook/#outline-container-Optimization>
or
<https://github.com/alphapapa/emacs-package-dev-handbook#bench-multi-lexical>
- <https://git.savannah.gnu.org/cgit/emacs.git/tree/lisp/emacs-lisp/elp.el>
- <https://git.savannah.gnu.org/cgit/emacs.git/tree/lisp/emacs-lisp/benchmark.el>
- "Use hash tables, kids!"
- PSA: buffer-local-value is generally much faster than
with-current-buffer if all you need to do is get the value of a
variable in a buffer
- EIEIO's object construction is slow because it goes through
  `make-instance`, which is a generic function that itself calls
  various other generic functions, so there's a lot of cl-generic
  dispatch overhead; and then there's the fact that the (keyword)
  arguments are laboriously parsed at run-time, so it is slow as
  well.
- There is a pending patch to represent the set of a buffer's
  overlays as an AA tree or some such.
- <https://media.emacsconf.org/2021/emacsconf-2021-faster--optimizing-emacs-lisp-code--dmitry-gutov.el>
[[!inline pages="internal(2021/captions/faster)" raw="yes"]]
[[!inline pages="internal(2021/info/faster-nav)" raw="yes"]]