WEBVTT
00:00.000 --> 00:06.480
Hello everyone, my name is Ihor Radchenko, and you may know me from the Org mailing list.
00:07.440 --> 00:11.760
However, today I'm not going to talk about Org Mode. Today I'm going to talk about
00:11.760 --> 00:16.800
Emacs performance and how it's affected by its memory management code.
00:18.880 --> 00:24.720
First, I will introduce the basic concepts of Emacs memory management and what garbage
00:24.720 --> 00:32.320
collection is. Then I will show you user statistics collected from volunteer users
00:32.320 --> 00:42.080
over the last half year and I will end with some guidelines on how to tweak Emacs garbage
00:42.080 --> 00:48.640
collection customizations to optimize Emacs performance and when it's necessary or not
00:49.120 --> 00:56.560
to do. Let's begin. What is garbage collection? To understand what garbage collection is, we need
00:56.560 --> 01:01.920
to realize that anything you do in Emacs is some kind of command and any command is most likely
01:01.920 --> 01:07.280
running some Elisp code, and every time you run Elisp code you most likely need to allocate some
01:07.280 --> 01:14.160
memory in RAM and some of this memory is retained for a long time and some of this memory is
01:14.160 --> 01:20.320
transient. Of course, Emacs has to clear this transient memory from time to time to not occupy
01:20.320 --> 01:27.200
all the possible RAM in the computer. In this small example we have one global variable
01:28.480 --> 01:35.600
that is assigned a value but when assigning the value we first allocate a temporary variable
01:35.600 --> 01:41.360
and then a temporary list and only retain some part of this list in this global variable.
01:42.240 --> 01:51.920
In terms of memory graph we can represent this as two variable slots, one transient, one permanent
01:52.480 --> 02:01.680
and then a list of three conses, part of which is retained in the global variable, but part
02:01.680 --> 02:07.280
of it, namely the temporary variable symbol and the first cons of the list, is not used and it
02:07.840 --> 02:15.040
might be cleared at some point. So that's what Emacs does. Every now and then Emacs goes through
02:15.040 --> 02:20.320
all the memory, identifies which parts of the memory are not used, and then clears them so that
02:20.320 --> 02:27.760
it can free up the RAM. This process is called garbage collection and Emacs uses a very simple
02:27.760 --> 02:33.440
and old algorithm which is called mark and sweep. So this mark and sweep process
02:33.440 --> 02:40.880
is basically two stages. First Emacs scans all the memory that is allocated and identifies
02:40.880 --> 02:46.320
which memory is still in use which is linked to some variables for example and which memory is
02:46.320 --> 02:51.600
not used anymore even though it was allocated in the past, and the second stage is to clear
02:51.600 --> 02:56.240
whatever memory is no longer in use. During the process
02:56.880 --> 03:03.920
Emacs cannot do anything else. So basically every time Emacs scans the memory it freezes up and
03:03.920 --> 03:09.840
doesn't respond to anything and if it takes too much time so that users can notice it then of
03:09.840 --> 03:18.160
course Emacs is not responsive at all and if this garbage collection is triggered too frequently
03:18.160 --> 03:23.760
then it's not just unresponsive every now and then, it's unresponsive almost
03:24.000 --> 03:29.840
all the time, so you cannot even type normally or run ordinary commands.
03:32.320 --> 03:40.080
This mark and sweep algorithm takes longer the more memory Emacs uses. So basically
03:40.080 --> 03:46.480
the more buffers you open, the more packages you load, the more complex commands you run,
03:46.480 --> 03:55.840
the more memory is used and basically the longer Emacs takes to perform a single garbage collection.
04:00.560 --> 04:07.280
Of course, Emacs being Emacs, this garbage collection can be tweaked. In particular
04:07.280 --> 04:12.960
users can tweak how frequently Emacs does garbage collection using two basic variables
04:12.960 --> 04:19.840
gc-cons-threshold and gc-cons-percentage. gc-cons-threshold is the raw number of bytes
04:21.440 --> 04:27.200
Emacs needs to allocate before triggering another garbage collection, and gc-cons-percentage
04:27.200 --> 04:31.680
is similar, but it's defined as a fraction of the already allocated memory.
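As a minimal sketch, both variables can be inspected from any running Emacs session. The defaults mentioned in the comments are those of a typical stock 64-bit build and may differ in other builds or versions.

```elisp
;; Evaluate with M-: or in the *scratch* buffer to see the current settings.
;; In a typical stock build, gc-cons-threshold is 800000 (bytes) and
;; gc-cons-percentage is 0.1 (10% of the already allocated memory).
(message "gc-cons-threshold: %d bytes, gc-cons-percentage: %.2f"
         gc-cons-threshold gc-cons-percentage)
```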
04:33.840 --> 04:41.840
If you follow various Emacs forums you may be familiar with people complaining about
04:41.840 --> 04:47.760
garbage collection and there are many many suggestions about what to do with it.
04:50.320 --> 04:52.640
Most frequently you see gc-cons-threshold
04:54.640 --> 05:01.280
recommended to be increased and a number of pre-packaged Emacs distributions like
05:01.280 --> 05:07.280
Doom Emacs do increase it, and I have also seen suggestions, which are actually horrible, to
05:07.280 --> 05:11.120
disable garbage collection temporarily or for a long time.
05:14.240 --> 05:19.600
You can see this quite frequently, which indicates there might be some problem.
05:19.600 --> 05:26.320
However, every time one user posts about this problem, it's just one data point and it doesn't
05:26.320 --> 05:30.000
mean that everyone actually suffers from it. It doesn't mean that everyone should do it.
05:30.720 --> 05:37.680
So in order to understand if this garbage collection is really a problem which is a
05:37.680 --> 05:48.000
common problem we do need some kind of statistics and only using the actual statistics we can
05:48.000 --> 05:54.880
understand if it should be recommended for everyone to tweak the defaults or like whether
05:54.880 --> 06:00.000
it should be recommended for certain users or maybe it should be asked Emacs devs to do
06:00.000 --> 06:08.800
something about the defaults. And what I did some time ago is exactly this. I tried to collect the
06:08.800 --> 06:18.000
user statistics. So I wrote a small package on ELPA and some users installed this package and
06:18.000 --> 06:24.080
then reported back these statistics of the garbage collection for their particular use.
06:25.360 --> 06:33.840
By now we have obtained 129 user submissions with over 1 million GC records in there.
06:35.760 --> 06:42.320
So like some of these submissions used default GC settings without any customizations.
06:42.320 --> 06:47.040
Some used increased gc-cons-threshold and gc-cons-percentage.
06:48.880 --> 06:56.640
So using this data we can try to draw some reliable conclusions on what should be done
06:56.640 --> 07:02.480
and whether anything should be done about garbage collection at the Emacs dev level or at least at the user
07:02.480 --> 07:08.240
level. Of course we need to keep in mind that there's some kind of bias because it's more
07:08.240 --> 07:13.680
likely that users who already have problems with GC, or think they have problems with GC, will report
07:14.480 --> 07:20.240
and submit the data. But anyway, having statistics is much more useful than just
07:20.240 --> 07:28.240
having anecdotal evidence from one or another Reddit post. And just one thing I will do
07:28.880 --> 07:33.280
during the rest of my presentation is that for all the statistics I will normalize
07:33.520 --> 07:41.440
user data so that every user contributes equally. For example if one user submits like 100 hours
07:41.440 --> 07:46.640
of Emacs uptime statistics and another user submits one hour of Emacs uptime, then I will
07:47.200 --> 07:49.520
anyway make it so that they contribute equally.
07:53.280 --> 07:59.280
Let's start with one of the most obvious things we can look into, which is the time it takes
07:59.360 --> 08:05.520
for a single garbage collection process. Here you see
08:08.240 --> 08:16.240
frequency distribution of GC duration for all the 129 users we got and
08:17.600 --> 08:26.800
you can see that most of the garbage collections are done quite quickly in less than 0.1 second
08:27.440 --> 08:33.680
and less than 0.1 second is usually just not noticeable. So even though there is garbage
08:33.680 --> 08:43.200
collection it will not interrupt the work in Emacs. However there is a fraction of users who
08:43.920 --> 08:49.680
experience garbage collection that takes like 0.2, 0.3, or even half a second, which will be quite
08:49.680 --> 08:58.800
noticeable. For the purposes of this study I will consider that anything that is less than 0.1
08:58.800 --> 09:06.000
second is insignificant, so you will not notice it and obviously all the Emacs
09:06.000 --> 09:13.600
usage will be just normal. But if it's more than 0.1 or 0.2 seconds then it will be very noticeable
09:13.600 --> 09:20.800
and you will see that Emacs hangs for a little while or not so little while. In terms of numbers
09:21.360 --> 09:28.000
it's better to plot the statistics not as a distribution but as a cumulative distribution.
09:29.040 --> 09:34.080
So like at every point of this graph you'll see like for example here 0.4 seconds
09:34.480 --> 09:49.040
you have the percentage of users, here almost 90%, who have no more than 0.4 seconds GC duration. So
09:49.040 --> 09:55.760
we can look here: if we take one critical GC duration, which is
09:55.840 --> 10:02.400
0.1 second, and look at how many users have it, we have 56%, which is like
10:03.600 --> 10:12.880
44% of users have less than 0.1 second GC duration and the remaining 56% have more than 0.1 second.
10:13.600 --> 10:20.720
So you can see like more than half of users actually have noticeable gc delay so the
10:20.720 --> 10:27.040
Emacs freezes for some noticeable time and a quarter of users actually have very noticeable
10:27.040 --> 10:36.640
delays, so Emacs freezes long enough that you see an actual delay,
10:37.760 --> 10:47.600
which is quite a significant and important point. But apart from the duration of each individual GC
10:47.600 --> 10:52.640
it is important to see how frequent it is because even if you do notice a delay
10:53.440 --> 10:59.120
even a few seconds delay it doesn't matter if it happens once during the whole Emacs session.
11:01.360 --> 11:10.720
So if you look into frequency distribution again here I plot time between
11:11.680 --> 11:17.760
subsequent garbage collections versus how frequent it is and we have very clear trend that
11:18.560 --> 11:24.560
most of the garbage collections are quite frequent: we are talking about every few seconds or a few tens
11:24.560 --> 11:32.560
of seconds. There's a few outliers which are at very round numbers like 60 seconds, 120 seconds,
11:32.560 --> 11:40.640
300 seconds. These are usually timers, so you have something running on a timer and it
11:41.440 --> 11:48.000
is a complex command and it triggers garbage collection, but it's not the majority.
11:49.280 --> 11:54.000
Again to run the numbers it's better to look into cumulative distribution and see that
11:54.000 --> 11:58.160
50% of garbage collections are basically less than 10 seconds apart.
12:00.000 --> 12:07.920
And we can combine it with the previous data and look at garbage collections that are
12:07.920 --> 12:12.960
less than 10 seconds from each other and also take more than, say, 0.1 seconds.
12:13.680 --> 12:20.800
So and then we see that one quarter of all garbage collections are just noticeable and also frequent
12:21.760 --> 12:27.840
and 9% take more than 0.2 seconds, so they are very noticeable and also frequent. So basically
12:27.840 --> 12:34.480
it constitutes Emacs freezing. So 9% of all the garbage collections are Emacs freezing. Of course
12:35.360 --> 12:42.960
if you remember, there is a bias, but 9% is quite a significant number. So garbage collection can
12:42.960 --> 12:47.280
really slow things down, not for everyone but for a significant fraction of users.
12:49.440 --> 12:57.440
Another thing I'd like to look into is what I call agglomerated GCs. What I mean by agglomerated
12:57.440 --> 13:02.720
is when you have one garbage collection and then another garbage collection immediately after it. So
13:03.680 --> 13:09.840
in terms of numbers I took every subsequent garbage collection which is either immediately
13:09.840 --> 13:16.000
after or no more than one second after the previous one. So from the user's point of view it is like
13:16.960 --> 13:22.880
multiple garbage collections adding up together into one giant garbage collection.
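The grouping just described can be sketched as follows. This is only an illustration under stated assumptions: `my/gc-agglomerate` is a hypothetical helper, not part of the statistics package, and it measures gaps between GC timestamps, which may differ from how the talk's analysis measured them.

```elisp
(defun my/gc-agglomerate (start-times &optional max-gap)
  "Group START-TIMES, a sorted list of GC timestamps in seconds.
Consecutive GCs no more than MAX-GAP seconds apart (default 1.0)
land in the same group.  Return the list of group sizes."
  (let ((max-gap (or max-gap 1.0))
        (sizes nil)
        (count 0)
        (prev nil))
    (dolist (time start-times)
      (if (and prev (<= (- time prev) max-gap))
          ;; Close enough to the previous GC: same group.
          (setq count (1+ count))
        ;; Gap too large: close the current group, start a new one.
        (when (> count 0) (push count sizes))
        (setq count 1))
      (setq prev time))
    (when (> count 0) (push count sizes))
    (nreverse sizes)))

;; Three GCs in a row, then an isolated one, then two in a row.
(my/gc-agglomerate '(0.0 0.4 1.1 30.0 95.0 95.5))
;; => (3 1 2)
```

A group size of 1 is an isolated GC; sizes of 10 or more correspond to the long freezes discussed here.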
13:23.440 --> 13:29.440
And if you look into numbers of how many agglomerated garbage collections there are
13:29.440 --> 13:35.360
you can see even numbers over 100. So 100 garbage collections going one after another.
13:36.720 --> 13:42.560
Even if you think about each garbage collection taking 0.1 second, with 100 of them
13:43.280 --> 13:50.480
it's 10 seconds in total. It's like Emacs hanging forever. A group of 10 is also a significant number.
13:50.480 --> 13:58.160
So again, it would be very annoying to run into such a thing. How frequently does it happen? Again we
13:58.160 --> 14:04.400
can plot the cumulative distribution and we see that about 20 percent, 19 percent, of all the garbage
14:04.400 --> 14:13.680
collections come at least two together, and 8 percent come more than 10 together. So you think, oh,
14:13.680 --> 14:17.840
each garbage collection is not taking much time but when you have 10 of them yeah that becomes a
14:17.840 --> 14:32.560
problem. Another thing is to answer a complaint that some people have, which is that
14:33.680 --> 14:42.320
the longer you use Emacs, the slower Emacs becomes. Of course it may be caused by garbage collection and
14:42.720 --> 14:50.000
I wanted to look into how garbage collection time and other statistics, other parameters
14:50.880 --> 14:58.880
are evolving over time. And what I can see here is a cumulative distribution of GC duration
14:59.680 --> 15:04.720
for the first 10 minutes of Emacs uptime, the first 100 minutes, and the first 1000 minutes.
15:05.520 --> 15:13.840
And if you look closer then you see that each individual garbage collection on average
15:15.440 --> 15:24.000
takes longer as you use Emacs longer. However, the increase is not much, maybe 10 percent.
15:24.000 --> 15:33.040
Basically garbage collection slows Emacs down more as you use Emacs more,
15:33.680 --> 15:40.320
but not much. So basically if you do see Emacs being slower and slower over time,
15:40.960 --> 15:46.960
it's probably not really garbage collection because it doesn't change too much. And if you
15:46.960 --> 15:52.720
look at the time between individual garbage collections, you see that the time actually
15:52.720 --> 15:58.880
increases as you use Emacs longer which makes sense because initially like first few minutes
15:58.880 --> 16:04.720
you have all kind of packages loading like all the port loading and then later everything is
16:04.720 --> 16:12.560
loaded and things become more stable. So the conclusion on this part is that
16:13.520 --> 16:18.480
if Emacs becomes slower in a long session it's probably not caused by garbage collection.
16:20.320 --> 16:27.760
And one word of warning of course is that it's all nice and all when I present the statistics
16:27.760 --> 16:32.800
but it's only an average and if you are an actual user like here is one example
16:34.080 --> 16:39.920
which shows a total garbage collection time like accumulated together over Emacs uptime
16:40.880 --> 16:45.360
and you see different lines which correspond to different sessions of one user
16:46.800 --> 16:51.360
and you see they are wildly different like one time there is almost no garbage collection
16:52.240 --> 16:57.840
another time you see a lot of garbage collection, probably because Emacs is used more heavily or with a
16:57.840 --> 17:04.560
different pattern of usage and even during a single Emacs session you see a different slope
17:04.560 --> 17:10.560
of this curve which means that sometimes garbage collection is infrequent and sometimes it's much
17:10.560 --> 17:16.000
more frequent, so it's probably much more noticeable at one time and less noticeable at another.
17:16.000 --> 17:23.360
So if you think about these statistics of course they only represent an average usage
17:23.360 --> 17:26.240
but sometimes it can get worse sometimes it can get better.
17:30.320 --> 17:35.600
The last parameter I'd like to talk about is garbage collection during Emacs init.
17:36.960 --> 17:42.320
Basically if you think about what happens during Emacs init, like when Emacs is just starting up,
17:42.320 --> 17:46.720
then whatever garbage collection happens there, whether once or several times,
17:46.720 --> 17:50.640
it all contributes to Emacs taking longer to start.
17:53.200 --> 18:00.640
And again we can look into the statistic and see what is the total GC duration after Emacs init
18:01.840 --> 18:10.240
and we see that for 50% of all the submissions garbage collection adds more than one second
18:10.240 --> 18:17.760
to Emacs init time, and for 20% of users it's an extra three seconds of Emacs start time, which is
18:17.760 --> 18:22.640
very significant especially for people who are used to Vim which can start in like a fraction
18:22.640 --> 18:27.200
of a second, and here it is just garbage collection, because garbage collection is
18:27.200 --> 18:31.760
not everything Emacs does during startup, so it adds even more to the load.
18:33.680 --> 18:39.280
Okay that's all nice and all but what can we do about these statistics can we draw any
18:39.280 --> 18:46.000
conclusions and the answer is of course like the most important conclusion here is that
18:46.720 --> 18:52.320
yes garbage collection can slow down Emacs at least for some people and what to do about it
18:53.360 --> 18:58.720
there are two variables which you can tweak: gc-cons-threshold and gc-cons-percentage,
18:58.720 --> 19:06.400
and having the statistics I can at least look a little bit into what is the effect of
19:06.400 --> 19:12.400
increasing these variables. Most people just increase gc-cons-threshold,
19:13.760 --> 19:17.040
and in all the submissions people did increase it, and it
19:17.680 --> 19:20.880
doesn't make much sense to decrease it like to make things worse
19:24.560 --> 19:31.280
of course for these statistics the exact values of these increased thresholds
19:31.680 --> 19:36.320
are not always the same but at least we can look into some trends
19:38.640 --> 19:48.480
so first and obvious thing we can observe is when we compare the standard gc settings
19:49.120 --> 19:57.680
standard thresholds and increased thresholds for time between subsequent gcs and as one may expect
19:57.680 --> 20:03.440
if you increase the threshold Emacs will do garbage collection less frequently so the spacing
20:03.440 --> 20:10.080
between garbage collection increases okay the only thing is that if garbage collection is
20:10.080 --> 20:16.800
less frequent then each individual garbage collection becomes longer so if you think about
20:16.800 --> 20:24.240
increasing garbage collection thresholds, be prepared that each individual time Emacs
20:24.240 --> 20:33.040
freezes, it will take longer. This is one caveat. When we talk about these agglomerated GCs, which
20:33.040 --> 20:40.160
are one after other like if you increase the threshold sufficiently then whatever happened
20:40.160 --> 20:46.880
that garbage collections were like done one after other we can now make it so that they are actually
20:46.880 --> 20:53.840
separated so like you don't see one giant freeze caused by like 10 gcs in a row instead you can
20:53.840 --> 21:00.880
make it so that they are separated and in statistics it's very clear that the number of
21:00.880 --> 21:06.560
agglomerated garbage collections decreases dramatically when you increase the thresholds
21:07.920 --> 21:11.600
it's particularly evident when we look into startup time
21:13.520 --> 21:19.680
if you look at gc duration during Emacs startup and if we look into what happens when you
21:19.680 --> 21:25.680
increase the thresholds it's very clear that Emacs startup becomes faster when you increase GC
21:25.680 --> 21:37.120
thresholds. So that's all for actual user statistics, and now let's try to arrive at some actual
21:37.120 --> 21:44.480
recommendations on what numbers to set and before we start let me explain a little bit about
21:44.480 --> 21:48.720
the difference between these two variables, which are gc-cons-threshold and gc-cons-percentage
21:49.440 --> 21:55.120
so if you think about Emacs memory like there's a certain memory allocated by Emacs
21:56.000 --> 22:00.000
and then as you run commands and keep using Emacs, more memory is allocated
22:01.360 --> 22:07.120
and Emacs decides when to do garbage collection according to these two variables, and actually what
22:07.120 --> 22:12.880
it does is choose the larger one. So say you are late in an Emacs session and have a lot of
22:12.880 --> 22:18.960
Emacs memory allocated; then you have gc-cons-percentage, which is a percentage of the already
22:18.960 --> 22:27.360
allocated memory and that percent is probably going to be the largest because you have more memory and
22:28.800 --> 22:34.480
more memory means that a percentage of it is larger, so you have a larger number
22:35.040 --> 22:41.680
caused by gc-cons-percentage. So in this scenario, when the Emacs session is
22:42.240 --> 22:46.880
already running for a long time and there is a lot of memory allocated you have
22:49.600 --> 22:54.240
gc-cons-percentage controlling the garbage collection, while early in Emacs, when there is not much
22:54.240 --> 23:00.240
memory allocated and Emacs is just starting up, gc-cons-threshold controls how frequently
23:00.240 --> 23:06.160
garbage collection happens because smaller allocated memory means its percentage will be a
23:06.160 --> 23:14.080
small number. So in terms of default values, gc-cons-threshold is 800 kilobytes
23:14.800 --> 23:24.080
and gc-cons-percentage is 0.1, that is 10 percent, so the gc-cons-percentage limit becomes larger than that threshold
23:24.080 --> 23:30.480
when you have more than eight megabytes of allocated memory by Emacs which is quite early
23:30.480 --> 23:37.040
and it will probably hold just during the startup, and once you start using your Emacs
23:37.040 --> 23:42.080
and once you load all the histories all the kinds of buffers it's probably going to take
23:42.080 --> 23:52.320
much more than eight megabytes. So now that we understand this, we can draw certain
23:52.320 --> 24:00.960
recommendations about tweaking the gc thresholds so first of all I need to emphasize that
24:01.760 --> 24:07.840
any time you increase gc threshold an individual garbage collection time increases so it's not
24:07.840 --> 24:12.320
free at all if you don't have problems with garbage collection which is half of the users
24:12.320 --> 24:19.360
don't have much problem you don't need to tweak anything only when gc is frequent and slow
24:19.360 --> 24:27.040
when Emacs is really, really freezing frequently, you may consider increasing GC thresholds
24:28.240 --> 24:35.040
and in particular I recommend increasing gc-cons-percentage because that's what mostly
24:35.040 --> 24:43.600
controls gc when Emacs is running for long session and the numbers are probably like
24:43.600 --> 24:48.640
yeah we can estimate the effect of these numbers like for example if you have a default value of
24:48.640 --> 24:54.720
0.1 for gc-cons-percentage, which is 10 percent, and then double it,
24:55.760 --> 25:02.880
obviously you get GCs half as frequently, but it will come at the cost of an extra 10 percent GC time,
25:02.880 --> 25:09.840
and if you increase it 10 times, you can think about 10x less frequent GCs but almost twice
25:09.840 --> 25:16.880
longer individual garbage collection time so probably you want to set the number closer to 0.1
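The "larger of the two limits" rule described above can be written out as a small calculation. This is a simplified model for illustration, not the actual C implementation; `my/gc-effective-threshold` is a hypothetical helper name.

```elisp
(defun my/gc-effective-threshold (heap-bytes &optional threshold percentage)
  "Approximate number of bytes Emacs allocates before the next GC.
HEAP-BYTES is the memory already allocated.  THRESHOLD and
PERCENTAGE default to the current `gc-cons-threshold' and
`gc-cons-percentage'.  The larger of the two limits applies."
  (max (or threshold gc-cons-threshold)
       (* (or percentage gc-cons-percentage) heap-bytes)))

;; With the default 800000 bytes and 0.1, the percentage takes over
;; once the heap exceeds 8000000 bytes (about 8 MB).
(my/gc-effective-threshold 4000000 800000 0.1)   ; small heap: the 800000-byte threshold applies
(my/gc-effective-threshold 50000000 800000 0.1)  ; large heap: 10% of the heap applies
```

In this model, doubling the percentage doubles the amount allocated between collections in a long-running session, which is why GCs become roughly half as frequent.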
25:19.520 --> 25:29.280
another part of the users may actually try to optimize Emacs startup time which is quite frequent
25:29.280 --> 25:37.200
problem. In this case it's probably better to increase gc-cons-threshold, but not too much. So
25:37.200 --> 25:42.640
first of all it makes sense to check whether garbage collection is a problem at all during
25:43.520 --> 25:48.160
startup, and there are two variables which can show what is happening with
25:49.120 --> 25:54.800
garbage collection. So gcs-done is a variable that shows how many garbage collections
25:57.520 --> 26:02.560
were triggered, that is, the number of garbage collections so far. When you check the value
26:02.560 --> 26:08.320
right after you start Emacs, you will see that number. The other is the gc-elapsed variable,
26:09.280 --> 26:15.440
which gives you a number of seconds which Emacs spent in doing garbage collection so this is
26:15.440 --> 26:20.800
probably the most important variable and if you see it's large then you may consider tweaking it
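A minimal sketch of checking both variables right after startup, assuming it is placed in the init file; the message wording is arbitrary.

```elisp
;; After startup finishes, report how many GCs ran so far and how
;; many seconds Emacs spent in them in total.
(add-hook 'emacs-startup-hook
          (lambda ()
            (message "Startup GCs: %d, time spent in GC: %.2f s"
                     gcs-done gc-elapsed)))
```

If the reported time is a small fraction of a second, GC is probably not what makes startup slow.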
26:20.800 --> 26:30.000
for the Emacs startup we can estimate some bounds because in the statistics I never saw anything
26:30.000 --> 26:35.600
that is more than 10 seconds extra, and even 10 seconds is probably a really, really hard
26:35.600 --> 26:45.280
upper bound so or say if you want to decrease the gc contribution like order of magnitude
26:45.920 --> 26:52.080
or like two orders of magnitudes let's say like as a really hard top estimate then it
26:52.080 --> 27:00.080
corresponds to an 80-megabyte gc-cons-threshold, and probably much less, so there's no point
27:00.080 --> 27:06.880
setting it to a few hundred megabytes of course there's one caveat which is important to keep in
27:06.880 --> 27:16.800
mind though that increasing the gc thresholds is not just increasing individual gc time
27:16.800 --> 27:23.600
there's also an actual real impact on the RAM usage so like if you increase gc threshold
27:24.400 --> 27:29.600
it increases the RAM usage of Emacs and you shouldn't think that like okay I increased
27:30.480 --> 27:37.200
the threshold by like 100 megabytes then 100 megabytes extra RAM usage doesn't matter
27:37.200 --> 27:44.480
it's not 100 megabytes because less frequent garbage collection means it will lead to
27:44.480 --> 27:51.680
memory fragmentation so in practice if you increase the thresholds to tens or hundreds
27:51.680 --> 27:58.240
of megabytes we are talking about gigabytes extra RAM usage for me personally when I tried to
27:58.240 --> 28:05.200
play with gc thresholds I have seen Emacs taking two gigabytes like compared to several times less
28:05.760 --> 28:12.240
when with default settings so it's not free at all and only like either when you have a lot of
28:12.240 --> 28:19.440
free RAM and you don't care or when your Emacs is really slow then you may need to consider this
28:19.440 --> 28:24.160
tweaking of the defaults. So again, don't tweak the defaults if you don't really have a problem,
28:24.800 --> 28:31.360
and of course this RAM problem is a big big deal for Emacs devs because from
28:32.960 --> 28:38.400
the point of view of a single user, you most likely have a normal laptop or a normal PC with a lot of
28:38.400 --> 28:45.760
RAM you don't care about these things too much but Emacs in general can run on like
28:46.320 --> 28:53.200
all kinds of machines including low-end machines with very limited RAM and anytime Emacs developers
28:53.280 --> 29:00.320
consider increasing the defaults for garbage collection it's like they always have to consider
29:00.320 --> 29:06.800
if you increase them too much then Emacs may just stop running on certain platforms
29:09.840 --> 29:15.600
so that's a very big consideration in terms of the global defaults for everyone
29:16.320 --> 29:24.560
although I would say that it might be relatively safe to increase gc-cons-threshold
29:24.560 --> 29:29.600
because it mostly affects startup and during startup it's probably not the peak usage of
29:30.560 --> 29:38.160
Emacs and like as Emacs runs for longer it's probably where most of RAM will be used later
29:38.720 --> 29:43.920
on the other hand gc-cons-percentage is much more debatable because it has pros and cons
29:43.920 --> 29:48.880
it will increase the RAM usage it will increase the individual GC time so
29:50.240 --> 29:56.560
if we consider changing it, it's much more tricky and we have to discuss and probably measure the impact
29:56.560 --> 30:06.080
on users. And a final note from the point of view of Emacs development is that
30:06.480 --> 30:11.440
this simple mark-and-sweep algorithm is like a very old and not the state-of-the-art algorithm
30:13.040 --> 30:16.960
there are variants of garbage collection that are like totally non-blocking
30:18.000 --> 30:22.720
so Emacs just doesn't have to freeze during the garbage collection or there are variants
30:22.720 --> 30:27.440
of garbage collection algorithm that do not scan all the memory just fraction of it
30:28.640 --> 30:35.520
and scan another fraction less frequently so there are actually ways just to
30:36.480 --> 30:39.680
change the garbage collection algorithm to make things much faster
30:40.400 --> 30:47.280
of course, compared to just changing the values of variables, changing the algorithm
30:47.280 --> 30:52.000
is much more tricky and someone has to implement it. Obviously it would be nice if someone implements
30:52.000 --> 30:58.720
it, but so far it's not happening. So yeah, it would be nice, but maybe not so quickly;
30:59.600 --> 31:02.080
there is more chance to change the defaults here
31:02.240 --> 31:05.680
to conclude let me reiterate the most important points
31:06.640 --> 31:12.400
so from point of view of users you need to understand that yes garbage collection may be
31:12.400 --> 31:20.480
a problem but not for everyone so like you should only think about changing the variables when you
31:20.480 --> 31:28.240
really know that garbage collection is the problem for you so if you have slow Emacs startup
31:28.400 --> 31:34.000
and you know that it's caused by garbage collection, which you can check with the
31:34.000 --> 31:41.520
gc-elapsed variable, then you may increase gc-cons-threshold to a few tens of megabytes,
31:41.520 --> 31:48.160
not more it doesn't make sense to increase it much more and if you really have major problems
31:48.160 --> 31:56.080
with Emacs being laggy, then you can increase gc-cons-percentage to 0.2 or 0.3, maybe;
31:56.080 --> 32:02.640
1 is probably overkill. But do watch your Emacs RAM usage; it may be really impacted
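As a hedged configuration sketch of these user-level recommendations: the exact numbers below (20 MB and 0.2) are illustrative picks within the ranges mentioned, not values prescribed by the talk, and they only make sense if `gc-elapsed' shows that GC is actually a problem.

```elisp
;; Example values only; measure before and after with `gc-elapsed'
;; and keep an eye on the RAM usage of Emacs.
;; A few tens of megabytes for the threshold (here 20 MB) mainly
;; helps during startup.
(setq gc-cons-threshold (* 20 1000 1000))
;; A modest increase over the default 0.1 mainly affects long sessions.
(setq gc-cons-percentage 0.2)
```

Placing these lines near the top of the init file lets the larger threshold take effect while the rest of the configuration loads.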
32:04.160 --> 32:12.400
for Emacs developers I'd like to emphasize that there is a real problem with garbage collection
32:12.400 --> 32:22.720
and nine percent of all the garbage collection data points we have correspond to really slow
32:22.720 --> 32:27.920
noticeable Emacs freezes that are also really frequent, less than 10 seconds apart
32:30.000 --> 32:35.120
I'd say that it's really worth increasing gc-cons-threshold at least during startup
32:36.400 --> 32:41.440
because it really impacts the Emacs startup time making Emacs startup much faster
32:42.400 --> 32:48.560
ideally we need to reimplement the garbage collection algorithm of course it's not easy
32:48.560 --> 32:56.880
but it would be really nice. As for the gc-cons-percentage default, it's hard to say; we may
32:56.880 --> 33:03.040
consider changing it, but it's up for discussion and we probably need to be conservative here
33:04.320 --> 33:11.280
so we have come to the end of my talk. This presentation and all the data will be available
33:11.280 --> 33:21.760
publicly, and you can reproduce all the statistics graphs if you wish. Thank you for your attention.