author | EmacsConf <emacsconf-org@gnu.org> | 2023-12-03 13:33:54 -0500 |
---|---|---|
committer | EmacsConf <emacsconf-org@gnu.org> | 2023-12-03 13:33:54 -0500 |
commit | f09328acb5ffee2db8db2db933e70a9095936aec (patch) | |
tree | 01f7ac83359f2df94705a76bc04ebbf3cd2d39fd /2023/captions | |
parent | d81b7fb71bb6b1050e05a45490bf70c7c13363e5 (diff) | |
Automated commit
Diffstat (limited to '')
-rw-r--r-- | 2023/captions/emacsconf-2023-gc--emacsgcstats-does-garbage-collection-actually-slow-down-emacs--ihor-radchenko--main.vtt | 848 |
1 files changed, 848 insertions, 0 deletions
diff --git a/2023/captions/emacsconf-2023-gc--emacsgcstats-does-garbage-collection-actually-slow-down-emacs--ihor-radchenko--main.vtt b/2023/captions/emacsconf-2023-gc--emacsgcstats-does-garbage-collection-actually-slow-down-emacs--ihor-radchenko--main.vtt new file mode 100644 index 00000000..c1bea8de --- /dev/null +++ b/2023/captions/emacsconf-2023-gc--emacsgcstats-does-garbage-collection-actually-slow-down-emacs--ihor-radchenko--main.vtt @@ -0,0 +1,848 @@ +WEBVTT + +00:00.000 --> 00:06.480 +Hello everyone, my name is Ihor Radchenko and you may know me from the Org mailing list. + +00:07.440 --> 00:11.760 +However, today I'm not going to talk about Org Mode. Today I'm going to talk about + +00:11.760 --> 00:16.800 +Emacs performance and how it's affected by its memory management code. + +00:18.880 --> 00:24.720 +First, I will introduce the basic concepts of Emacs memory management and what garbage + +00:24.720 --> 00:32.320 +collection is. Then I will show you user statistics collected from volunteer users + +00:32.320 --> 00:42.080 +over the last half year and I will end with some guidelines on how to tweak Emacs garbage + +00:42.080 --> 00:48.640 +collection customizations to optimize Emacs performance and when it's necessary or not + +00:49.120 --> 00:56.560 +to do. Let's begin. What is garbage collection? To understand what garbage collection is we need + +00:56.560 --> 01:01.920 +to realize that anything you do in Emacs is some kind of command and any command is most likely + +01:01.920 --> 01:07.280 +running some Elisp code and every time you run Elisp code you most likely need to allocate certain + +01:07.280 --> 01:14.160 +memory in RAM and some of this memory is retained for a long time and some of this memory is + +01:14.160 --> 01:20.320 +transient. Of course, Emacs has to clear this transient memory from time to time to not occupy + +01:20.320 --> 01:27.200 +all the possible RAM in the computer. 
In this small example we have one global variable + +01:28.480 --> 01:35.600 +that is assigned a value but when assigning the value we first allocate a temporary variable + +01:35.600 --> 01:41.360 +and then a temporary list and only retain some part of this list in this global variable. + +01:42.240 --> 01:51.920 +In terms of memory graph we can represent this as two variable slots, one transient, one permanent + +01:52.480 --> 02:01.680 +and then a list of three conses, part of which is retained as a global variable but part + +02:01.680 --> 02:07.280 +of it which is a temporary variable symbol and the first cons of the list is not used and it + +02:07.840 --> 02:15.040 +might be cleared at some point. So that's what Emacs does. Every now and then Emacs goes through + +02:15.040 --> 02:20.320 +all the memory and identifies which parts of the memory are not used and then clears them so that + +02:20.320 --> 02:27.760 +it can free up the RAM. This process is called garbage collection and Emacs uses a very simple + +02:27.760 --> 02:33.440 +and old algorithm which is called mark and sweep. So this mark and sweep process + +02:33.440 --> 02:40.880 +is basically two stages. First Emacs scans all the memory that is allocated and then identifies + +02:40.880 --> 02:46.320 +which memory is still in use which is linked to some variables for example and which memory is + +02:46.320 --> 02:51.600 +not used anymore even though it was allocated in the past and the second stage is to clear + +02:51.600 --> 02:56.240 +whatever memory is no longer in use. During the process + +02:56.880 --> 03:03.920 +Emacs cannot do anything else. 
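The two stages described above can be illustrated with a toy model. This is only a sketch in Python, not how Emacs implements its collector (which is written in C and handles many more object types); the `Cons` class and the `mark` and `collect` helpers are hypothetical names made up for the illustration. It models the three-cons list from the example, where only the tail of the list stays reachable from a global variable.

```python
# Toy mark-and-sweep over a heap of cons cells (illustrative only).

class Cons:
    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr
        self.marked = False


def mark(obj):
    """Stage 1: mark every cons reachable from obj by following cdr links."""
    while isinstance(obj, Cons) and not obj.marked:
        obj.marked = True
        obj = obj.cdr


def collect(heap, roots):
    """Mark from the roots, then sweep (drop) every unmarked cell."""
    for root in roots:
        mark(root)
    survivors = [cell for cell in heap if cell.marked]
    for cell in survivors:
        cell.marked = False  # reset the mark bit for the next collection
    return survivors


# A three-cons list (1 2 3); only its tail (2 3) is kept by a global variable.
c3 = Cons(3)
c2 = Cons(2, c3)
c1 = Cons(1, c2)
heap = [c1, c2, c3]
global_var = c2  # the permanent slot; the temporary variable is gone

heap = collect(heap, roots=[global_var])
print([cell.car for cell in heap])  # the first cons has been swept
```

Note that the whole heap is visited on every collection, which is why a single collection takes longer as more memory is in use.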
So basically every time Emacs scans the memory it freezes up and + +03:03.920 --> 03:09.840 +doesn't respond to anything and if it takes too much time so that users can notice it then of + +03:09.840 --> 03:18.160 +course Emacs is not responsive at all and if this garbage collection is triggered too frequently + +03:18.160 --> 03:23.760 +then it's not just not responsive every now and then it's also not responsive all the time almost + +03:24.000 --> 03:29.840 +all the time so you cannot even normally type or stuff or do some normal commands. + +03:32.320 --> 03:40.080 +This mark and sweep algorithm is taking longer the more memory Emacs uses. So basically + +03:40.080 --> 03:46.480 +the more buffers you open, the more packages you load, the more complex commands you run, + +03:46.480 --> 03:55.840 +the more memory is used and basically the longer Emacs takes to perform a single garbage collection. + +04:00.560 --> 04:07.280 +Of course Emacs being Emacs, this garbage collection can be tweaked. In particular + +04:07.280 --> 04:12.960 +users can tweak how frequently Emacs does garbage collection using two basic variables + +04:12.960 --> 04:19.840 +gc-cons-threshold and gc-cons-percentage. gc-cons-threshold is the raw number of bytes + +04:21.440 --> 04:27.200 +Emacs needs to allocate before triggering another garbage collection and the gc-cons-percentage + +04:27.200 --> 04:31.680 +is similar but it's defined in terms of fraction of already allocated memory. + +04:33.840 --> 04:41.840 +If you follow various Emacs forums you may be familiar with people complaining about + +04:41.840 --> 04:47.760 +garbage collection and there are many many suggestions about what to do with it. 
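How the two variables interact can be sketched numerically. The Python snippet below is an illustration under stated assumptions, not Emacs code: `next_gc_allowance` is a hypothetical helper, the defaults of 800 kilobytes and 0.1 are the values quoted later in this talk, and the rule that the larger of the two limits wins is also taken from the talk.

```python
import math


def next_gc_allowance(heap_bytes, gc_cons_threshold=800_000, gc_cons_percentage=0.1):
    """Bytes that may be allocated before the next GC: the larger of the two limits."""
    return max(gc_cons_threshold, gc_cons_percentage * heap_bytes)


# Early in a session (small heap) the fixed threshold dominates.
small_heap = next_gc_allowance(2_000_000)

# Late in a session (large heap) the percentage dominates.
large_heap = next_gc_allowance(100_000_000)

# Heap size at which the percentage overtakes the default threshold.
crossover = 800_000 / 0.1

print(small_heap, large_heap, crossover)
```

With these defaults the crossover sits at roughly 8 megabytes of allocated memory, so in a long-running session it is mostly the percentage that decides how often collection happens.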
+ +04:50.320 --> 04:52.640 +Most frequently you see gc-cons-threshold + +04:54.640 --> 05:01.280 +recommended to be increased and a number of pre-packaged Emacs distributions like + +05:01.280 --> 05:07.280 +Doom Emacs do increase it or like I have seen suggestions which are actually horrible to + +05:07.280 --> 05:11.120 +disable garbage collection temporarily or for a long time. + +05:14.240 --> 05:19.600 +Which is nice you can see it quite frequently which indicates there might be some problem. + +05:19.600 --> 05:26.320 +However every time like one user posts about this problem it's just one data point and it doesn't + +05:26.320 --> 05:30.000 +mean that everyone actually suffers from it. It doesn't mean that everyone should do it. + +05:30.720 --> 05:37.680 +So in order to understand if this garbage collection is really a problem which is a + +05:37.680 --> 05:48.000 +common problem we do need some kind of statistics and only using the actual statistics we can + +05:48.000 --> 05:54.880 +understand if it should be recommended for everyone to tweak the defaults or like whether + +05:54.880 --> 06:00.000 +it should be recommended for certain users or maybe it should be asked Emacs devs to do + +06:00.000 --> 06:08.800 +something about the defaults. And what I did some time ago is exactly this. I tried to collect the + +06:08.800 --> 06:18.000 +user statistics. So I wrote a small package on ELPA and some users installed this package and + +06:18.000 --> 06:24.080 +then reported back these statistics of the garbage collection for their particular use. + +06:25.360 --> 06:33.840 +By now we have obtained 129 user submissions with over 1 million GC records in there. + +06:35.760 --> 06:42.320 +So like some of these submissions used default GC settings without any customizations. + +06:42.320 --> 06:47.040 +Some used increased gc-cons-threshold and gc-cons-percentage. 
+ +06:48.880 --> 06:56.640 +So using this data we can try to draw some reliable conclusions on what should be done + +06:56.640 --> 07:02.480 +and whether should anything be done about garbage collection on Emacs dev level or at least on user + +07:02.480 --> 07:08.240 +level. Of course we need to keep in mind that there's some kind of bias because it's more + +07:08.240 --> 07:13.680 +likely that users already have problems with GC or they think they have problems with GC will report + +07:14.480 --> 07:20.240 +and submit the data. But anyway having statistics is much more useful than just + +07:20.240 --> 07:28.240 +having anecdotal evidence from one or other Reddit posts. And just one thing I will do + +07:28.880 --> 07:33.280 +during the rest of my presentation is that for all the statistics I will normalize + +07:33.520 --> 07:41.440 +user data so that every user contributes equally. For example if one user submits like 100 hours + +07:41.440 --> 07:46.640 +Emacs uptime statistics and other users submit one hour Emacs uptime then I will + +07:47.200 --> 07:49.520 +anyway make it so that they contribute equally. + +07:53.280 --> 07:59.280 +Let's start from one of the most obvious things we can look into, which is the time it takes + +07:59.360 --> 08:05.520 +for a single garbage collection process. Here you see + +08:08.240 --> 08:16.240 +frequency distribution of GC duration for all the 129 users we got and + +08:17.600 --> 08:26.800 +you can see that most of the garbage collections are done quite quickly in less than 0.1 second + +08:27.440 --> 08:33.680 +and less than 0.1 second is usually just not noticeable. So even though there is garbage + +08:33.680 --> 08:43.200 +collection it will not interrupt the work in Emacs. However there is a fraction of users who + +08:43.920 --> 08:49.680 +experience garbage collection that takes like 0.2, 0.3 or even half a second which will be quite + +08:49.680 --> 08:58.800 +noticeable. 
For the purposes of this study I will consider that anything that is less than 0.1 + +08:58.800 --> 09:06.000 +second is insignificant so like you will not notice it and it's like obviously all the Emacs + +09:06.000 --> 09:13.600 +usage will be just normal. But if it's more than 0.1 or 0.2 seconds then it will be very noticeable + +09:13.600 --> 09:20.800 +and you will see that Emacs hangs for a little while or not so little while. In terms of numbers + +09:21.360 --> 09:28.000 +it's better to plot the statistics not as a distribution but as a cumulative distribution. + +09:29.040 --> 09:34.080 +So like at every point of this graph you'll see like for example here 0.4 seconds + +09:34.480 --> 09:49.040 +you have this percent of like almost 90% of users have no more than 0.4 seconds GC duration. So like + +09:49.040 --> 09:55.760 +we can look here if we take one critical GC duration which is 0.1 second + +09:55.840 --> 10:02.400 +and look at how many users have it so we have 56% which is like + +10:03.600 --> 10:12.880 +44% users have less than 0.1 second GC duration and the rest 56% have more than 0.1 second. + +10:13.600 --> 10:20.720 +So you can see like more than half of users actually have noticeable GC delay so the + +10:20.720 --> 10:27.040 +Emacs freezes for some noticeable time and a quarter of users actually have very noticeable + +10:27.040 --> 10:36.640 +so like Emacs freezes such that you see an actual delay that Emacs actually has + +10:37.760 --> 10:47.600 +which is quite significant and important point. But apart from the duration of each individual GC + +10:47.600 --> 10:52.640 +it is important to see how frequent it is because even if you do notice a delay + +10:53.440 --> 10:59.120 +even a few seconds delay it doesn't matter if it happens once during the whole Emacs session. 
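Reading a number off a cumulative distribution, as done above for the 0.1 second cut-off, amounts to counting the share of values at or below a limit. A minimal Python sketch follows; `fraction_at_most` is a hypothetical helper and the durations are made up, so the resulting share is not the 44% from the talk.

```python
def fraction_at_most(durations, limit):
    """Empirical cumulative distribution: share of values that are <= limit."""
    return sum(1 for d in durations if d <= limit) / len(durations)


# Made-up typical GC durations (seconds), one per user.
durations = [0.03, 0.05, 0.08, 0.12, 0.15, 0.2, 0.25, 0.3, 0.45, 0.6]

below = fraction_at_most(durations, 0.1)  # GC not noticeable
above = 1 - below                         # GC noticeable
print(below, above)
```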
+ +11:01.360 --> 11:10.720 +So if you look into frequency distribution again here I plot time between + +11:11.680 --> 11:17.760 +subsequent garbage collections versus how frequent it is and we have very clear trend that + +11:18.560 --> 11:24.560 +most of the garbage collections are quite frequent like we talk about every few seconds a few tens + +11:24.560 --> 11:32.560 +of seconds. There's a few outliers which are at very round numbers like 60 seconds, 120 seconds, + +11:32.560 --> 11:40.640 +300 seconds. These are usually timers so like you have something running on timer and then it + +11:41.440 --> 11:48.000 +is complex command and it triggers garbage collection but it's not the majority. + +11:49.280 --> 11:54.000 +Again to run the numbers it's better to look into cumulative distribution and see that + +11:54.000 --> 11:58.160 +50% of garbage collections are basically less than 10 seconds apart. + +12:00.000 --> 12:07.920 +And we can combine it with previous data and we look into whatever garbage collection takes + +12:07.920 --> 12:12.960 +less than 10 seconds from each other and also takes more than say 0.1 seconds. + +12:13.680 --> 12:20.800 +So and then we see that one quarter of all garbage collections are just noticeable and also frequent + +12:21.760 --> 12:27.840 +and 9% are like more than 0.2 seconds, very noticeable and also frequent. So basically + +12:27.840 --> 12:34.480 +it constitutes Emacs freezing. So 9% of all the garbage collections are Emacs freezing. Of course + +12:35.360 --> 12:42.960 +if you remember there is a bias but 9% is quite significant number. So garbage collection can + +12:42.960 --> 12:47.280 +really slow down things not for everyone but for significant fraction of users. + +12:49.440 --> 12:57.440 +Another thing I'd like to look into is what I call agglomerated GCs. What I mean by agglomerated + +12:57.440 --> 13:02.720 +is when you have one garbage collection and then another garbage collection immediately after it. 
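Grouping back-to-back collections into agglomerated ones can be sketched as below. This is an illustrative Python sketch only: `agglomerate` is a hypothetical helper, the events are invented, the one-second cut-off is the one used in the talk, and measuring the gap from the end of the previous collection to the start of the next is an assumption.

```python
def agglomerate(gc_events, max_gap=1.0):
    """gc_events: list of (start_time, duration) in seconds, sorted by start_time.

    Consecutive GCs are put in the same group when the next one starts no more
    than max_gap seconds after the previous one ended.
    """
    groups = []
    prev_end = None
    for start, duration in gc_events:
        if prev_end is not None and start - prev_end <= max_gap:
            groups[-1].append((start, duration))
        else:
            groups.append([(start, duration)])
        prev_end = start + duration
    return groups


# Invented timeline: three GCs in quick succession, one isolated, then two close together.
events = [(0.0, 0.1), (0.5, 0.1), (1.2, 0.1), (30.0, 0.2), (90.0, 0.1), (90.3, 0.1)]
groups = agglomerate(events)
sizes = [len(g) for g in groups]
total_pause = [sum(d for _, d in g) for g in groups]
print(sizes, total_pause)
```

The per-group pause total is what a user perceives: several short collections in a row feel like one long freeze.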
So + +13:03.680 --> 13:09.840 +in terms of numbers I took every subsequent garbage collection which is either immediately + +13:09.840 --> 13:16.000 +after or no more than one second after each. So from point of view of users is like + +13:16.960 --> 13:22.880 +multiple garbage collections, they add up together into one giant garbage collection. + +13:23.440 --> 13:29.440 +And if you look into numbers of how many agglomerated garbage collections there are + +13:29.440 --> 13:35.360 +you can see even numbers over 100. So 100 garbage collections going one after another. + +13:36.720 --> 13:42.560 +Even if you think about each garbage collection taking 0.1 second we look into 100 of them + +13:43.280 --> 13:50.480 +it's total 10 seconds. It's like Emacs hanging forever or like a significant number is also 10. + +13:50.480 --> 13:58.160 +So again this would be very annoying to meet such a thing. How frequently does it happen? Again we + +13:58.160 --> 14:04.400 +can plot cumulative distribution and we see that 20 percent like 19 percent of all the garbage + +14:04.400 --> 14:13.680 +collections are at least two together and 8 percent like more than 10. So like you think about oh + +14:13.680 --> 14:17.840 +each garbage collection is not taking much time but when you have 10 of them yeah that becomes a + +14:17.840 --> 14:32.560 +problem. Another thing is to answer a question that some people complain about is that + +14:33.680 --> 14:42.320 +the longer you use Emacs the slower Emacs becomes. Of course it may be caused by garbage collection and + +14:42.720 --> 14:50.000 +I wanted to look into how garbage collection time and other statistics, other parameters + +14:50.880 --> 14:58.880 +are evolving over time. And what I can see here is a cumulative distribution of GC duration + +14:59.680 --> 15:04.720 +for like first 10 minutes of Emacs uptime first 100 minutes first 1000 minutes. 
+ +15:05.520 --> 15:13.840 +And if you look closer then you see that each individual garbage collection on average + +15:15.440 --> 15:24.000 +takes longer as you use Emacs longer. However this longer is not much it's like maybe 10 percent + +15:24.000 --> 15:33.040 +like basically garbage collection gets like slow Emacs down more as you use Emacs more + +15:33.680 --> 15:40.320 +but not much. So basically if you do see Emacs being slower and slower over time + +15:40.960 --> 15:46.960 +it's probably not really garbage collection because it doesn't change too much. And if you + +15:46.960 --> 15:52.720 +look into time between individual garbage collections then you see that the time actually + +15:52.720 --> 15:58.880 +increases as you use Emacs longer which makes sense because initially like first few minutes + +15:58.880 --> 16:04.720 +you have all kind of packages loading like all the port loading and then later everything is + +16:04.720 --> 16:12.560 +loaded and things become more stable. So the conclusion on this part is that + +16:13.520 --> 16:18.480 +if Emacs becomes slower in a long session it's probably not caused by garbage collection. 
+ +16:20.320 --> 16:27.760 +And one word of warning of course is that it's all nice and all when I present the statistics + +16:27.760 --> 16:32.800 +but it's only an average and if you are an actual user like here is one example + +16:34.080 --> 16:39.920 +which shows a total garbage collection time like accumulated together over Emacs uptime + +16:40.880 --> 16:45.360 +and you see different lines which correspond to different sessions of one user + +16:46.800 --> 16:51.360 +and you see they are wildly different like one time there is almost no garbage collection + +16:52.240 --> 16:57.840 +another time you see garbage collection because probably Emacs is used more early or like + +16:57.840 --> 17:04.560 +different pattern of usage and even during a single Emacs session you see a different slope + +17:04.560 --> 17:10.560 +of this curve which means that sometimes garbage collection is infrequent and sometimes it's much + +17:10.560 --> 17:16.000 +more frequent so it's probably much more noticeable one time and less noticeable other time. + +17:16.000 --> 17:23.360 +So if you think about these statistics of course they only represent an average usage + +17:23.360 --> 17:26.240 +but sometimes it can get worse sometimes it can get better. + +17:30.320 --> 17:35.600 +The last parameter I'd like to talk about is garbage collection during Emacs init. + +17:36.960 --> 17:42.320 +Basically if you think about what happens during Emacs init like when Emacs just starting up + +17:42.320 --> 17:46.720 +then whatever garbage collection there it's one or it's several times + +17:46.720 --> 17:50.640 +it all contributes to Emacs taking longer to start. 
+ +17:53.200 --> 18:00.640 +And again we can look into the statistic and see what is the total GC duration after Emacs init + +18:01.840 --> 18:10.240 +and we see that 50% of all the submissions garbage collection adds up more than one second + +18:10.240 --> 18:17.760 +to Emacs init time and for 20% of users it's extra three seconds Emacs start time which is + +18:17.760 --> 18:22.640 +very significant especially for people who are used to Vim which can start in like a fraction + +18:22.640 --> 18:27.200 +of a second and here it just does garbage collection because garbage collection is + +18:27.200 --> 18:31.760 +not everything Emacs does during startup adds up more to the load. + +18:33.680 --> 18:39.280 +Okay that's all nice and all but what can we do about these statistics can we draw any + +18:39.280 --> 18:46.000 +conclusions and the answer is of course like the most important conclusion here is that + +18:46.720 --> 18:52.320 +yes garbage collection can slow down Emacs at least for some people and what to do about it + +18:53.360 --> 18:58.720 +there are two variables which you can tweak it's gc-cons-threshold and gc-cons-percentage + +18:58.720 --> 19:06.400 +and having the statistics I can at least look a little bit into what is the effect of + +19:06.400 --> 19:12.400 +increasing these variables like most people just increase gc-cons-threshold + +19:13.760 --> 19:17.040 +and like all the submissions people did increase and + +19:17.680 --> 19:20.880 +doesn't make much sense to decrease it like to make things worse + +19:24.560 --> 19:31.280 +of course for these statistics the exact values of these increased thresholds + +19:31.680 --> 19:36.320 +are not always the same but at least we can look into some trends + +19:38.640 --> 19:48.480 +so first and obvious thing we can observe is when we compare the standard gc settings + +19:49.120 --> 19:57.680 +standard thresholds and increased thresholds for time between subsequent gcs and as one may expect + 
+19:57.680 --> 20:03.440 +if you increase the threshold Emacs will do garbage collection less frequently so the spacing + +20:03.440 --> 20:10.080 +between garbage collection increases okay the only thing is that if garbage collection is + +20:10.080 --> 20:16.800 +less frequent then each individual garbage collection becomes longer so if you think about + +20:16.800 --> 20:24.240 +increasing garbage collection thresholds be prepared that in each individual time Emacs + +20:24.240 --> 20:33.040 +freezes will take longer this is one caveat when we talk about this agglomerated gcs which + +20:33.040 --> 20:40.160 +are one after other like if you increase the threshold sufficiently then whatever happened + +20:40.160 --> 20:46.880 +that garbage collections were like done one after other we can now make it so that they are actually + +20:46.880 --> 20:53.840 +separated so like you don't see one giant freeze caused by like 10 gcs in a row instead you can + +20:53.840 --> 21:00.880 +make it so that they are separated and in statistics it's very clear that the number of + +21:00.880 --> 21:06.560 +agglomerated garbage collections decreases dramatically when you increase the thresholds + +21:07.920 --> 21:11.600 +it's particularly evident when we look into startup time + +21:13.520 --> 21:19.680 +if you look at gc duration during Emacs startup and if we look into what happens when you + +21:19.680 --> 21:25.680 +increase the thresholds it's very clear that Emacs startup becomes faster when you increase gc + +21:25.680 --> 21:37.120 +thresholds so that's all for actual user statistics and now let's try to run into some like actual + +21:37.120 --> 21:44.480 +recommendations on what numbers to set and before we start let me explain a little bit about + +21:44.480 --> 21:48.720 +the difference between these two variables which is gc-cons-threshold and gc-cons-percentage + +21:49.440 --> 21:55.120 +so if you think about Emacs memory like there's a certain memory 
allocated by Emacs + +21:56.000 --> 22:00.000 +and then as you run commands and turn using Emacs there is more memory allocated + +22:01.360 --> 22:07.120 +and Emacs decides when to do garbage collection according to these two variables and actually what + +22:07.120 --> 22:12.880 +it does it chooses the larger one so say you are late in Emacs session you have a lot of + +22:12.880 --> 22:18.960 +Emacs memory allocated then you have gc-cons-percentage which is percent of the already + +22:18.960 --> 22:27.360 +allocated memory and that percent is probably going to be the largest because you have more memory and + +22:28.800 --> 22:34.480 +memory means that percent of it is larger so like you have a larger number + +22:35.040 --> 22:41.680 +caused by gc-cons-percentage so in this scenario when Emacs session is + +22:42.240 --> 22:46.880 +already running for a long time and there is a lot of memory allocated you have + +22:49.600 --> 22:54.240 +gc-cons-percentage controlling the garbage collection while early in Emacs there is not much + +22:54.240 --> 23:00.240 +memory placed Emacs just starting up then gc-cons-threshold is controlling how frequently + +23:00.240 --> 23:06.160 +garbage collection happens because smaller allocated memory means its percentage will be a + +23:06.160 --> 23:14.080 +small number so in terms of default values at least gc-cons-threshold is 800 kilobytes + +23:14.800 --> 23:24.080 +and gc-cons-percentage is 10 percent so gc-cons-percentage becomes larger than that threshold + +23:24.080 --> 23:30.480 +when you have more than eight megabytes of allocated memory by Emacs which is quite early + +23:30.480 --> 23:37.040 +and it will probably hold just during the startup and once you start using your Emacs + +23:37.040 --> 23:42.080 +and once you load all the histories all the kinds of buffers it's probably going to take + +23:42.080 --> 23:52.320 +more than much more than eight megabytes so now we understand this we 
can draw certain + +23:52.320 --> 24:00.960 +recommendations about tweaking the gc thresholds so first of all I need to emphasize that + +24:01.760 --> 24:07.840 +any time you increase gc threshold an individual garbage collection time increases so it's not + +24:07.840 --> 24:12.320 +free at all if you don't have problems with garbage collection which is half of the users + +24:12.320 --> 24:19.360 +don't have much problem you don't need to tweak anything only when gc is frequent and slow + +24:19.360 --> 24:27.040 +when Emacs is really really freezing frequently you may consider increasing gc thresholds only + +24:28.240 --> 24:35.040 +and in particular I recommend increasing gc-cons-percentage because that's what mostly + +24:35.040 --> 24:43.600 +controls gc when Emacs is running for long session and the numbers are probably like + +24:43.600 --> 24:48.640 +yeah we can estimate the effect of these numbers like for example if you have a default value of + +24:48.640 --> 24:54.720 +0.1 for gc-cons-percentage 0.1 which is 10 percent and then increase it twice + +24:55.760 --> 25:02.880 +obviously you get twice less frequent gcs but it will come at the cost of extra 10 percent gc time + +25:02.880 --> 25:09.840 +and if you increase 10 times you can think about 10x less frequent gcs but almost twice + +25:09.840 --> 25:16.880 +longer individual garbage collection time so probably you want to set the number closer to 0.1 + +25:19.520 --> 25:29.280 +another part of the users may actually try to optimize Emacs startup time which is quite frequent + +25:29.280 --> 25:37.200 +problem in this case it's probably better to increase gc-cons-threshold but not too much so like + +25:37.200 --> 25:42.640 +first of all it makes sense to check whether garbage collection is a problem at all during + +25:43.520 --> 25:48.160 +startup and there are two variables which can show what is happening + +25:49.120 --> 25:54.800 +this garbage collection so gcs-done is a 
variable that shows how many garbage collections + +25:57.520 --> 26:02.560 +like what is the number of garbage collections triggered like when you check the value or + +26:02.560 --> 26:08.320 +right after you start Emacs you will see that number and gc-elapsed variable + +26:09.280 --> 26:15.440 +which gives you a number of seconds which Emacs spent in doing garbage collection so this is + +26:15.440 --> 26:20.800 +probably the most important variable and if you see it's large then you may consider tweaking it + +26:20.800 --> 26:30.000 +for the Emacs startup we can estimate some bounds because in the statistics I never saw anything + +26:30.000 --> 26:35.600 +that is more than 10 seconds extra which even 10 seconds is probably like a really really hard + +26:35.600 --> 26:45.280 +upper bound so or say if you want to decrease the gc contribution like order of magnitude + +26:45.920 --> 26:52.080 +or like two orders of magnitudes let's say like as a really hard top estimate then it + +26:52.080 --> 27:00.080 +corresponds to 80 megabytes gc-cons-threshold and probably much less so like there's no point + +27:00.080 --> 27:06.880 +setting it to a few hundred megabytes of course there's one caveat which is important to keep in + +27:06.880 --> 27:16.800 +mind though that increasing the gc thresholds is not just increasing individual gc time + +27:16.800 --> 27:23.600 +there's also an actual real impact on the RAM usage so like if you increase gc threshold + +27:24.400 --> 27:29.600 +it increases the RAM usage of Emacs and you shouldn't think that like okay I increased + +27:30.480 --> 27:37.200 +the threshold by like 100 megabytes then 100 megabytes extra RAM usage doesn't matter + +27:37.200 --> 27:44.480 +it's not 100 megabytes because less frequent garbage collection means it will lead to + +27:44.480 --> 27:51.680 +memory fragmentation so in practice if you increase the thresholds to tens or hundreds + +27:51.680 --> 27:58.240 +of megabytes we are talking about 
gigabytes extra RAM usage for me personally when I tried to + +27:58.240 --> 28:05.200 +play with gc thresholds I have seen Emacs taking two gigabytes like compared to several times less + +28:05.760 --> 28:12.240 +when with default settings so it's not free at all and only like either when you have a lot of + +28:12.240 --> 28:19.440 +free RAM and you don't care or when your Emacs is really slow then you may need to consider this + +28:19.440 --> 28:24.160 +tweaking these defaults so again don't tweak defaults if you don't really have a problem + +28:24.800 --> 28:31.360 +and of course this RAM problem is a big big deal for Emacs devs because from + +28:32.960 --> 28:38.400 +from the point of single user you have like normal laptop most likely like normal PC with a lot of + +28:38.400 --> 28:45.760 +RAM you don't care about these things too much but Emacs in general can run on like + +28:46.320 --> 28:53.200 +all kinds of machines including low-end machines with very limited RAM and anytime Emacs developers + +28:53.280 --> 29:00.320 +consider increasing the defaults for garbage collection it's like they always have to consider + +29:00.320 --> 29:06.800 +if you increase them too much then Emacs may just stop running on certain platforms + +29:09.840 --> 29:15.600 +so that's a very big consideration in terms of the global defaults for everyone + +29:16.320 --> 29:24.560 +although I have to I would say that it might be relatively safe to increase gc-cons-threshold + +29:24.560 --> 29:29.600 +because it mostly affects startup and during startup it's probably not the peak usage of + +29:30.560 --> 29:38.160 +Emacs and like as Emacs runs for longer it's probably where most of RAM will be used later + +29:38.720 --> 29:43.920 +on the other hand gc-cons-percentage is much more debatable because it has pros and cons + +29:43.920 --> 29:48.880 +it will increase the RAM usage it will increase the individual GC time so + +29:50.240 --> 29:56.560 +if we consider changing it 
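it's much more tricky and we have to discuss and probably measure the impact + +29:56.560 --> 30:06.080 +on users and a final note on or from the point of view of Emacs development is that + +30:06.480 --> 30:11.440 +this simple mark-and-sweep algorithm is like a very old and not the state-of-the-art algorithm + +30:13.040 --> 30:16.960 +there are variants of garbage collection that are like totally non-blocking + +30:18.000 --> 30:22.720 +so Emacs just doesn't have to freeze during the garbage collection or there are variants + +30:22.720 --> 30:27.440 +of garbage collection algorithm that do not scan all the memory just fraction of it + +30:28.640 --> 30:35.520 +and scan another fraction less frequently so there are actually ways just to + +30:36.480 --> 30:39.680 +change the garbage collection algorithm to make things much faster + +30:40.400 --> 30:47.280 +of course like just changing the numbers of variables like the numbers of variable values + +30:47.280 --> 30:52.000 +is much more tricky and one has to implement it obviously it would be nice if someone implements + +30:52.000 --> 30:58.720 +it but so far it's not happening so yeah it would be nice but maybe not not so quickly + +30:59.600 --> 31:02.080 +there is more chance to change the defaults here + +31:02.240 --> 31:05.680 +to conclude let me reiterate the most important points + +31:06.640 --> 31:12.400 +so from point of view of users you need to understand that yes garbage collection may be + +31:12.400 --> 31:20.480 +a problem but not for everyone so like you should only think about changing the variables when you + +31:20.480 --> 31:28.240 +really know that garbage collection is the problem for you so if you have slow Emacs startup + +31:28.400 --> 31:34.000 +slow Emacs startup and you know that it's caused by garbage collection like by you can check the + +31:34.000 --> 31:41.520 +gc-elapsed variable then you may increase gc-cons-threshold like to few tens of megabytes + +31:41.520 --> 31:48.160 +not more it doesn't make sense to increase it much more and if you really have major problems + +31:48.160 --> 31:56.080 +with Emacs being laggy then you can increase gc-cons-percentage to like 0.2 0.3 maybe + +31:56.080 --> 32:02.640 +one is probably overkill but do watch your Emacs RAM usage it may be really impacted + +32:04.160 --> 32:12.400 +for Emacs developers I'd like to emphasize that there is a real problem with garbage collection + +32:12.400 --> 32:22.720 +and nine percent of all the garbage collection data points we have correspond to really slow + +32:22.720 --> 32:27.920 +noticeable Emacs freezes and really frequent less than 10 seconds + +32:30.000 --> 32:35.120 +I'd say that it's really worth increasing gc-cons-threshold at least during startup + +32:36.400 --> 32:41.440 +because it really impacts the Emacs startup time making Emacs startup much faster + +32:42.400 --> 32:48.560 +ideally we need to reimplement the garbage collection algorithm of course it's not easy + +32:48.560 --> 32:56.880 +but it would be really nice and for gc-cons-percentage defaults it's hard to say we may + +32:56.880 --> 33:03.040 +consider changing it but it's up to discussion and we probably need to be conservative here + +33:04.320 --> 33:11.280 +so we came to the end of my talk and this presentation all the data will be available + +33:11.280 --> 33:21.760 +publicly and you can reproduce all the statistic graphs if you wish and thank you for attention + |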
more it doesn't make sense to increase it much more and if you really have major problems + +31:48.160 --> 31:56.080 +with Emacs being slaggy then you can increase GC count percentage to like 0.2 0.3 maybe + +31:56.080 --> 32:02.640 +one is probably overkill but do watch your Emacs ROM usage it may be really impacted + +32:04.160 --> 32:12.400 +for Emacs developers I'd like to emphasize that there is a real problem with garbage collection + +32:12.400 --> 32:22.720 +and nine percent of all the garbage collection data points we have correspond to really slow + +32:22.720 --> 32:27.920 +noticeable Emacs precision and really frequent less than 10 seconds + +32:30.000 --> 32:35.120 +I'd say that it's really worth increasing GC count threshold at least during startup + +32:36.400 --> 32:41.440 +because it really impacts the Emacs startup time making Emacs startup much faster + +32:42.400 --> 32:48.560 +ideally we need to reimplement the garbage collection algorithm of course it's not easy + +32:48.560 --> 32:56.880 +but it would be really nice and for GC count percentage defaults it's hard to say we may + +32:56.880 --> 33:03.040 +consider changing it but it's up to discussion and we probably need to be conservative here + +33:04.320 --> 33:11.280 +so we came to the end of my talk and this presentation all the data will be available + +33:11.280 --> 33:21.760 +publicly and you can reproduce all the statistic graphs if you wish and thank you for attention + |