WEBVTT captioned by sachac
NOTE Introduction
00:00:00.000 --> 00:00:06.839
Hello everyone and thanks for tuning in. I'm Timothy,
00:00:06.840 --> 00:00:08.559
and in this talk, we'll be going over
00:00:08.560 --> 00:00:11.342
the 2022 Emacs User Survey.
00:00:11.970 --> 00:00:15.078
Since this is the first time we're discussing this,
00:00:15.079 --> 00:00:18.399
we'll be going over the survey itself a bit,
00:00:18.400 --> 00:00:21.199
how it's being put together and run,
00:00:21.200 --> 00:00:24.199
and then we'll have a little taste of the results
00:00:24.200 --> 00:00:26.039
with more analysis to be published in the future.
NOTE The 2020 Emacs User Survey
00:00:26.040 --> 00:00:32.399
To start with though, a bit of background.
00:00:32.400 --> 00:00:36.679
So in 2020, we had an Emacs User Survey
00:00:36.680 --> 00:00:38.839
run by Adrien Brochard.
00:00:38.840 --> 00:00:41.359
Now this is, to the best of my knowledge,
00:00:41.360 --> 00:00:45.559
the first time that a large-scale Emacs User Survey
00:00:45.560 --> 00:00:48.039
has actually been run.
00:00:48.040 --> 00:00:50.439
About 7,000 people responded to the survey,
00:00:50.440 --> 00:00:53.239
so in many respects, it was quite successful.
00:00:53.240 --> 00:00:56.519
And what's significant about this is that
00:00:56.520 --> 00:00:57.679
with this being the first time
00:00:57.680 --> 00:00:59.999
that a large-scale survey has been run,
00:01:00.000 --> 00:01:01.719
it actually provided some insight
00:01:01.720 --> 00:01:06.719
into questions about how the community is using Emacs
00:01:06.720 --> 00:01:09.959
that allow for much better guesses
00:01:09.960 --> 00:01:15.359
than just speculation based on the small number of people
00:01:15.360 --> 00:01:16.919
who respond on the mailing list usually.
00:01:16.920 --> 00:01:24.879
So, why are we doing another survey? Well, to start with,
00:01:24.880 --> 00:01:28.799
in order to get the most value out of an Emacs User Survey,
00:01:28.800 --> 00:01:32.519
it's quite helpful if the information in it is recent.
00:01:32.520 --> 00:01:35.439
Furthermore, we can actually get some more value
00:01:35.440 --> 00:01:38.039
if we can examine trends,
00:01:38.040 --> 00:01:41.199
shifts in the way that people are using Emacs,
00:01:41.200 --> 00:01:42.919
where the pain points lie,
00:01:42.920 --> 00:01:45.479
what people are enjoying the most, etc.
00:01:45.480 --> 00:01:46.520
So in both of these respects,
00:01:46.521 --> 00:01:49.599
it's to our benefit if the survey
00:01:49.600 --> 00:01:51.519
is actually a regular event,
00:01:51.520 --> 00:01:54.359
instead of just something that's run once.
NOTE The design of the survey
00:01:54.360 --> 00:01:57.159
Now, with this in mind,
00:01:57.160 --> 00:02:00.959
we ran the 2022 Emacs User Survey with the plan
00:02:00.960 --> 00:02:05.079
that this will actually become an annual event.
00:02:05.080 --> 00:02:08.999
In the design of the survey, there are a few goals here.
00:02:09.000 --> 00:02:11.520
The main one is being of value to the user community.
00:02:11.521 --> 00:02:14.520
Now, user community is a rather nebulous phrase.
00:02:14.521 --> 00:02:17.520
In this case, what's meant in particular
00:02:17.521 --> 00:02:21.020
is value in questions, for example,
00:02:21.021 --> 00:02:23.839
things like pain points with Emacs,
00:02:23.840 --> 00:02:27.119
which versions people are using,
00:02:27.120 --> 00:02:30.239
which capabilities people are making the most use of,
00:02:30.240 --> 00:02:34.519
which could potentially be helpful not only to emacs-devel
00:02:34.520 --> 00:02:36.520
but also our collection of Emacs package maintainers
00:02:36.521 --> 00:02:38.020
and the whole community.
00:02:38.021 --> 00:02:40.799
Actually, I think going beyond just the packages,
00:02:40.800 --> 00:02:46.039
we've also got the people who develop tutorials, guides,
00:02:46.040 --> 00:02:49.279
and all of that sort of surrounding activity,
00:02:49.280 --> 00:02:51.020
which can benefit from a clear understanding
00:02:51.021 --> 00:02:56.020
of how Emacs users use Emacs.
00:02:56.021 --> 00:02:58.519
Separately to that,
00:02:58.520 --> 00:03:01.639
I think as an Emacs user myself,
00:03:01.640 --> 00:03:02.839
it's rather interesting to see
00:03:02.840 --> 00:03:04.479
how other people are using Emacs
00:03:04.480 --> 00:03:07.079
and what their experience is. So yes, basically,
00:03:07.080 --> 00:03:08.559
you've got utility and interest
00:03:08.560 --> 00:03:10.719
as the two separate driving factors
00:03:10.720 --> 00:03:14.020
as we try to pick questions, which actually can give us
00:03:14.021 --> 00:03:16.520
all of this without taking up too much
00:03:16.521 --> 00:03:18.559
of the respondents' time.
NOTE Survey frameworks
00:03:18.560 --> 00:03:24.399
Now, last time in 2020, the Emacs survey that Adrien ran
00:03:24.400 --> 00:03:27.079
used, I think, Google Forms, if I recall correctly,
00:03:27.080 --> 00:03:28.799
with an option to send in responses manually.
00:03:28.800 --> 00:03:33.159
This worked, but it's not great,
00:03:33.160 --> 00:03:35.079
particularly given that this is for a survey
00:03:35.080 --> 00:03:37.199
being run in an ardently FOSS community.
00:03:37.200 --> 00:03:38.959
Ideally, we actually want
00:03:38.960 --> 00:03:40.799
to find a survey framework
00:03:40.800 --> 00:03:44.319
that respects the priorities of users, is open source,
00:03:44.320 --> 00:03:46.359
ideally free and open source,
00:03:46.360 --> 00:03:49.999
and is a relatively pleasant experience.
00:03:50.000 --> 00:03:53.079
Unfortunately, looking at available options,
00:03:53.080 --> 00:03:56.879
it seems that one always has to compromise on at least one,
00:03:56.880 --> 00:03:58.020
if not all of those criteria,
00:03:58.021 --> 00:04:01.020
which is quite far from ideal.
NOTE Writing a new survey framework in Julia
00:04:01.021 --> 00:04:04.359
So what's the obvious solution?
00:04:04.360 --> 00:04:06.639
Okay, we should just write a new survey framework.
00:04:06.640 --> 00:04:10.679
Obviously, this is easier said than done.
00:04:10.680 --> 00:04:12.239
But around a year ago,
00:04:12.240 --> 00:04:13.639
I actually started doing exactly this.
00:04:13.640 --> 00:04:17.679
I've used the programming language Julia quite a bit
00:04:17.680 --> 00:04:21.020
on a day-to-day basis. And there just so happens to be
00:04:21.021 --> 00:04:23.199
a web framework for that called Genie.
00:04:23.200 --> 00:04:24.719
So I thought I'd give it a shot.
00:04:24.720 --> 00:04:26.559
And well, here we are today.
00:04:26.560 --> 00:04:28.479
I ended up putting something together,
00:04:28.480 --> 00:04:34.279
which could take a set of questions written in Julia
00:04:34.280 --> 00:04:35.839
and using a survey library,
00:04:35.840 --> 00:04:38.799
actually parse that into this helpful structure
00:04:38.800 --> 00:04:44.119
and then construct HTML forms based on that,
00:04:44.120 --> 00:04:47.020
and ingest results from the HTML forms,
00:04:47.021 --> 00:04:48.520
and just sort of handle that all together.
00:04:48.521 --> 00:04:52.439
Now, all of this ends up being fed into an SQLite DB.
00:04:52.440 --> 00:04:55.159
So everything's there, even partial responses.
00:04:55.160 --> 00:04:57.599
One of the goals with the actual design of this has been
00:04:57.600 --> 00:05:01.119
to just minimize what's actually done on the client side.
00:05:01.120 --> 00:05:05.559
So that means JavaScript, cookies, the whole lot.
00:05:05.560 --> 00:05:08.759
Basically, as far as this could reasonably be taken,
00:05:08.760 --> 00:05:14.599
we've just got static HTML being shoved to the user,
00:05:14.600 --> 00:05:16.719
or respondent rather. And then we just
00:05:16.720 --> 00:05:18.519
take an HTTP post request back
00:05:18.520 --> 00:05:20.919
and update the results that way.
00:05:20.920 --> 00:05:24.239
Now by doing things like actually paging the survey,
00:05:24.240 --> 00:05:26.559
we can allow for incremental saving of results
00:05:26.560 --> 00:05:30.559
and a few other niceties while essentially preserving
00:05:30.560 --> 00:05:36.319
an experience that doesn't really require any data
00:05:36.320 --> 00:05:37.319
or any particular capabilities, which is sort of
00:05:37.320 --> 00:05:40.199
a nice, clean, minimal experience as far as I'm concerned.
NOTE In practice
00:05:40.200 --> 00:05:45.679
So what does this actually look like in practice?
00:05:45.680 --> 00:05:48.119
Well, one of the nice things about this is
00:05:48.120 --> 00:05:51.479
because the question itself is written in Julia,
00:05:51.480 --> 00:05:54.279
we can get some nice features like custom validators
00:05:54.280 --> 00:05:57.919
and other fancy behavior and directly specify
00:05:57.920 --> 00:06:01.119
how we actually want questions to be registered
00:06:01.120 --> 00:06:04.439
in the database. So here we have, for example,
00:06:04.440 --> 00:06:06.679
two questions we had from this Emacs survey.
00:06:06.680 --> 00:06:09.959
One is a multi-select. Another one is just putting in
00:06:09.960 --> 00:06:14.399
the number of years people have used Emacs for.
00:06:14.400 --> 00:06:16.159
I think this gives a brief overview of the capabilities.
00:06:16.160 --> 00:06:19.599
One of the things I'd like to draw particular attention
00:06:19.600 --> 00:06:20.759
to here is in the multi-select,
00:06:20.760 --> 00:06:22.199
you'll see an array of options,
00:06:22.200 --> 00:06:24.319
the first one of which actually maps to a different value
00:06:24.320 --> 00:06:25.879
to be stored for convenience.
00:06:25.880 --> 00:06:29.119
And then the final one is a special one, :other,
00:06:29.120 --> 00:06:30.359
and you can see that's a bit different to the rest
00:06:30.360 --> 00:06:32.599
where it's got that colon in front,
00:06:32.600 --> 00:06:33.719
it's a symbol, not a string.
00:06:33.720 --> 00:06:37.639
And this is quite a nice one because the way
00:06:37.640 --> 00:06:39.279
that this framework's been designed,
00:06:39.280 --> 00:06:41.759
when we have an :other value like that,
00:06:41.760 --> 00:06:44.199
instead of it just being a sort of tick box "Other",
00:06:44.200 --> 00:06:47.199
it actually provides the option to write
00:06:47.200 --> 00:06:50.559
your own different response to all of the above.
NOTE Results
00:06:50.560 --> 00:06:55.319
Okay, so at the very end, we've now got
00:06:55.320 --> 00:06:58.519
a completely FOSS survey framework, rather nice.
00:06:58.520 --> 00:07:00.020
So the set of what were these...
00:07:00.021 --> 00:07:01.119
Decent array of input types.
00:07:01.120 --> 00:07:02.639
It would be nice to expand, but at the moment
00:07:02.640 --> 00:07:04.599
I think we could just about describe it as a rich set.
00:07:04.600 --> 00:07:07.159
Zero JavaScript required, but a little bit useful
00:07:07.160 --> 00:07:08.079
for progressive enhancement.
00:07:08.080 --> 00:07:12.759
As demonstrated, we can get some fancy validation going on.
00:07:12.760 --> 00:07:16.679
And then because we've got the results
00:07:16.680 --> 00:07:18.559
tied into this quite nicely,
00:07:18.560 --> 00:07:20.999
we can actually have them available live
00:07:21.000 --> 00:07:22.999
and in quite a number of formats.
00:07:23.000 --> 00:07:25.439
I'm not sure how much you saw in the architecture diagram,
00:07:25.440 --> 00:07:27.079
but we've got all sorts of things here.
00:07:27.080 --> 00:07:29.679
CSV, TSV, plain text, JSON,
00:07:29.680 --> 00:07:32.119
just grab a copy of the SQLite database,
00:07:32.120 --> 00:07:33.319
but only the relevant bits.
00:07:33.320 --> 00:07:35.879
Or something called JLD2,
00:07:35.880 --> 00:07:37.999
which preserves a lot of type information
00:07:38.000 --> 00:07:39.599
and a few other nice things.
NOTE Going forward
00:07:39.600 --> 00:07:43.799
Now, what are we going to do going forward from here?
00:07:43.800 --> 00:07:46.159
Well, there are a few minor issues here.
00:07:46.160 --> 00:07:48.599
For example, there's a memory leak issue which is going on,
00:07:48.600 --> 00:07:51.839
resulting in the service being restarted,
00:07:51.840 --> 00:07:54.519
I think every day or two, while the survey was running.
00:07:54.520 --> 00:07:56.159
I actually have the suspicion
00:07:56.160 --> 00:07:57.639
that that's largely responsible for
00:07:57.640 --> 00:08:01.479
about 1% of respondents, which is about 75 people,
00:08:01.480 --> 00:08:04.399
who described the survey experience as not great.
00:08:04.400 --> 00:08:08.199
Overall though, the feedback has been quite positive.
00:08:08.200 --> 00:08:09.919
There's been some detailed written feedback,
00:08:09.920 --> 00:08:12.799
but just from the quick great/okay/not great options,
00:08:12.800 --> 00:08:14.839
we had about two-thirds of people saying
00:08:14.840 --> 00:08:16.839
that the user experience was great,
00:08:16.840 --> 00:08:19.199
which is really nice to hear for the first time this has been run.
00:08:19.200 --> 00:08:22.839
A few other things would be nice to add, for example,
00:08:22.840 --> 00:08:25.759
in future, control flow. By this, I mean
00:08:25.760 --> 00:08:27.879
the option to present different questions
00:08:27.880 --> 00:08:28.999
based on previous answers
00:08:29.000 --> 00:08:31.199
would be quite nice to streamline the experience.
00:08:31.200 --> 00:08:33.519
For example, having a set of questions
00:08:33.520 --> 00:08:37.239
for first-time respondents or people who are involved
00:08:37.240 --> 00:08:42.239
in the packaging side of things
00:08:42.240 --> 00:08:45.079
without actually cluttering the experience
00:08:45.080 --> 00:08:46.039
for everybody else. That'd be quite nice.
00:08:46.040 --> 00:08:48.599
Further to this, all of this,
00:08:48.600 --> 00:08:51.879
I think on top of the standard web interface,
00:08:51.880 --> 00:08:53.599
it'd be quite nice to actually write a server API.
00:08:53.600 --> 00:08:55.520
And the particular reason why I mentioned this
00:08:55.521 --> 00:08:58.020
is because this could potentially allow for
00:08:58.021 --> 00:09:00.359
basically an Emacs survey package.
00:09:00.360 --> 00:09:03.039
I mean, we already use Emacs for so many things,
00:09:03.040 --> 00:09:05.519
might as well fill the survey out from within it as well.
00:09:05.520 --> 00:09:11.159
Okay, so this is how the survey has been conducted.
NOTE Responses
00:09:11.160 --> 00:09:13.679
Now, what do the responses look like?
00:09:13.680 --> 00:09:16.039
Now, at this stage, I was actually hoping
00:09:16.040 --> 00:09:18.919
to get into some somewhat sophisticated analysis
00:09:18.920 --> 00:09:22.599
because there's quite a bit that you can dig out
00:09:22.600 --> 00:09:24.239
of the data responses that we've received.
00:09:24.240 --> 00:09:27.879
However, unfortunately, I've been much more limited on time
00:09:27.880 --> 00:09:30.039
than I'd hoped for, so that's going to have to come later.
00:09:30.040 --> 00:09:33.559
For now, we're just going to take a bit of a peek
00:09:33.560 --> 00:09:35.959
at some of the really basic answers.
00:09:35.960 --> 00:09:38.239
Well, it's not even really analysis.
00:09:38.240 --> 00:09:40.239
Expect to see lots of pie charts, basically.
00:09:40.240 --> 00:09:42.999
But there's still a bit of interest there,
00:09:43.000 --> 00:09:44.359
so we'll go through a bit of that
00:09:44.360 --> 00:09:47.119
and just give a bit of a tease
00:09:47.120 --> 00:09:50.319
as to what might come in the future.
00:09:50.320 --> 00:09:51.919
So to sum up for starters,
00:09:51.920 --> 00:09:55.079
we've had about 6,500 responses.
00:09:55.080 --> 00:09:58.359
It is worth noting that a thousand of those are partials,
00:09:58.360 --> 00:10:02.199
so people who gave up on the survey partway through.
00:10:02.200 --> 00:10:05.399
Given that the 2020 survey had about 7,000 responses,
00:10:05.400 --> 00:10:06.999
I'd say we're basically on par here.
00:10:07.000 --> 00:10:10.399
This ran over a month and interestingly,
00:10:10.400 --> 00:10:12.239
about half of these respondents
00:10:12.240 --> 00:10:13.799
did not participate in the 2020 survey.
00:10:13.800 --> 00:10:16.199
I think at this point,
00:10:16.200 --> 00:10:17.679
it's not really clear what to make of that.
00:10:17.680 --> 00:10:21.359
There's been a two-year gap between the surveys.
00:10:21.360 --> 00:10:25.159
It's been done, well, it's been done quite differently,
00:10:25.160 --> 00:10:29.639
and yes, there's not enough, really, to say.
00:10:29.640 --> 00:10:31.999
What could be interesting though is actually,
00:10:32.000 --> 00:10:33.839
once this starts running regularly,
00:10:33.840 --> 00:10:36.799
we can see whether there's regular churn
00:10:36.800 --> 00:10:38.520
in the survey respondents,
00:10:38.521 --> 00:10:40.020
or if we have a consistent core
00:10:40.021 --> 00:10:42.020
with people who respond each year,
00:10:42.021 --> 00:10:46.159
and then just people who come by every now and then and go,
00:10:46.160 --> 00:10:47.759
"Oh, why not respond to this year's survey?"
00:10:47.760 --> 00:10:51.479
But we're going to have to wait a bit to actually see
00:10:51.480 --> 00:10:52.759
how people treat the survey.
00:10:52.760 --> 00:10:57.519
Now these responses came from quite a wide range of places:
00:10:57.520 --> 00:11:02.519
we've got 115 nations represented here. Collectively,
00:11:02.520 --> 00:11:04.039
these respondents have spent about a thousand hours
00:11:04.040 --> 00:11:06.959
giving us information. So I think, if nothing else,
00:11:06.960 --> 00:11:10.479
just from the effort that people have put into
00:11:10.480 --> 00:11:12.879
actually giving us useful data to work with,
00:11:12.880 --> 00:11:13.599
it's worth giving at least a good effort
00:11:13.600 --> 00:11:15.999
to actually trying to extract some value
00:11:16.000 --> 00:11:16.999
out of these responses.
NOTE Geography
00:11:17.000 --> 00:11:20.879
Now, overall we found a lot of responses came from America,
00:11:20.880 --> 00:11:23.199
no surprises there, but as mentioned,
00:11:23.200 --> 00:11:24.020
we've got a good mix around the globe.
00:11:24.021 --> 00:11:29.159
The usual suspects for the rest of the responses,
00:11:29.160 --> 00:11:33.279
a whole bunch in Europe, a whole bunch around Asia,
00:11:33.280 --> 00:11:36.799
a bit in Australasia as well and yes,
00:11:36.800 --> 00:11:38.959
there's nothing particularly surprising here,
00:11:38.960 --> 00:11:41.399
it's very much in line with expectations.
00:11:41.400 --> 00:11:42.839
What I find a bit more interesting, though,
00:11:42.840 --> 00:11:45.359
is if we actually normalise
00:11:45.360 --> 00:11:48.079
the number of responses from each nation
00:11:48.080 --> 00:11:50.079
by the population of said nations,
00:11:50.080 --> 00:11:54.239
essentially giving a popularity of Emacs
00:11:54.240 --> 00:11:57.359
or at least of Emacs survey respondents for each nation,
00:11:57.360 --> 00:12:00.919
we end up finding that Europe, particularly Scandinavia,
00:12:00.920 --> 00:12:02.199
becomes a bit of a hotspot.
00:12:02.200 --> 00:12:04.519
So I'm not sure what's going on
00:12:04.520 --> 00:12:07.319
in Sweden, Finland and Norway,
00:12:07.320 --> 00:12:10.919
but it seems to be particularly popular around there.
00:12:10.920 --> 00:12:14.199
It's also worth noting that we now find
00:12:14.200 --> 00:12:18.319
that the proportion of respondents
00:12:18.320 --> 00:12:21.799
in countries like America, Canada, Australia
00:12:21.800 --> 00:12:24.039
and most of Europe actually becomes
00:12:24.040 --> 00:12:26.399
quite comparable with each other,
00:12:26.400 --> 00:12:30.239
which yes, once again, sort of lines up
00:12:30.240 --> 00:12:32.279
with the expectations from the last slide.
NOTE Gender
00:12:32.280 --> 00:12:36.279
Okay, getting into some of the other
00:12:36.280 --> 00:12:38.599
demographic information.
00:12:38.600 --> 00:12:40.319
The demographic information was new to this survey.
00:12:40.320 --> 00:12:44.479
In the 2020 survey, people were asked what they think
00:12:44.480 --> 00:12:47.199
of being asked about some demographic information
00:12:47.200 --> 00:12:50.199
in a future survey, and the overwhelming response was, "Sure,
00:12:50.200 --> 00:12:52.759
I don't really mind." And so that's what we've done here.
00:12:52.760 --> 00:12:56.279
One of the ones of some interest
00:12:56.280 --> 00:12:59.759
is the age-gender breakdown. So we expect Emacs
00:12:59.760 --> 00:13:03.119
to be used predominantly among people in software
00:13:03.120 --> 00:13:05.839
and programming and within the industry,
00:13:05.840 --> 00:13:08.599
I think it's quite widely documented
00:13:08.600 --> 00:13:14.520
to have about a sort of 75-25%, roughly, split
00:13:14.521 --> 00:13:14.759
between male and female.
00:13:14.760 --> 00:13:19.359
Interestingly, in Emacs,
00:13:19.360 --> 00:13:22.879
it's a much more aggressively-biased result.
00:13:22.880 --> 00:13:28.679
So about 96% of respondents are male
00:13:28.680 --> 00:13:34.559
with just 4% for the rest. Interestingly, though,
00:13:34.560 --> 00:13:35.359
if we look at the young respondents,
00:13:35.360 --> 00:13:41.719
say for example, under 25, we go from 96% male to 88%.
00:13:41.720 --> 00:13:46.119
So it's fair to say that the young respondents are
00:13:46.120 --> 00:13:49.199
in this respect, a somewhat more diverse group.
00:13:49.200 --> 00:13:52.399
Hopefully, as future surveys go on,
00:13:52.400 --> 00:13:54.399
we'll see this continue, not die off
00:13:54.400 --> 00:13:58.719
to the sort of, well, at this point,
00:13:58.720 --> 00:14:02.919
it's more like 99% if you look at the older ages.
00:14:02.920 --> 00:14:04.439
But we'll see.
NOTE Occupations
00:14:04.440 --> 00:14:07.919
Occupations was an interesting slide as well.
00:14:07.920 --> 00:14:09.399
Interesting question as well.
00:14:09.400 --> 00:14:11.559
We've got the usual suspects here. I mean,
00:14:11.560 --> 00:14:15.079
it's a text editor, well, Lisp machine
00:14:15.080 --> 00:14:17.639
masquerading as a text editor, mainly used for programming,
00:14:17.640 --> 00:14:20.639
and so we expect lots of software development
00:14:20.640 --> 00:14:23.519
and that sort of thing. But that's only about
00:14:23.520 --> 00:14:25.399
just over half of the responses.
00:14:25.400 --> 00:14:28.679
We've got a huge chunk from academia,
00:14:28.680 --> 00:14:29.999
and then really just an odd bag
00:14:30.000 --> 00:14:30.879
of all sorts of other things,
00:14:30.880 --> 00:14:33.079
including things which you wouldn't really associate
00:14:33.080 --> 00:14:35.359
with programming and software at all.
00:14:35.360 --> 00:14:39.639
Things like creative writing, publishing, legal, yes.
00:14:39.640 --> 00:14:41.719
And then you've got this chunk of Other,
00:14:41.720 --> 00:14:43.239
which I think is
00:14:43.240 --> 00:14:46.679
the fourth most popular option here.
00:14:46.680 --> 00:14:49.399
And what we have here is about 500 different responses
00:14:49.400 --> 00:14:51.839
from a huge range of activities.
00:14:51.840 --> 00:14:54.359
It's really quite interesting to read things like
00:14:54.360 --> 00:14:56.919
I think, things like "naval officer",
00:14:56.920 --> 00:15:01.319
and just... All sorts of surprising occupations for Emacs.
00:15:01.320 --> 00:15:04.799
And I think this is a particularly interesting area,
00:15:04.800 --> 00:15:10.199
because I imagine that, compared to other code editors,
00:15:10.200 --> 00:15:13.879
sort of your VS Code and the like,
00:15:13.880 --> 00:15:18.959
Emacs may have a particularly diverse set
00:15:18.960 --> 00:15:23.599
of industry occupations represented in its users.
00:15:23.600 --> 00:15:28.359
Now, if you look at where the responses actually came from,
00:15:28.360 --> 00:15:31.039
we've got the usual suspects up top,
00:15:31.040 --> 00:15:33.959
Hacker News and r/emacs.
00:15:33.960 --> 00:15:40.119
But then we actually get a much more graduated breakdown
00:15:40.120 --> 00:15:43.679
than in the 2020 survey.
00:15:43.680 --> 00:15:46.279
We do see familiar results here like IRC, Telegram,
00:15:46.280 --> 00:15:48.639
Emacs China, and Twitter.
00:15:48.640 --> 00:15:50.839
But now you've got a few new entries,
00:15:50.840 --> 00:15:53.519
things like the Fediverse, Discourse, Matrix,
00:15:53.520 --> 00:15:56.119
which didn't pop up previously.
00:15:56.120 --> 00:15:59.079
So I think this is, yes, quite a nice sign in terms of
00:15:59.080 --> 00:16:02.520
actually hitting a wide range
00:16:02.521 --> 00:16:05.999
of pockets of Emacs users across different platforms,
00:16:06.000 --> 00:16:10.319
which bodes well for the potential representativeness
00:16:10.320 --> 00:16:11.319
of this survey.
NOTE Free and open source software
00:16:11.320 --> 00:16:15.119
Unsurprisingly, if we're talking about Emacs
00:16:15.120 --> 00:16:17.919
and particularly people who are quite engaged in it,
00:16:17.920 --> 00:16:19.679
which are the respondents to this survey,
00:16:19.680 --> 00:16:25.359
we find that we also get quite a high degree of care
00:16:25.360 --> 00:16:27.479
for free and open source software.
00:16:27.480 --> 00:16:30.519
So if you have a look here,
00:16:30.520 --> 00:16:35.279
only about a quarter of users
00:16:35.280 --> 00:16:39.799
didn't express a strong preference towards FOSS software.
00:16:39.800 --> 00:16:43.759
In fact, we had over a quarter saying that
00:16:43.760 --> 00:16:49.239
they would accept significant or even any compromise
00:16:49.240 --> 00:16:52.199
to use FOSS software
00:16:52.200 --> 00:16:55.759
over a proprietary alternative,
00:16:55.760 --> 00:16:59.679
which given the nature of Emacs,
00:16:59.680 --> 00:17:00.639
is not terribly surprising,
00:17:00.640 --> 00:17:02.439
but a strong showing nonetheless.
NOTE Emacs versions
00:17:02.440 --> 00:17:05.599
Now, let's start getting to things
00:17:05.600 --> 00:17:07.719
which are actually useful for
00:17:07.720 --> 00:17:11.479
potential Emacs development and packaging.
00:17:11.480 --> 00:17:13.599
If you're thinking about supporting Emacs versions,
00:17:13.600 --> 00:17:16.599
it looks like you can do fantastically well
00:17:16.600 --> 00:17:20.639
in terms of hitting most users if you support Emacs 27+.
00:17:20.640 --> 00:17:23.159
That hits about 96% of respondents.
00:17:23.160 --> 00:17:26.199
Interestingly though, you can actually make an argument
00:17:26.200 --> 00:17:27.119
for being even more aggressive.
00:17:27.120 --> 00:17:30.319
I mean, if you have a look at Emacs 28+,
00:17:30.320 --> 00:17:32.359
that's still over three quarters of respondents.
00:17:32.360 --> 00:17:35.799
We've got, at this point, a quarter
00:17:35.800 --> 00:17:37.279
using the unreleased HEAD version,
00:17:37.280 --> 00:17:40.159
even though it's getting close to release.
00:17:40.160 --> 00:17:43.039
Obviously here, as stated, we're hitting
00:17:43.040 --> 00:17:44.599
a sort of more engaged with the community
00:17:44.600 --> 00:17:47.799
subset of Emacs users, but still,
00:17:47.800 --> 00:17:49.879
I think it's interesting to see that
00:17:49.880 --> 00:17:52.639
with Emacs's increasingly frequent update schedule,
00:17:52.640 --> 00:17:54.999
users are actually picking up those updates
00:17:55.000 --> 00:17:56.359
quite promptly as they roll out.
NOTE Languages
00:17:56.360 --> 00:18:02.079
Continuing on with how people actually use Emacs: languages.
00:18:02.080 --> 00:18:05.199
We've got the usual suspects here: lots of Python,
00:18:05.200 --> 00:18:08.959
quite a bit of JavaScript and C, lots of shell.
00:18:08.960 --> 00:18:11.879
What I find quite interesting though is
00:18:11.880 --> 00:18:12.799
if we actually bring in
00:18:12.800 --> 00:18:16.719
the 2020 Stack Overflow language usage survey data,
00:18:16.720 --> 00:18:19.239
that maps quite well
00:18:19.240 --> 00:18:20.079
to the array of language options we provided here.
00:18:20.080 --> 00:18:21.199
They had a general Lisp option,
00:18:21.200 --> 00:18:23.919
which I've folded into Common Lisp
00:18:23.920 --> 00:18:26.919
since they listed Clojure separately.
00:18:26.920 --> 00:18:29.679
I think that seems like a fairly safe bet.
00:18:29.680 --> 00:18:31.919
But other than that, the only languages that we missed
00:18:31.920 --> 00:18:35.839
are Scheme and Elisp.
00:18:35.840 --> 00:18:37.879
What we can do is we can look at
00:18:37.880 --> 00:18:41.199
the relative popularity of different languages
00:18:41.200 --> 00:18:44.519
from our Emacs user survey compared to Stack Overflow's.
00:18:44.520 --> 00:18:48.319
What do we find? Well, Clojure and Common Lisp
00:18:48.320 --> 00:18:51.639
are far above the rest, I imagine in no small part due to
00:18:51.640 --> 00:18:54.959
the fantastic SLIME and CIDER packages.
00:18:54.960 --> 00:18:59.559
Following that, we see Haskell being particularly prominent,
00:18:59.560 --> 00:19:00.639
and then a collection of other languages,
00:19:00.640 --> 00:19:06.199
your Erlang, Elixir, Julia, Perl and the rest.
00:19:06.200 --> 00:19:10.959
And then lastly, if we have a look at the ones
00:19:10.960 --> 00:19:13.439
which have significantly diminished popularity
00:19:13.440 --> 00:19:17.719
compared to Stack Overflow, we end up with, I think,
00:19:17.720 --> 00:19:20.159
what I could probably cast as more enterprise languages.
00:19:20.160 --> 00:19:25.799
Things like C#, Java, TypeScript and the like.
NOTE Prose
00:19:25.800 --> 00:19:31.559
So, that's interesting. Now, earlier
00:19:31.560 --> 00:19:33.239
when we were looking at the split of Emacs users,
00:19:33.240 --> 00:19:37.239
we found that we actually had a fair few
00:19:37.240 --> 00:19:42.199
in more creative areas, like writing and publishing.
00:19:42.200 --> 00:19:44.479
So looking at prose, we'd expect a decent chunk
00:19:44.480 --> 00:19:47.039
to be using Emacs for prose, but it's actually more
00:19:47.040 --> 00:19:48.719
than just a little bit, more than a little slice.
00:19:48.720 --> 00:19:50.599
We've got a whopping about a third of users
00:19:50.600 --> 00:19:54.719
saying they frequently use Emacs for writing prose.
00:19:54.720 --> 00:19:55.999
I'd imagine that the availability
00:19:56.000 --> 00:19:57.799
of things like Org mode and AUCTeX
00:19:57.800 --> 00:20:03.399
probably helps with this.
NOTE Packages
00:20:03.400 --> 00:20:05.119
Moving on to other packages, or more packages,
00:20:05.120 --> 00:20:08.879
we've actually got a very similar split here
00:20:08.880 --> 00:20:13.199
to the 2020 survey. Org has seen a bit of a growth
00:20:13.200 --> 00:20:16.039
in popularity. We've got some new arrivals here as well.
00:20:16.040 --> 00:20:18.479
For example, Vertico has popped onto the scene
00:20:18.480 --> 00:20:21.279
and overtaken Ivy here, along with
00:20:21.280 --> 00:20:24.519
a few other new packages like Consult.
00:20:24.520 --> 00:20:27.599
Other than that, quite comparable.
00:20:27.600 --> 00:20:29.999
What's rather interesting, though, I find here is that
00:20:30.000 --> 00:20:33.719
when you have people who listed a small number of packages,
00:20:33.720 --> 00:20:39.439
they actually predominantly listed packages
00:20:39.440 --> 00:20:41.319
other than the most common set.
00:20:41.320 --> 00:20:43.959
So if we look at the people who only listed one package,
00:20:43.960 --> 00:20:48.959
basically two-thirds of that,
00:20:48.960 --> 00:20:51.479
or actually three-quarters of those responses
00:20:51.480 --> 00:20:53.879
were saying other packages,
00:20:53.880 --> 00:20:56.279
despite the fact that overall packages
00:20:56.280 --> 00:20:58.599
other than the highlighted selection here
00:20:58.600 --> 00:21:01.399
only constitute a quarter of responses.
00:21:01.400 --> 00:21:04.919
So there might be something a bit more to look at there.
NOTE Documentation
00:21:04.920 --> 00:21:07.799
Now when people are using packages,
00:21:07.800 --> 00:21:11.039
we also asked what types of documentation
00:21:11.040 --> 00:21:14.399
people would like to see more of on package READMEs.
00:21:14.400 --> 00:21:17.159
Basically we've got a big mix here.
00:21:17.160 --> 00:21:20.079
It seems like generally people are interested in
00:21:20.080 --> 00:21:23.839
seeing more in various forms, whether it be tutorials,
00:21:23.840 --> 00:21:29.479
overviews, screenshots, comparisons, or clips and videos.
00:21:29.480 --> 00:21:32.919
So full READMEs with a lot of context
00:21:32.920 --> 00:21:38.439
seem to be quite desirable from this.
NOTE Moving forward
00:21:38.440 --> 00:21:42.359
Now moving forward, what are we going to do?
00:21:42.360 --> 00:21:45.039
So 800 people gave some detailed feedback on the survey.
00:21:45.040 --> 00:21:47.759
That's quite nice. I'm going to be taking a good read
00:21:47.760 --> 00:21:50.799
of all of those responses and using that
00:21:50.800 --> 00:21:55.639
to improve the process and also the set of questions.
00:21:55.640 --> 00:22:00.759
Now all of you can also give some feedback on the questions,
00:22:00.760 --> 00:22:02.679
both those that you found most useful in this survey,
00:22:02.680 --> 00:22:04.799
ones that you think might not add much value,
00:22:04.800 --> 00:22:07.039
and/or new questions
00:22:07.040 --> 00:22:08.359
that you think might be a good addition.
00:22:08.360 --> 00:22:11.119
Once I've done a bit more analysis,
00:22:11.120 --> 00:22:13.119
particularly the more sophisticated analysis
00:22:13.120 --> 00:22:17.159
which I'm planning, which will probably come out actually
00:22:17.160 --> 00:22:18.719
maybe in the first quarter of next year,
00:22:18.720 --> 00:22:22.919
we can see which questions there seem to have provided
00:22:22.920 --> 00:22:25.039
the most interesting or surprising results
00:22:25.040 --> 00:22:26.559
and those are probably worth keeping.
00:22:26.560 --> 00:22:31.959
Lastly, once we actually have an API
00:22:31.960 --> 00:22:33.279
and potentially even an Emacs package,
00:22:33.280 --> 00:22:36.159
we could automate a large number of the questions,
00:22:36.160 --> 00:22:38.999
things like Emacs version, set of packages used,
00:22:39.000 --> 00:22:41.039
and that could just streamline the experience
00:22:41.040 --> 00:22:42.279
of actually filling out the survey,
00:22:42.280 --> 00:22:44.199
make it a bit more frictionless.
NOTE Time
00:22:44.200 --> 00:22:47.319
Now talking of the question of questions,
00:22:47.320 --> 00:22:49.319
a quick survey is a good survey.
00:22:49.320 --> 00:22:52.959
If we're asking people to dedicate their time
00:22:52.960 --> 00:22:56.279
to fill this out, it's good to try to get as much value
00:22:56.280 --> 00:22:59.759
without asking them to donate much of their time.
00:22:59.760 --> 00:23:02.399
How has the survey done in this respect?
00:23:02.400 --> 00:23:04.119
I'm actually very happy with how it's done.
00:23:04.120 --> 00:23:06.639
We got a few comments from the feedback saying
00:23:06.640 --> 00:23:07.759
that it was a bit on the long side,
00:23:07.760 --> 00:23:10.759
but the median time was about 12 minutes,
00:23:10.760 --> 00:23:13.759
which doesn't seem too bad, and most commonly
00:23:13.760 --> 00:23:16.399
we saw people completing it in about 8 minutes.
00:23:16.400 --> 00:23:18.879
For a once-per-year survey,
00:23:18.880 --> 00:23:20.519
I think this seems fairly reasonable.
00:23:20.520 --> 00:23:24.279
Getting closer to a 5-10 minute range would be nice,
00:23:24.280 --> 00:23:26.199
but this isn't far off.
NOTE How long the survey is open for
00:23:26.200 --> 00:23:30.879
Lastly, we're also going to be considering
00:23:30.880 --> 00:23:32.719
how long the survey is open for.
00:23:32.720 --> 00:23:36.719
So from the initial opening date,
00:23:36.720 --> 00:23:38.479
what we have here is a plot of
00:23:38.480 --> 00:23:41.919
the page which people ended up on
00:23:41.920 --> 00:23:43.399
and when they started the survey.
00:23:43.400 --> 00:23:46.759
So what we can see is a huge spike in the first few days.
00:23:46.760 --> 00:23:50.239
I've just realised that this plot
00:23:50.240 --> 00:23:53.399
is actually labelled incorrectly.
00:23:53.400 --> 00:23:55.679
Please disregard the "minutes to complete the survey" label.
00:23:55.680 --> 00:23:58.839
This should be days after survey opening
00:23:58.840 --> 00:24:01.519
that a response is actually submitted.
00:24:01.520 --> 00:24:05.399
And what we have here is a big spike
00:24:05.400 --> 00:24:08.679
in popularity in the first week basically,
00:24:08.680 --> 00:24:10.599
and then it trickles down
00:24:10.600 --> 00:24:11.959
to a fairly consistent level after that.
00:24:11.960 --> 00:24:15.839
I'm about to publish a last call for survey responses,
00:24:15.840 --> 00:24:18.279
so I'll see if any final bump happens,
00:24:18.280 --> 00:24:20.039
but this indicates that we can probably just
00:24:20.040 --> 00:24:23.079
have the survey open for a week or two
00:24:23.080 --> 00:24:25.199
and that should be sufficient.
NOTE Plan going forward
00:24:25.200 --> 00:24:30.839
Alright, so what's the general plan going forwards?
00:24:30.840 --> 00:24:35.639
Well, as stated earlier, the idea is to run this annually
00:24:35.640 --> 00:24:38.399
and then consistently improve the questions,
00:24:38.400 --> 00:24:41.039
the experience, and the analysis that's done.
00:24:41.040 --> 00:24:43.559
This year has been the hardest by far
00:24:43.560 --> 00:24:45.839
because a lot had to be set up from scratch.
00:24:45.840 --> 00:24:50.159
The hope is that moving on from here,
00:24:50.160 --> 00:24:51.799
a lot of it can be reused.
00:24:51.800 --> 00:24:54.039
For example, with my comments about
00:24:54.040 --> 00:24:56.439
more sophisticated analysis being down the line,
00:24:56.440 --> 00:24:58.439
once that's all worked out,
00:24:58.440 --> 00:25:00.719
as long as nothing changes too drastically,
00:25:00.720 --> 00:25:03.559
we should be able to reuse a lot of that work
00:25:03.560 --> 00:25:05.759
quite easily in future years.
00:25:05.760 --> 00:25:08.599
Alright, that's it for now.
00:25:08.600 --> 00:25:11.879
Hopefully, you've found this an interesting peek
00:25:11.880 --> 00:25:13.359
into how the survey is operated
00:25:13.360 --> 00:25:15.319
and some of the initial results,
00:25:15.320 --> 00:25:18.919
and hopefully, I'll see you around next year
00:25:18.920 --> 00:25:36.960
for the 2023 survey. Thanks for listening.