Diffstat (limited to '')
-rw-r--r-- 2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--answers.vtt 497
1 files changed, 497 insertions, 0 deletions
diff --git a/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--answers.vtt b/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--answers.vtt
new file mode 100644
index 00000000..a6a4ba40
--- /dev/null
+++ b/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--answers.vtt
@@ -0,0 +1,497 @@
+WEBVTT
+
+00:00.000 --> 00:09.260
+Excellent. Thank you for the great talk. As someone whose first language wasn't English
+
+00:09.260 --> 00:14.960
+and speaks other languages, I think localization and internationalization is a very important
+
+00:14.960 --> 00:20.920
+topic that's near and dear to my heart, and especially when it comes to Emacs. I think
+
+00:20.920 --> 00:26.700
+there's a lot that we could do better. So, yeah, thanks so much. Folks, if you have questions,
+
+00:26.700 --> 00:32.880
+you can post them on IRC or on the pad, and Jean-Christophe will answer them, and we will also open up
+
+00:32.880 --> 00:37.600
+this BigBlueButton for people who would like to join here and ask their questions
+
+00:37.600 --> 00:45.760
+directly. Jean-Christophe, please take it away. Okay, thank you. I'm not seeing much activity
+
+00:45.760 --> 00:55.920
+on IRC or the pad, so let me add a few things. First, that patch was really interesting in
+
+00:55.920 --> 01:03.680
+terms of actually getting into the code and understanding how a beginner can really join
+
+01:03.680 --> 01:11.080
+development, even if it's just a few lines. I mentioned in the first part of the presentation
+
+01:11.080 --> 01:17.600
+that there was this small integration bug with Mac, and that's the thing that actually
+
+01:17.600 --> 01:22.400
+got me started, and that was interesting because at the time I was trying to use Aquamacs because
+
+01:22.400 --> 01:28.280
+it looked simpler, and I thought, okay, if I need to fix that, rather than fixing it
+
+01:28.280 --> 01:34.400
+in Aquamacs, maybe I should just go to Emacs and fix it there. So, that was the first attempt
+
+01:34.400 --> 01:40.440
+for me to actually contribute something serious, and it was really nice to – I mean, this
+
+01:40.440 --> 01:47.160
+Emacs development list is really amazing. 99% of the discussion is just way above your
+
+01:47.160 --> 01:54.120
+head, but sometimes you grasp something, and the more you grasp it, the more you understand
+
+01:54.120 --> 02:00.600
+and the more you feel like you can actually do something, especially since – I mean,
+
+02:00.600 --> 02:06.640
+as for all the free software development projects, most of them, I guess, it's really just do
+
+02:06.640 --> 02:13.920
+it kind of thing. And if you try to do something, somebody's going to help you, and what I
+
+02:13.920 --> 02:21.200
+really enjoy when being there is that the people are always very nice. Sometimes you
+
+02:21.200 --> 02:28.080
+feel some tension when there are discussions about a specific topic, but it's – everybody
+
+02:28.080 --> 02:37.520
+is really polite, I mean, 99% of the time. And what I like the most is all the people
+
+02:37.520 --> 02:42.680
+are very strongly opinionated, so they have a very good idea of what Emacs should be or
+
+02:42.680 --> 02:47.640
+should not be, and so it gives you a very good idea of in what direction you should
+
+02:47.640 --> 02:57.400
+go. So that experience – I mean, pretty much those 2017, 2018 years were until now
+
+02:57.400 --> 03:02.040
+the peak of my Emacs activity. I've had to cut back on that because I was busy with
+
+03:02.040 --> 03:07.160
+other things, but I'm really planning to go back to working on maybe not localization
+
+03:07.160 --> 03:13.480
+because it's really – it's too big for me right now. And what I was told is that
+
+03:13.480 --> 03:20.520
+it involved a bit of C programming and things like this, so I'm not really into that right
+
+03:20.520 --> 03:30.840
+now. But I think eventually one day – I just turned 53, so I guess in a few years
+
+03:30.840 --> 03:36.800
+from now when I have more time, I guess I'll just dive in and just work on those localization
+
+03:36.800 --> 03:43.800
+issues and really to bring Emacs to a different world because I think it's – if we were
+
+03:43.800 --> 03:49.920
+able to have – it's a big job. I mean, it's really – if you check the threads
+
+03:49.920 --> 03:55.400
+on emacs-devel, check my name, you will see that I mostly post on translation or localization
+
+03:55.400 --> 04:01.360
+issues at least at the time. And I did an estimate of the sheer volume of strings to
+
+04:01.360 --> 04:10.360
+translate. For example, the manuals were about 2 million words. That's big. That's big.
+
+04:10.360 --> 04:14.040
+But it's okay. I mean, it's not something that's impossible. And if you check the strings
+
+04:14.040 --> 04:20.160
+– that was a really rough estimate. If you check the strings for Emacs proper, not even
+
+04:20.160 --> 04:29.120
+talking about the packages and things, I think that would add probably like 500,000 words.
+
+04:29.120 --> 04:34.360
+I mean, I have no idea, but my very rough estimate would be that. So it's not something
+
+04:34.360 --> 04:41.120
+that's impossible to do. And we'd have to ensure that we have a good process for people
+
+04:41.120 --> 04:46.200
+who review the strings and contribute new strings and things like this and also best
+
+04:46.200 --> 04:53.560
+practices like what I tried to show in this video. And I was really not trying to be dismissive
+
+04:53.560 --> 04:58.680
+about the people who worked on package.el because they did a wonderful job at actually helping
+
+04:58.680 --> 05:02.840
+people like me access all those packages. So it's – I mean, the point of the video
+
+05:02.840 --> 05:10.840
+is not actually to dismiss the code. But I was kind of scared because I was like, if they
+
+05:10.840 --> 05:18.720
+write code like this for strings, then what about the rest of the code? Is it – so it
+
+05:18.720 --> 05:25.560
+was kind of – I mean, something that I really can't evaluate. But I'm like – I mean,
+
+05:25.560 --> 05:30.600
+those guys obviously are really smart and they're trying to make intelligent things
+
+05:30.600 --> 05:37.400
+about how they want to factor their code, et cetera. But if they do that for strings,
+
+05:37.400 --> 05:44.400
+which is quite simple actually – I mean, it's simple to mess up strings. So I was
+
+05:44.400 --> 05:50.320
+like, what about the rest of the code? Is it that complex or that difficult to understand?
+
+05:50.320 --> 05:56.000
+So that's kind of a put-off for me. I'm like, I really don't want to try to investigate
+
+05:56.000 --> 06:01.760
+that more because – plus it's not – it's really not my area at all. So anyway, that's
+
+06:01.760 --> 06:04.400
+what I wanted to add. Yeah.
+
+06:04.400 --> 06:11.680
+Awesome. Yeah, I think I pretty much agree with all of what you said.
+
+06:11.680 --> 06:17.360
+Yeah, yeah, yeah. I have a question – I see a question on the pad. I use Emacs on
+
+06:17.360 --> 06:23.520
+English, but my mother language is – no, no, no. Okay. So the answer is that Emacs
+
+06:23.520 --> 06:33.760
+is not localized. And my understanding is that right now it's not localizable. And
+
+06:33.760 --> 06:40.840
+those discussions took place about four or five years ago. So check on the dev list and
+
+06:40.840 --> 06:46.280
+you'll see the state of the discussion because there is only a discussion at the moment.
+
+06:46.280 --> 06:57.480
+What I did for package.el, I think it was really just a one-time attempt at fixing one package.
+
+06:57.480 --> 07:05.640
+And I did check the other – a number of other packages in core Emacs. And not a lot
+
+07:05.640 --> 07:12.280
+of them had – I mean, as far as I checked. And I really did not check everything. But
+
+07:12.280 --> 07:20.840
+basically what you have to do is check all the functions that impact strings. And some
+
+07:20.840 --> 07:28.600
+are really not user-facing strings, so they're not really interesting for us. And actually,
+
+07:28.600 --> 07:34.640
+that's really interesting to do that. So if you just take one Lisp package, Lisp code
+
+07:34.640 --> 07:40.480
+and just go through the thing and just check all of prin1, princ, message, format, concat
+
+07:40.480 --> 07:43.520
+and stuff and just see how it goes.
+
+07:43.520 --> 07:50.240
+So basically right now there is no infrastructure to localize the thing. There is no process
+
+07:50.240 --> 07:56.720
+to extract the strings. And there is no way to actually import them back into the code.
+
+07:56.720 --> 08:02.800
+So what we can do right now is really just what I did, make sure that it's eventually
+
+08:02.800 --> 08:10.760
+possible one day. And as I just showed, it's really not such a big deal. If you're very
+
+08:10.760 --> 08:19.800
+careful about understanding the way that the strings are handled, it's just a few rewrites
+
+08:19.800 --> 08:24.560
+away. I mean, it's really not much. So there's – I mean, there's not a lot to be proud
+
+08:24.560 --> 08:31.140
+about in my patch. But it was really fun. And I think it's a very good entry point
+
+08:31.140 --> 08:39.480
+for people like us. I suppose – I mean, I suppose the first person question. I mean,
+
+08:39.480 --> 08:44.240
+I don't know. Maybe I'm just – I should not suppose that. But people who really enjoy
+
+08:44.240 --> 08:51.320
+working in Emacs and just sometimes would like to contribute something and are not programmers
+
+08:51.320 --> 08:56.320
+or anything or maybe even programmers. I mean, I'm not excluding them. But that's really
+
+08:56.320 --> 09:02.280
+a good way to just start doing something. And eventually from there, you can – I mean,
+
+09:02.280 --> 09:07.020
+you just use a package that you like and that you think is important and just check the
+
+09:07.020 --> 09:10.200
+strings and do things like this. And then eventually, you'll find other parts of the
+
+09:10.200 --> 09:18.840
+code that you want to improve or add functions. So yeah, actually, the patch that I did, this
+
+09:18.840 --> 09:26.840
+patch is actually in the process of the thing that I started with Aquamacs. So I did one
+
+09:26.840 --> 09:35.600
+little thing regarding those that were not fully integrated in macOS. And then I did
+
+09:35.600 --> 09:41.880
+something about a small function. I think I added the possibility to add an option.
+
+09:41.880 --> 09:48.960
+I did documentation improvement as well. So really just little things. And then the deeper
+
+09:48.960 --> 09:53.000
+you dive, the more interesting it gets. And then you find something that you really want
+
+09:53.000 --> 10:07.160
+to do. So just use that entry point as a way to have fun in Emacs.
+
+10:07.160 --> 10:15.240
+Well, so I mentioned Regex on strings. Well, it's not really a red flag for localization.
+
+10:15.240 --> 10:28.080
+But the way it's used, I mean, I guess there are ways to properly use it. But I think really
+
+10:28.080 --> 10:38.400
+basically, using that means that you're making assumptions on the way language is
+
+10:38.400 --> 10:45.800
+structured. And I made exactly the same mistake on a different project that I'm working on.
+
+10:45.800 --> 10:51.280
+Actually, I'm in charge of rewriting a manual. And we were using DocBook. And I just thought
+
+10:51.280 --> 10:57.240
+it would be smart to have automated links to parts of the chapters, et cetera. And the
+
+10:57.240 --> 11:01.240
+thing is that depending on the language, you've got different ways to introduce chapters.
+
+11:01.240 --> 11:10.540
+So I should know that. I should know that. You should not automatically insert strings
+
+11:10.540 --> 11:20.720
+in code because it's going to produce something that can't be handled by the translator. So
+
+11:20.720 --> 11:28.840
+basically Regex on strings is something that probably you might use. But if you see, I
+
+11:28.840 --> 11:33.320
+mean, you can see the way it was used in the original code. So if you see something like
+
+11:33.320 --> 11:39.360
+that, I mean, just don't run and just fix the thing because there is no way these can
+
+11:39.360 --> 11:44.920
+be localized, I mean, extracted properly and then localized. And that's the reason too
+
+11:44.920 --> 11:50.480
+why numbers are a big problem because, for example, in English but in French too, we
+
+11:50.480 --> 11:56.920
+have only singular forms and plural forms. But some languages have zero forms. Some languages
+
+11:56.920 --> 12:03.720
+have two forms like pair forms. Some languages don't have a different form for anything.
+
+12:03.720 --> 12:09.920
+For example, I live in Japan. I work in Japanese. And in Japanese, you don't have a form. You
+
+12:09.920 --> 12:16.640
+don't have different inflections for words based on their number. So saying one whatever
+
+12:16.640 --> 12:23.400
+or two whatevers or an infinity of whatevers or even zero whatever, it's just the same
+
+12:23.400 --> 12:28.480
+form. So making assumptions on the number of things and the way it's expressed in the language
+
+12:28.480 --> 12:34.640
+is usually, and that's something that we already know in free software. I mean, if you check
+
+12:34.640 --> 12:40.060
+the gettext library, they've got everything sorted out. And that's something that was
+
+12:40.060 --> 12:46.880
+created in the 90s at Sun Microsystems. And then it was freed, et cetera. But when you
+
+12:46.880 --> 12:52.560
+see the work that it did at the time, you would kind of expect that people understand
+
+12:52.560 --> 12:58.920
+that. But no. And that's OK because developers develop and localizers localize. So we kind
+
+12:58.920 --> 13:04.820
+of split. But everything has been done already. So we just have to be aware of what's being
+
+13:04.820 --> 13:11.720
+done. And we have to be aware of the rules. And I think of one very good set of rules
+
+13:11.720 --> 13:19.880
+that's been online for a while. It's the World Wide Web Consortium. They have a really good internationalization
+
+13:19.880 --> 13:26.640
+page where everything is pretty much black on white on paper, on the web at least. And
+
+13:26.640 --> 13:31.960
+if you read that, you can see exactly what should be done for localization, what should
+
+13:31.960 --> 13:35.880
+not be done, what should be avoided at all costs, et cetera, et cetera.
+
+13:35.880 --> 13:44.440
+So there are plenty of references here and there. And in terms of software localization,
+
+13:44.440 --> 13:49.980
+it's the same. If you check the gettext page, you should be able to get an idea of what
+
+13:49.980 --> 13:59.240
+should be good. So is my project to localize all of Emacs? I wish it were. Eventually I'll
+
+13:59.240 --> 14:05.160
+be rich. Hopefully. I don't know. I'm working on that. It's not working well. But the day
+
+14:05.160 --> 14:11.540
+I can take just one year off totally and focus on that, I think that's something I would
+
+14:11.540 --> 14:18.760
+love to work on and just get up to speed with the process of programming all the things,
+
+14:18.760 --> 14:23.080
+checking all the things, and organizing the infrastructure. But seriously, I don't think
+
+14:23.080 --> 14:31.240
+that will ever happen because I'm a poor translator. And I still have, what, like 20 years to go
+
+14:31.240 --> 14:40.560
+before I can't work anymore. And we don't have savings or anything with the corona shit.
+
+14:40.560 --> 14:47.560
+So I don't think that's ever going to happen. But I would love to help. And yes, yes. How
+
+14:47.560 --> 14:53.480
+deep would useful localization go? Because the core of Emacs are doc strings and localization.
+
+14:53.480 --> 15:00.280
+Yes, yes, yes. I mean, all those discussions have been made. I mean, no conclusion reached.
+
+15:00.280 --> 15:07.880
+But we have addressed those things on the discussions. And so just, I mean, it's really
+
+15:07.880 --> 15:13.560
+pretentious to say, check my name on the Emacs devel list because I've talked about that.
+
+15:13.560 --> 15:18.680
+It's really pretentious. But that's not what I'm saying. I mean, there has been a lot of
+
+15:18.680 --> 15:24.400
+discussion on the development list. So if you check for localization, translation, stuff
+
+15:24.400 --> 15:30.800
+like that, you'll see keywords, and you'll see the discussion. And people are aware of
+
+15:30.800 --> 15:36.440
+the issues. So I mean, we just need to have a framework for that.
+
+15:36.440 --> 15:40.120
+Thank you. Just to quickly chime in to say, I think we have about two more minutes of
+
+15:40.120 --> 15:45.800
+on stream Q&A. And then you're welcome to either stay here, Jean-Christophe, or continue
+
+15:45.800 --> 15:48.800
+taking questions on the pad or on IRC.
+
+15:48.800 --> 15:57.120
+I think, well, I got to go to work. So I need to get ready. But I think, unless we have
+
+15:57.120 --> 16:08.760
+something on IRC, I think we're good. If you find something else that I've not addressed,
+
+16:08.760 --> 16:19.840
+I'm good. Otherwise, yes, yes, yeah, we need to take all the C code. But I mean, you can
+
+16:19.840 --> 16:29.160
+decide the level down to which you want to work. So you can go all the way to the C code.
+
+16:29.160 --> 16:32.920
+But actually, the C code is actually easier to extract because there are all these gettext
+
+16:32.920 --> 16:40.280
+things that work on the C code already. So the issue is pretty much the Emacs Lisp
+
+16:40.280 --> 16:47.760
+code, as far as I can understand. So that would be the process that we need to address.
+
+16:47.760 --> 16:56.800
+Doc strings, indeed. But then the doc strings and the manual, they are very close. And actually,
+
+16:56.800 --> 17:03.560
+yeah, my estimate of the 500,000 words, I think it was based on doc strings. So yeah, we need
+
+17:03.560 --> 17:09.760
+to take all that. And that's an ongoing project that's not going to go away anyway. So we'll
+
+17:09.760 --> 17:12.760
+be here 10 years from now, I'm sure.
+
+17:12.760 --> 17:17.680
+OK, cool. And yeah, I think that's about all the time that we have on the stream. I guess
+
+17:17.680 --> 17:21.720
+if folks have further questions, they could maybe reach out to you later on IRC or via
+
+17:21.720 --> 17:22.720
+email.
+
+17:22.720 --> 17:29.640
+And I'll be back on the development list shortly, maybe six months from now. So yeah, I can
+
+17:29.640 --> 17:30.640
+take it from there.
+
+17:30.640 --> 17:31.640
+Sounds great.
+
+17:31.640 --> 17:32.640
+Thank you very much.
+
+17:32.640 --> 17:33.640
+Thank you very much.
+
+17:33.640 --> 17:34.640
+Yeah, thanks again for your great talk. Cheers.
+
+17:34.640 --> 17:35.640
+Cheers.
+
+17:35.640 --> 17:56.640
+OK, bye.
+