WEBVTT 00:00.000 --> 00:09.260 Excellent. Thank you for the great talk. As someone whose first language wasn't English 00:09.260 --> 00:14.960 and speaks other languages, I think localization and internationalization is a very important 00:14.960 --> 00:20.920 topic that's near and dear to my heart, and especially when it comes to Emacs. I think 00:20.920 --> 00:26.700 there's a lot that we could do better. So, yeah, thanks so much. Folks, if you have questions, 00:26.700 --> 00:32.880 you can post them on IRC or on the pad, and Jean-Christophe will answer them, and we will also open up 00:32.880 --> 00:37.600 this BigBlueButton room for people who would like to join here and ask their questions 00:37.600 --> 00:45.760 directly. Jean-Christophe, please take it away. Okay, thank you. I'm not seeing much activity 00:45.760 --> 00:55.920 on IRC or the pad, so let me add a few things. First, that patch was really interesting in 00:55.920 --> 01:03.680 terms of actually getting into the code and understanding how a beginner can really join 01:03.680 --> 01:11.080 development, even if it's just a few lines. I mentioned in the first part of the presentation 01:11.080 --> 01:17.600 that there was this small integration bug with Mac, and that's the thing that actually 01:17.600 --> 01:22.400 got me started, and that was interesting because at the time I was trying to use Aquamacs because 01:22.400 --> 01:28.280 it looked simpler, and I thought, okay, if I need to fix that, rather than fixing it 01:28.280 --> 01:34.400 in Aquamacs, maybe I should just go to Emacs and fix it there. So, that was the first attempt 01:34.400 --> 01:40.440 for me to actually contribute something serious, and it was really nice to – I mean, this 01:40.440 --> 01:47.160 Emacs development list is really amazing. 
99% of the discussion is just way above your 01:47.160 --> 01:54.120 head, but sometimes you grasp something, and the more you grasp it, the more you understand 01:54.120 --> 02:00.600 and the more you feel like you can actually do something, especially since – I mean, 02:00.600 --> 02:06.640 as for all the free software development projects, most of them, I guess, it's really a "just do 02:06.640 --> 02:13.920 it" kind of thing. And if you try to do something, somebody's going to help you, and what I 02:13.920 --> 02:21.200 really enjoy when being there is that the people are always very nice. Sometimes you 02:21.200 --> 02:28.080 feel some tension when there are discussions about a specific topic, but it's – everybody 02:28.080 --> 02:37.520 is really polite, I mean, 99% of the time. And what I like the most is all the people 02:37.520 --> 02:42.680 are very strongly opinionated, so they have a very good idea of what Emacs should be or 02:42.680 --> 02:47.640 should not be, and so it gives you a very good idea of in what direction you should 02:47.640 --> 02:57.400 go. So that experience – I mean, pretty much those 2017, 2018 years were until now 02:57.400 --> 03:02.040 the peak of my Emacs activity. I've had to cut back on that because I was busy with 03:02.040 --> 03:07.160 other things, but I'm really planning to go back to working on maybe not localization 03:07.160 --> 03:13.480 because it's really – it's too big for me right now. And what I was told is that 03:13.480 --> 03:20.520 it involved a bit of C programming and things like this, so I'm not really into that right 03:20.520 --> 03:30.840 now. But I think eventually one day – I just turned 53, so I guess in a few years 03:30.840 --> 03:36.800 from now when I have more time, I guess I'll just dive in and just work on those localization 03:36.800 --> 03:43.800 issues and really bring Emacs to a different world because I think it's – if we were 03:43.800 --> 03:49.920 able to have – it's a big job. 
I mean, it's really – if you check the threads 03:49.920 --> 03:55.400 on emacs-devel, check my name, you will see that I mostly post on translation or localization 03:55.400 --> 04:01.360 issues at least at the time. And I did an estimate of the sheer volume of strings to 04:01.360 --> 04:10.360 translate. For example, the manuals were about 2 million words. That's big. That's big. 04:10.360 --> 04:14.040 But it's okay. I mean, it's not something that's impossible. And if you check the strings 04:14.040 --> 04:20.160 – that was a really rough estimate. If you check the strings for Emacs proper, not even 04:20.160 --> 04:29.120 talking about the packages and things, I think that would add probably like 500,000 words. 04:29.120 --> 04:34.360 I mean, I have no idea, but my very rough estimate would be that. So it's not something 04:34.360 --> 04:41.120 that's impossible to do. And we'd have to ensure that we have a good process for people 04:41.120 --> 04:46.200 who review the strings and contribute new strings and things like this and also best 04:46.200 --> 04:53.560 practices like what I tried to show in this video. And I was really not trying to be dismissive 04:53.560 --> 04:58.680 about the people who worked on package.el because they did a wonderful job at actually helping 04:58.680 --> 05:02.840 people like me access all those packages. So it's – I mean, the point of the video 05:02.840 --> 05:10.840 is not really to dismiss the code. But I was kind of scared because I was like, if they 05:10.840 --> 05:18.720 write code like this for strings, then what about the rest of the code? Is it – so it 05:18.720 --> 05:25.560 was kind of – I mean, something that I really can't evaluate. But I'm like – I mean, 05:25.560 --> 05:30.600 those guys obviously are really smart and they're trying to make intelligent decisions 05:30.600 --> 05:37.400 about how they want to factor their code, et cetera. 
But if they do that for strings, 05:37.400 --> 05:44.400 which is quite simple actually – I mean, it's simple to mess up strings. So I was 05:44.400 --> 05:50.320 like, what about the rest of the code? Is it that complex or that difficult to understand? 05:50.320 --> 05:56.000 So that's kind of a put-off for me. I'm like, I really don't want to try to envisage 05:56.000 --> 06:01.760 that more because – plus it's not – it's really not my area at all. So anyway, that's 06:01.760 --> 06:04.400 what I wanted to add. Yeah. 06:04.400 --> 06:11.680 Awesome. Yeah, I think I pretty much agree with all of what you said. 06:11.680 --> 06:17.360 Yeah, yeah, yeah. I have a question – I see a question on the pad. I use Emacs in 06:17.360 --> 06:23.520 English, but my mother language is – no, no, no. Okay. So the answer is that Emacs 06:23.520 --> 06:33.760 is not localized. And my understanding is that right now it's not localizable. And 06:33.760 --> 06:40.840 those discussions took place about four or five years ago. So check on the emacs-devel list and 06:40.840 --> 06:46.280 you'll see the state of the discussion because there is only a discussion at the moment. 06:46.280 --> 06:57.480 What I did for package.el, I think it was really just a one-time attempt at fixing one package. 06:57.480 --> 07:05.640 And I did check the other – a number of other packages in core Emacs. And not a lot 07:05.640 --> 07:12.280 of them had – I mean, as far as I checked. And I really did not check everything. But 07:12.280 --> 07:20.840 basically what you have to do is check all the functions that impact strings. And some 07:20.840 --> 07:28.600 are really not user-facing strings, so they're not really interesting for us. And actually, 07:28.600 --> 07:34.640 that's really interesting to do that. 
So if you just take one Lisp package, Lisp code 07:34.640 --> 07:40.480 and just go through the thing and just check all of prin1, princ, message, format, concat 07:40.480 --> 07:43.520 and stuff and just see how it goes. 07:43.520 --> 07:50.240 So basically right now there is no infrastructure to localize the thing. There is no process 07:50.240 --> 07:56.720 to extract the strings. And there is no way to actually import them back into the code. 07:56.720 --> 08:02.800 So what we can do right now is really just what I did, make sure that it's eventually 08:02.800 --> 08:10.760 possible one day. And as I just showed, it's really not such a big deal. If you're very 08:10.760 --> 08:19.800 careful about understanding the way that the strings are handled, it's just a few rewrites 08:19.800 --> 08:24.560 away. I mean, it's really not much. So there's – I mean, there's not a lot to be proud 08:24.560 --> 08:31.140 about in my patch. But it was really fun. And I think it's a very good entry point 08:31.140 --> 08:39.480 for people like us. I suppose – I mean, I suppose the first person question. I mean, 08:39.480 --> 08:44.240 I don't know. Maybe I'm just – I should not suppose that. But people who really enjoy 08:44.240 --> 08:51.320 working in Emacs and just sometimes would like to contribute something and are not programmers 08:51.320 --> 08:56.320 or anything or maybe even programmers. I mean, I'm not excluding them. But that's really 08:56.320 --> 09:02.280 a good way to just start doing something. And eventually from there, you can – I mean, 09:02.280 --> 09:07.020 you just use a package that you like and that you think is important and just check the 09:07.020 --> 09:10.200 strings and do things like this. And then eventually, you'll find other parts of the 09:10.200 --> 09:18.840 code that you want to improve or add functions. 
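The string audit described above (going through calls such as format and concat and looking at how user-facing messages are built) mostly comes down to one question: does the translator get to see a whole sentence, or only fragments? The sketch below illustrates that difference in Python rather than Emacs Lisp, purely as an assumption-laden illustration. The function names, the message text, and the archive name used as sample data are hypothetical and are not taken from package.el or from the patch discussed in the talk.

```python
# Hypothetical illustration; none of these names or strings come from package.el.

def status_concatenated(count: int, name: str) -> str:
    # Anti-pattern: the sentence is assembled from fragments, so an extraction
    # tool would only ever see "Installed ", " package", "s" and " from ".
    plural = "" if count == 1 else "s"
    return "Installed " + str(count) + " package" + plural + " from " + name


def status_template(count: int, name: str, template: str) -> str:
    # Preferred: one complete sentence with named placeholders, which a
    # translator can reorder to fit the target language.
    return template.format(count=count, name=name)


ENGLISH = "Installed {count} packages from {name}"
# A translation may need the placeholders in a different order.
REORDERED = "From {name}: {count} packages installed"

print(status_concatenated(3, "GNU ELPA"))
print(status_template(3, "GNU ELPA", ENGLISH))
print(status_template(3, "GNU ELPA", REORDERED))
```

The template version still hard-codes the English plural, which is a separate problem; it only shows why a single extractable string is easier to localize than a concatenation.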
So yeah, actually, the patch that I did, this 09:18.840 --> 09:26.840 patch is actually in the process of the thing that I started with Aquamacs. So I did one 09:26.840 --> 09:35.600 little thing regarding those that were not fully integrated in macOS. And then I did 09:35.600 --> 09:41.880 something about a small function. I think I added the possibility to add an option. 09:41.880 --> 09:48.960 I did documentation improvement as well. So really just little things. And then the deeper 09:48.960 --> 09:53.000 you dive, the more interesting it gets. And then you find something that you really want 09:53.000 --> 10:07.160 to do. So just use that entry point as a way to have fun in Emacs. 10:07.160 --> 10:15.240 Well, so I mentioned regex on strings. Well, it's not really a red flag for localization. 10:15.240 --> 10:28.080 But the way it's used, I mean, I guess there are ways to properly use it. But I think really, 10:28.080 --> 10:38.400 basically, using that means that you're making assumptions on the way language is 10:38.400 --> 10:45.800 structured. And I made exactly the same mistake on a different project that I'm working on. 10:45.800 --> 10:51.280 Actually, I'm in charge of rewriting a manual. And we were using DocBook. And I just thought 10:51.280 --> 10:57.240 it would be smart to have automated links to parts of the chapters, et cetera. And the 10:57.240 --> 11:01.240 thing is that depending on the language, you've got different ways to introduce chapters. 11:01.240 --> 11:10.540 So I should know that. I should know that. You should not automatically insert strings 11:10.540 --> 11:20.720 in code because it's going to produce something that can't be handled by the translator. So 11:20.720 --> 11:28.840 basically regex on strings is something that probably you might use. But if you see, I 11:28.840 --> 11:33.320 mean, you can see the way it was used in the original code. 
So if you see something like 11:33.320 --> 11:39.360 that, I mean, just don't run and just fix the thing because there is no way these can 11:39.360 --> 11:44.920 be localized, I mean, extracted properly and then localized. And that's the reason too 11:44.920 --> 11:50.480 why numbers are a big problem because, for example, in English but in French too, we 11:50.480 --> 11:56.920 have only singular forms and plural forms. But some languages have zero forms. Some languages 11:56.920 --> 12:03.720 have two forms like pair forms. Some languages don't have a different form for anything. 12:03.720 --> 12:09.920 For example, I live in Japan. I work in Japanese. And in Japanese, you don't have a form. You 12:09.920 --> 12:16.640 don't have different inflections for words based on their number. So saying one whatever 12:16.640 --> 12:23.400 or two whatevers or an infinity of whatevers or even zero whatever, it's just the same 12:23.400 --> 12:28.480 form. So making assumptions on the number of things and the way it's expressed in the language 12:28.480 --> 12:34.640 is usually, and that's something that we already know in free software. I mean, if you check 12:34.640 --> 12:40.060 the gettext library, they've got everything sorted out. And that's something that was 12:40.060 --> 12:46.880 created in the 90s at Sun Microsystems. And then it was freed, et cetera. But when you 12:46.880 --> 12:52.560 see the work that they did at the time, you would kind of expect that people understand 12:52.560 --> 12:58.920 that. But no. And that's OK because developers develop and localizers localize. So we kind 12:58.920 --> 13:04.820 of split. But everything has been done already. So we just have to be aware of what's being 13:04.820 --> 13:11.720 done. And we have to be aware of the rules. And I think of one very good set of rules 13:11.720 --> 13:19.880 that's been online for a while. It's the World Wide Web Consortium. 
They have a really good internationalization 13:19.880 --> 13:26.640 page where everything is pretty much black on white on paper, on the web at least. And 13:26.640 --> 13:31.960 if you read that, you can see exactly what should be done for localization, what should 13:31.960 --> 13:35.880 not be done, what should be avoided at all costs, et cetera, et cetera. 13:35.880 --> 13:44.440 So there are plenty of references here and there. And in terms of software localization, 13:44.440 --> 13:49.980 it's the same. If you check the gettext page, you should be able to get an idea of what 13:49.980 --> 13:59.240 should be good. So is my project to localize all of Emacs? I wish it were. Eventually I'll 13:59.240 --> 14:05.160 be rich. Hopefully. I don't know. I'm working on that. It's not working well. But the day 14:05.160 --> 14:11.540 I can take just one year off totally and focus on that, I think that's something I would 14:11.540 --> 14:18.760 love to work on and just get up to speed with the process of programming all the things, 14:18.760 --> 14:23.080 checking all the things, and organizing the infrastructure. But seriously, I don't think 14:23.080 --> 14:31.240 that will ever happen because I'm a poor translator. And I still have, what, like 20 years to go 14:31.240 --> 14:40.560 before I can't work anymore. And we don't have savings or anything with the corona shit. 14:40.560 --> 14:47.560 So I don't think that's ever going to happen. But I would love to help. And yes, yes. How 14:47.560 --> 14:53.480 deep would useful localization go? Because the core of Emacs are docstrings and localization. 14:53.480 --> 15:00.280 Yes, yes, yes. I mean, all those discussions have been had. I mean, no conclusion reached. 15:00.280 --> 15:07.880 But we have addressed those things in the discussions. And so just, I mean, it's really 15:07.880 --> 15:13.560 pretentious to say, check my name on the Emacs devel list because I've talked about that. 
15:13.560 --> 15:18.680 It's really pretentious. But that's not what I'm saying. I mean, there has been a lot of 15:18.680 --> 15:24.400 discussion on the development list. So if you check for localization, translation, stuff 15:24.400 --> 15:30.800 like that, you'll see keywords, and you'll see the discussion. And people are aware of 15:30.800 --> 15:36.440 the issues. So I mean, we just need to have a framework for that. 15:36.440 --> 15:40.120 Thank you. Just to quickly chime in to say, I think we have about two more minutes of 15:40.120 --> 15:45.800 on-stream Q&A. And then you're welcome to either stay here, Jean-Christophe, or continue 15:45.800 --> 15:48.800 taking questions on the pad or on IRC. 15:48.800 --> 15:57.120 I think, well, I got to go to work. So I need to get ready. But I think, unless we have 15:57.120 --> 16:08.760 something on IRC, I think we're good. If you find something else that I've not addressed, 16:08.760 --> 16:19.840 I'm good. Otherwise, yes, yes, yeah, we need to take all the C code. But I mean, you can 16:19.840 --> 16:29.160 decide the level down to which you want to work. So you can go all the way to the C code. 16:29.160 --> 16:32.920 But actually, the C code is actually easier to extract because there are all these gettext 16:32.920 --> 16:40.280 things that work on the C code already. So the issue is pretty much the Emacs Lisp 16:40.280 --> 16:47.760 code, as far as I can understand. So that would be the process that we need to address. 16:47.760 --> 16:56.800 Docstrings, indeed. But then the docstrings and the manual, they are very close. And actually, 16:56.800 --> 17:03.560 yeah, my estimate of the 500,000 words, I think it was based on docstrings. So yeah, we need 17:03.560 --> 17:09.760 to take all that. And that's an ongoing project that's not going to go away anyway. So we'll 17:09.760 --> 17:12.760 be here 10 years from now, I'm sure. 17:12.760 --> 17:17.680 OK, cool. 
And yeah, I think that's about all the time that we have on the stream. I guess 17:17.680 --> 17:21.720 if folks have further questions, they could maybe reach out to you later on IRC or via 17:21.720 --> 17:22.720 email. 17:22.720 --> 17:29.640 And I'll be back on the development list shortly, maybe six months from now. So yeah, I can 17:29.640 --> 17:30.640 take it from there. 17:30.640 --> 17:31.640 Sounds great. 17:31.640 --> 17:32.640 Thank you very much. 17:32.640 --> 17:33.640 Thank you very much. 17:33.640 --> 17:34.640 Yeah, thanks again for your great talk. Cheers. 17:34.640 --> 17:35.640 Cheers. 17:35.640 --> 17:56.640 OK, bye.
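The point made during the Q&A about number forms (English and French distinguish singular and plural, some languages have more forms, Japanese has a single form for every count) is the case that gettext-style plural handling is designed for. The following is a minimal sketch using Python's standard gettext module, not anything that exists in Emacs today. The SingleFormTranslations class and the message strings are hypothetical stand-ins for a real translation catalog, which would normally declare its plural rule in the catalog header instead.

```python
import gettext

# With no catalog loaded, NullTranslations falls back to the English rule:
# the singular string when n == 1, the plural string otherwise.
english = gettext.NullTranslations()


class SingleFormTranslations(gettext.NullTranslations):
    """Hypothetical stand-in for a catalog of a language with one number form."""

    def ngettext(self, msgid1, msgid2, n):
        # A real catalog would look the message up; here one form is
        # returned for every count, as in Japanese.
        return "{n} 個のパッケージ"


def describe(translations, n):
    # The code asks for "the right form for n" and never builds the plural itself.
    template = translations.ngettext("{n} package", "{n} packages", n)
    return template.format(n=n)


for count in (0, 1, 2):
    print(describe(english, count), "|", describe(SingleFormTranslations(), count))
```

The calling code is identical for both languages; only the catalog decides how many forms exist, which is why hand-written "add an s" logic cannot be localized.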