path: root/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt
author: Sacha Chua <sacha@sachachua.com> 2022-12-04 16:00:30 -0500
committer: Sacha Chua <sacha@sachachua.com> 2022-12-04 16:00:30 -0500
commit: 57ed51229a2531f7307fa6fc0864ca0694cd1c39
tree: b9b1025266619d9e4823a77a6e1b986c85619231 /2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt
parent: ea491226615582119fa22cb3fe7297ab2f5d8de3
Automated commit
Diffstat (limited to '2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt')
-rw-r--r-- 2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt 726
1 files changed, 726 insertions, 0 deletions
diff --git a/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt b/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt
new file mode 100644
index 00000000..a86af897
--- /dev/null
+++ b/2022/captions/emacsconf-2022-localizing--prelocalizing-emacs--jeanchristophe-helary--main.vtt
@@ -0,0 +1,726 @@
+WEBVTT captioned by brandelune and bhavin192
+
+NOTE Introduction
+
+00:00.000 --> 00:00:05.400
+Hello everyone, I am Jean-Christophe Helary,
+
+00:00:05.400 --> 00:00:09.680
+I live in Japan, and I'm a translator.
+
+00:09.680 --> 00:00:12.633
+Here is my second presentation on this very
+
+00:00:12.633 --> 00:00:15.300
+prestigious stage that is the Emacs conference.
+
+00:00:15.300 --> 00:00:18.367
+Following my "Let's Translate the 2 million words
+
+00:00:18.367 --> 00:00:21.767
+in the Emacs manual" in 2021, my topic this year,
+
+00:00:21.767 --> 00:00:25.167
+always related to translation, is
+
+00:00:25.167 --> 00:00:28.400
+pre-localizing Emacs or much less pretentiously,
+
+00:00:28.400 --> 00:00:31.933
+"Just make sure that your strings don't mix up plurals".
+
+NOTE Usage of package.el
+
+00:00:31.933 --> 00:00:36.133
+So, for some reason I resumed Emacs use
+
+00:00:36.133 --> 00:00:39.940
+around 2016, and as I was rediscovering the thing
+
+00:00:39.940 --> 00:00:42.800
+I found really old outline-mode files here
+
+00:00:42.800 --> 00:00:44.033
+and there on my machine.
+
+00:00:44.033 --> 00:00:45.140
+And I started to experiment
+
+00:00:45.140 --> 00:00:47.167
+again and write again with Emacs.
+
+00:00:47.167 --> 00:00:48.564
+I think that at the time,
+
+00:00:48.564 --> 00:00:50.433
+I was coming from Aquamacs and because of
+
+00:00:50.433 --> 00:00:53.400
+an integration bug with macOS, I decided
+
+00:00:53.400 --> 00:00:55.440
+to check what was going on in the code.
+
+00:55.440 --> 00:00:59.040
+That was my first official contribution.
+
+NOTE The bug in strings
+
+00:59.040 --> 00:01:02.233
+So as I was happily installing and uninstalling
+
+00:01:02.233 --> 00:01:05.267
+things, I noticed something weird one day.
+
+00:01:05.267 --> 00:01:09.080
+Let me enlarge that picture.
+
+01:09.080 --> 00:01:12.400
+See? And even if I were not a translator,
+
+00:01:12.400 --> 00:01:14.960
+I would not like that string, and obviously
+
+01:14.960 --> 00:01:16.833
+the same bug bites you when the string
+
+00:01:16.833 --> 00:01:20.520
+tells you to erase the package.
+
+01:20.520 --> 00:01:26.720
+Boom, so we agree that we have a problem here.
+
+NOTE Natural language engineering
+
+01:26.720 --> 00:01:29.067
+So, I started to do some spelunking into the code,
+
+00:01:29.067 --> 00:01:31.067
+and at least that was my feeling
+
+00:01:31.067 --> 00:01:33.100
+because I really am not a programmer
+
+00:01:33.100 --> 00:01:37.240
+by any stretch of the imagination.
+
+01:37.240 --> 00:01:39.467
+And what I found was an amazing piece of
+
+00:01:39.467 --> 00:01:41.840
+natural language engineering that was mixing code
+
+01:41.840 --> 00:01:44.267
+with English suffixes and all that,
+
+00:01:44.267 --> 00:01:46.267
+and I could see that the people who had
+
+00:01:46.267 --> 00:01:47.767
+written that code were pretty smart,
+
+00:01:47.767 --> 00:01:49.533
+but had missed a number of edge cases
+
+00:01:49.533 --> 00:01:51.280
+that produced the above bugs.
+
+01:51.280 --> 00:01:53.500
+That was my first experience with
+
+00:01:53.500 --> 00:01:55.033
+all the message related functions,
+
+00:01:55.033 --> 00:01:58.360
+"format", "concat", "message", etc.
+
+01:58.360 --> 00:02:00.433
+But even with my beginner's eyes I could see that
+
+00:02:00.433 --> 00:02:03.040
+something was off because when you want
+
+02:03.040 --> 00:02:06.000
+to produce natural language strings you never ever
+
+00:02:06.000 --> 00:02:08.600
+should use "replace-regexp-in-string" to
+
+02:08.600 --> 00:02:11.067
+add an "ing" or an "ed" suffix
+
+00:02:11.067 --> 00:02:12.980
+to change the mode of a sentence.
+
+02:12.980 --> 00:02:16.840
+But that's what I was seeing was happening.
+
+NOTE More than a missed plural
+
+02:16.840 --> 00:02:20.333
+So, what we had to deal with here
+
+00:02:20.333 --> 00:02:22.220
+was way more than just a missed plural.
+
+02:22.220 --> 00:02:24.000
+It was an attempt at engineering all
+
+00:02:24.000 --> 00:02:26.400
+the message strings destined to the user
+
+00:02:26.400 --> 00:02:28.567
+with the smart code that was making assumptions
+
+00:02:28.567 --> 00:02:30.067
+on the structure of words,
+
+00:02:30.067 --> 00:02:33.220
+and in the localization world that's a big no-no.
+
+02:33.220 --> 00:02:36.667
+I'm a translator, and such UI strings issues
+
+00:02:36.667 --> 00:02:38.433
+have been sorted out decades ago.
+
+00:02:38.433 --> 00:02:41.320
+So I was a bit shocked.
+
+NOTE The final patch
+
+02:41.320 --> 00:02:43.533
+The final patch took me about a year to write,
+
+00:02:43.533 --> 00:02:45.380
+because I'm slow, because I needed to verify
+
+02:45.380 --> 00:02:47.167
+and understand a lot, because there are
+
+00:02:47.167 --> 00:02:49.100
+plenty of rules and plenty of people who are
+
+00:02:49.100 --> 00:02:51.433
+explaining you very nicely what the rules are,
+
+00:02:51.433 --> 00:02:53.733
+because I have kids, and because the
+
+00:02:53.733 --> 00:02:55.600
+Emacs development list is such a cool place to be
+
+00:02:55.600 --> 00:02:58.560
+that you often forget why you're there sometimes.
+
+02:58.560 --> 00:03:01.800
+Anyway, for people who can't click on a video,
+
+00:03:01.800 --> 00:03:03.640
+and I can't either, here are the relevant
+
+03:03.640 --> 00:03:05.840
+parts with some short comments.
+
+03:05.840 --> 00:03:07.800
+I'll be talking with localization in mind,
+
+00:03:07.800 --> 00:03:09.640
+knowing full well that Emacs localization
+
+03:09.640 --> 00:03:12.800
+is not on the map at the moment.
+
+03:12.800 --> 00:03:14.167
+So first, there is this thing
+
+00:03:14.167 --> 00:03:15.520
+about "format" and "concat".
+
+03:15.520 --> 00:03:17.800
+And if I remember correctly,
+
+00:03:17.800 --> 00:03:20.300
+"format" is better for user-facing things,
+
+00:03:20.300 --> 00:03:25.160
+and "concat" is better for internal things.
+
+03:25.160 --> 00:03:26.800
+Here, there are two things.
+
+03:26.800 --> 00:03:28.800
+First, a rule that we have when we prepare
+
+00:03:28.800 --> 00:03:30.700
+strings that need to be localized is
+
+00:03:30.700 --> 00:03:33.333
+never ever make assumptions on the way
+
+00:03:33.333 --> 00:03:35.780
+numbers are expressed in the language.
+
+03:35.780 --> 00:03:37.067
+Here, the assumption is that
+
+00:03:37.067 --> 00:03:40.000
+we have either a singular or plural form,
+
+00:03:40.000 --> 00:03:42.040
+and that's not always the case.
+
+03:42.040 --> 00:03:44.067
+That usually means that you should externalize
+
+00:03:44.067 --> 00:03:48.280
+numbers and find a generic way to express them.
+
+03:48.280 --> 00:03:50.833
+So it makes for slightly less natural
+
+00:03:50.833 --> 00:03:54.400
+language strings, but it's better anyway.
+
+03:54.400 --> 00:03:56.667
+Then we have that comma there that's trying
+
+00:03:56.667 --> 00:03:58.167
+to be externalized and that's weird,
+
+00:03:58.167 --> 00:04:02.620
+so I put it back into the sentence.
+
+04:02.620 --> 00:04:04.967
+Here we have another construct, or two rather,
+
+00:04:04.967 --> 00:04:06.960
+that really should not be used like this.
+
+04:06.960 --> 00:04:10.033
+It's "prin1" that uses quoting characters,
+
+00:04:10.033 --> 00:04:12.480
+just like "print", and "princ" that does not.
+
+04:12.480 --> 00:04:15.400
+And you see why they were combined together.
+
+04:15.400 --> 00:04:17.133
+And they were both trying to be really smart
+
+00:04:17.133 --> 00:04:19.780
+about which article to put in front of a vowel.
+
+04:19.780 --> 00:04:20.960
+And you just don't do that.
+
+04:20.960 --> 00:04:25.000
+You just keep things simple.
+
+04:25.000 --> 00:04:26.633
+Here again, the code is trying to be smart,
+
+00:04:26.633 --> 00:04:28.480
+but it's really not much more efficient than
+
+04:28.480 --> 00:04:34.940
+plainly stating what you want.
+
+04:34.940 --> 00:04:36.500
+And here again, we have "concat" things
+
+00:04:36.500 --> 00:04:40.367
+that we could just use to plainly state
+
+00:04:40.367 --> 00:04:41.980
+what we want to state.
+
+04:41.980 --> 00:04:49.880
+So, instead of "concat" I just put a "message".
+
+04:49.880 --> 00:04:52.260
+And here we have something that's very cute.
+
+04:52.260 --> 00:04:54.540
+It's a computerized plural.
+
+04:54.540 --> 00:04:55.700
+Here again, assuming that
+
+00:04:55.700 --> 00:04:58.640
+there are only plural or singular forms.
+
+04:58.640 --> 00:05:00.867
+But the end string is not that much more natural
+
+00:05:00.867 --> 00:05:02.700
+than the fix, the code is less efficient
+
+00:05:02.700 --> 00:05:07.760
+and is harder to understand.
+
+05:07.760 --> 00:05:09.433
+Here again, the code is trying to make
+
+00:05:09.433 --> 00:05:13.520
+smart things where it could be much simpler.
+
+05:13.520 --> 00:05:14.667
+That is the part where you get the
+
+00:05:14.667 --> 00:05:19.480
+number of packages and their names.
+
+05:19.480 --> 00:05:22.067
+Here the whole sentence with the semicolons
+
+00:05:22.067 --> 00:05:26.333
+and the question mark is split in parts,
+
+00:05:26.333 --> 00:05:29.180
+between which something will be inserted.
+
+05:29.180 --> 00:05:34.240
+That's really ugly and difficult to read.
+
+05:34.240 --> 00:05:37.700
+Here again, another "ing" waiting to be
+
+00:05:37.700 --> 00:05:44.840
+regex-inserted into the code.
+
+05:44.840 --> 00:05:46.633
+And here at last, we get to the point
+
+00:05:46.633 --> 00:05:48.760
+where everything started.
+
+05:48.760 --> 00:05:50.833
+And you can see that unlike in the other spots,
+
+00:05:50.833 --> 00:05:52.400
+there is no possibility for the expression
+
+05:52.400 --> 00:05:54.680
+to be singular.
+
+05:54.680 --> 00:05:57.600
+So, I guess that if it hadn't been for that bug,
+
+00:05:57.600 --> 00:05:59.320
+I would not have found the other items,
+
+05:59.320 --> 00:06:01.033
+and we would be left with code that works,
+
+00:06:01.033 --> 00:06:02.033
+of course, but that is
+
+00:06:02.033 --> 00:06:06.020
+harder to understand, and maintain.
+
+06:06.020 --> 00:06:08.333
+Last but not least, a last version of
+
+00:06:08.333 --> 00:06:10.920
+"just plainly state what you mean to state".
+
+06:10.920 --> 00:06:14.880
+Keep it simple.
+
+NOTE "What did I learn, and how did I learn it?"
+
+06:14.880 --> 00:06:19.267
+So first, we have this wonderful CONTRIBUTE file
+
+00:06:19.267 --> 00:06:21.267
+that is very explicit about
+
+00:06:21.267 --> 00:06:23.520
+how we must proceed when contributing code.
+
+06:23.520 --> 00:06:25.233
+So, that's really the first place
+
+00:06:25.233 --> 00:06:27.760
+that we should all read.
+
+06:27.760 --> 00:06:29.333
+The README file is pretty cool too,
+
+00:06:29.333 --> 00:06:30.967
+especially at the beginning of the process,
+
+00:06:30.967 --> 00:06:31.867
+when you're not sure whether
+
+00:06:31.867 --> 00:06:36.240
+you want to fix that bug or just report it.
+
+NOTE Useful packages
+
+06:36.240 --> 00:06:37.920
+And then we've got packages.
+
+06:37.920 --> 00:06:39.900
+We've got a number of packages that are really
+
+00:06:39.900 --> 00:06:42.600
+helpful when it comes to reading
+
+00:06:42.600 --> 00:06:45.880
+the information and the manuals.
+
+06:45.880 --> 00:06:48.000
+I'm mentioning three of them here,
+
+00:06:48.000 --> 00:06:53.720
+and I think they are the most important for us.
+
+NOTE Package: helpful
+
+06:53.720 --> 00:06:55.600
+So "helpful" is on the right,
+
+00:06:55.600 --> 00:06:58.667
+and it's overflowing the window with
+
+00:06:58.667 --> 00:07:01.900
+all the contextualized information it provides,
+
+00:07:01.900 --> 00:07:05.280
+and the standard "help" is on the left.
+
+07:05.280 --> 00:07:07.933
+I mean, really there are like two or three
+
+00:07:07.933 --> 00:07:11.567
+screen-full of information in the "helpful" output,
+
+00:07:11.567 --> 00:07:13.233
+so you really only see a part,
+
+00:07:13.233 --> 00:07:16.320
+but I guess if you use it, you know what I'm saying.
+
+07:16.320 --> 00:07:18.867
+What I like the most here is the "view in manual"
+
+00:07:18.867 --> 00:07:21.800
+part, where you can actually click and even get
+
+00:07:21.800 --> 00:07:23.667
+more information that's sometimes
+
+00:07:23.667 --> 00:07:28.400
+easier to read and understand.
+
+NOTE Package: inform
+
+07:28.400 --> 00:07:33.640
+And then you've got the "info" versus "inform" formats.
+
+07:33.640 --> 00:07:34.567
+When you're in the manual,
+
+00:07:34.567 --> 00:07:37.140
+"inform" makes a huge difference.
+
+07:37.140 --> 00:07:39.367
+You can see here that you've got colorized items,
+
+00:07:39.367 --> 00:07:42.000
+and also in the middle you've got that
+
+07:42.000 --> 00:07:45.000
+'read' part that's green and bold.
+
+07:45.000 --> 00:07:49.333
+In "info" it's not a specific object,
+
+00:07:49.333 --> 00:07:52.200
+it's just a string. In 'inform' it's actually
+
+00:07:52.200 --> 00:07:53.800
+a link that you can click,
+
+00:07:53.800 --> 00:07:58.320
+and actually go to that 'read' manual page.
+
+NOTE Package: which-key
+
+07:58.320 --> 00:08:01.300
+Now, we've got "which-key".
+
+08:01.300 --> 00:08:03.400
+"which-key" is a savior for beginners too.
+
+08:03.400 --> 00:08:04.867
+Just wait half a second or something,
+
+00:08:04.867 --> 00:08:06.500
+and Emacs will show you all the keys
+
+00:08:06.500 --> 00:08:08.433
+that you can access from the prefix combination
+
+00:08:08.433 --> 00:08:09.920
+that you just typed.
+
+08:09.920 --> 00:08:13.200
+So, it's really helpful for discovering functions
+
+00:08:13.200 --> 00:08:19.160
+and learning new functions, getting used to them.
+
+NOTE It all started with this message…
+
+08:19.160 --> 00:08:21.500
+And so that whole process started…,
+
+00:08:21.500 --> 00:08:26.533
+it was May 23, 2017,
+
+00:08:26.533 --> 00:08:30.440
+with that thread when I found the bug.
+
+08:30.440 --> 00:08:32.800
+I just bumped into an English/code bug
+
+00:08:32.800 --> 00:08:36.920
+this morning. In package.el, when one package
+
+08:36.920 --> 00:08:39.033
+is not needed anymore, the message is:
+
+00:08:39.033 --> 00:08:41.300
+"Package menu: Operation finished.
+
+00:08:41.300 --> 00:08:44.880
+1 packages are no longer needed", etc.
+
+08:44.880 --> 00:08:49.633
+So, I was asking whether we had best practices
+
+00:08:49.633 --> 00:08:53.800
+for using messages, and we had a whole thread
+
+08:53.800 --> 00:08:57.867
+about that. And while I was discussing on that
+
+00:08:57.867 --> 00:09:01.240
+thread, I started that new thread, which is:
+
+09:01.240 --> 00:09:02.867
+"package.el strings".
+
+00:09:02.867 --> 00:09:09.900
+The whole thing actually ended on June 27, 2018.
+
+00:09:09.900 --> 00:09:15.400
+So, a year after, with that message from Noam
+
+00:09:15.400 --> 00:09:18.567
+telling me that "Yes I can close the bug,"
+
+00:09:18.567 --> 00:09:22.040
+and that was it.
+
+09:22.040 --> 00:09:24.000
+So, it took about a year to finish that.
+
+00:09:24.000 --> 00:09:28.133
+What I did learn basically is that
+
+00:09:28.133 --> 00:09:32.160
+helping with Emacs is not that difficult.
+
+09:32.160 --> 00:09:36.100
+It takes time when you're not fluent with the code,
+
+00:09:36.100 --> 00:09:37.100
+but that's okay because the reference
+
+09:37.100 --> 00:09:39.300
+is excellent, and there are lots of people
+
+00:09:39.300 --> 00:09:41.520
+who are here to help.
+
+NOTE Conclusion
+
+09:41.520 --> 00:09:45.700
+Basically, the solution to all our problems is
+
+00:09:45.700 --> 00:09:47.733
+"Keep It Simple and Straightforward".
+
+00:09:47.733 --> 00:09:51.033
+As you can see in that patch,
+
+00:09:51.033 --> 00:09:53.233
+even if it's a beginner's patch,
+
+00:09:53.233 --> 00:09:57.733
+what I did shows what can be done by Emacs Lisp
+
+00:09:57.733 --> 00:09:59.533
+beginners to help with "straightening" the strings
+
+00:09:59.533 --> 00:10:02.267
+to reduce the number of potential English bugs.
+
+00:10:02.267 --> 00:10:04.533
+And then to make Emacs strings easier
+
+00:10:04.533 --> 00:10:07.233
+to be handled by real localization processes one day.
+
+00:10:07.233 --> 00:10:09.067
+But it doesn't have to be about strings
+
+00:10:09.067 --> 00:10:12.767
+because strings can be an easy entry point to Emacs,
+
+00:10:12.767 --> 00:10:16.720
+but it can be any itch that you want to scratch.
+
+10:16.720 --> 00:10:18.267
+And my real conclusion is that
+
+00:10:18.267 --> 00:10:22.160
+Emacs is free software, and what that means is mostly
+
+10:22.160 --> 00:10:24.067
+that it allows you to do things that you would
+
+00:10:24.067 --> 00:10:27.920
+never have thought of being able to do before.
+
+10:27.920 --> 00:10:32.000
+That's really the biggest lesson to be learned here.
+
+10:32.000 --> 00:10:33.400
+So, I want to thank all the people
+
+00:10:33.400 --> 00:10:37.920
+who allowed this to be happening, allowed me to
+
+10:37.920 --> 00:10:41.267
+learn a bit and contribute a bit to that wonderful
+
+00:10:41.267 --> 00:10:42.800
+piece of software that Emacs is.
+
+00:10:42.800 --> 00:10:44.533
+And thank you everyone for listening,
+
+00:10:44.533 --> 00:10:46.700
+and hopefully I'll see you next year
+
+00:10:46.700 --> 00:10:51.520
+with a different translation related presentation.
+
+10:51.520 --> 11:13.640
+Thank you very much.