WEBVTT captioned by sachac

NOTE Introduction

00:00:00.000 --> 00:00:09.359
Can you believe it's been a decade since I started

00:00:09.360 --> 00:00:12.358
pontificating on literate programming?

00:00:12.359 --> 00:00:17.542
I am Howard Abrams. In 2015, I spoke at this EmacsConf

00:00:17.543 --> 00:00:21.705
where I described my challenges I called Literate DevOps.

00:00:21.706 --> 00:00:25.634
The conference wasn't completely virtual, even though I was.

00:00:25.635 --> 00:00:29.317
My city of Portland was suffering a citywide electrical outage

00:00:29.318 --> 00:00:33.479
and I was without power, so I gave the talk in a corner of my

00:00:33.480 --> 00:00:37.439
friend's living room. I see people online asking questions and

00:00:37.440 --> 00:00:41.439
wondering about literate programming... I also see comments

00:00:41.440 --> 00:00:44.599
explaining why literate programming hasn't caught on in

00:00:44.600 --> 00:00:49.079
corporate practice. I often don't engage. I mean, are the

00:00:49.080 --> 00:00:51.599
online arguments and chatter over ignorance or

00:00:51.600 --> 00:00:56.719
preference? Sure, we're wired differently. I mean, my

00:00:56.720 --> 00:00:59.559
favorite programming languages put the parentheses

00:00:59.560 --> 00:01:01.939
before the function name.

00:01:01.940 --> 00:01:03.800
Literate programming has come a long way

00:01:03.801 --> 00:01:08.519
since Knuth proposed it in the 1980s. I feel

00:01:08.520 --> 00:01:12.999
it's come a long way just in the last 10 years. Obviously,

00:01:13.000 --> 00:01:16.399
this interest is due to Org. I don't think I would bother if

00:01:16.400 --> 00:01:21.359
all I had was Knuth's original preprocessor.
But since I'm

00:01:21.360 --> 00:01:24.839
talking to fellow nerds about an open source project

00:01:24.840 --> 00:01:27.919
without corporate backing, let me change the title of my

00:01:27.920 --> 00:01:32.919
talk and re-pitch Literate Programming in the 24th and a

00:01:32.920 --> 00:01:35.252
Half Century!

NOTE Do I still literate?

00:01:35.253 --> 00:01:36.653
People often ask if I still program that way.

00:01:36.654 --> 00:01:42.759
I guess they want to know if there's any long-term benefits,

00:01:42.760 --> 00:01:45.919
for many of our tools and our workflows, while initially

00:01:45.920 --> 00:01:51.079
tantalizing, often don't last. But yes, when I sit down to

00:01:51.080 --> 00:01:57.759
write a program, I create a file with an extension of .org.

00:01:57.760 --> 00:02:03.799
I guess you can say I program literally.

00:02:03.800 --> 00:02:07.359
Let me be transparent. Do I use literate programming during

00:02:07.360 --> 00:02:12.599
my day job? Yes, but only for personal tools or for initial

00:02:12.600 --> 00:02:16.759
investigation. At the end of the sprint, I tangle the file

00:02:16.760 --> 00:02:21.079
and git commit that. My personal projects, on the other

00:02:21.080 --> 00:02:25.679
hand, are Org files. Since I can't show you the code from

00:02:25.680 --> 00:02:27.839
my day job, I'm afraid my example code will have a lot of

00:02:27.840 --> 00:02:31.159
parentheses.

00:02:31.160 --> 00:02:33.955
I'm sure you won't mind.

00:02:33.956 --> 00:02:37.356
I like having my Emacs configuration in Org.

00:02:37.357 --> 00:02:40.359
It's pretty bling. It has over 8,000

00:02:40.360 --> 00:02:44.559
lines of code. I know, I can hear the screams and gasps over

00:02:44.560 --> 00:02:49.439
the network. However, the surrounding prose in Org adds

00:02:49.440 --> 00:02:53.410
10,000 lines, and those lines are non-wrapped paragraphs.

00:02:53.411 --> 00:02:58.119
I mean, is that large?
Sure, we've all worked on

00:02:58.120 --> 00:03:03.639
larger, so I guess it's not huge. Come on, it's still

00:03:03.640 --> 00:03:06.331
significant.

NOTE Advantages

00:03:06.332 --> 00:03:09.799
Advantages? Look who I'm talking to. I'm sure

00:03:09.800 --> 00:03:14.279
you know the advantages, but indulge me. I feel that one

00:03:14.280 --> 00:03:16.799
advantage of literate programming, especially with large

00:03:16.800 --> 00:03:20.279
code bases, is how you can organize and manage the

00:03:20.280 --> 00:03:24.839
complexity. Most programming languages tame large bases

00:03:24.840 --> 00:03:29.119
by putting code in separate files. While Org can too, with

00:03:29.120 --> 00:03:32.279
Org, we can group related functions together under

00:03:32.280 --> 00:03:35.043
expandable headlines.

00:03:35.044 --> 00:03:37.279
Here's one. You can see that

00:03:37.280 --> 00:03:40.706
I've got different sections grouped together.

00:03:40.707 --> 00:03:43.759
In my original talk, I mentioned how I would attempt to organize

00:03:43.760 --> 00:03:47.839
my thoughts before coding. I appreciate how I can look back

00:03:47.840 --> 00:03:53.599
at my notes. In my Emacs configuration, I review the prose to

00:03:53.600 --> 00:03:57.799
help memorize key bindings.

00:03:57.800 --> 00:04:01.039
My section on getting email working with Emacs using

00:04:01.040 --> 00:04:04.079
notmuch means creating small collections of scripts and

00:04:04.080 --> 00:04:08.199
configuration files. I can tangle them all from one Org

00:04:08.200 --> 00:04:16.799
file. I like that I can explain each part separately.

00:04:16.800 --> 00:04:20.879
You just can't beat having links back to Stack Overflow or

00:04:20.880 --> 00:04:25.519
that GitHub repo where you stole, I mean, became inspired to

00:04:25.520 --> 00:04:28.719
write your code.
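
NOTE
A minimal sketch of tangling several files from one Org file, as described for the notmuch section. The heading, the relative paths under mail/, and the script contents are hypothetical placeholders, not the speaker's configuration; it also assumes mbsync as the mail fetcher. Only the :tangle, :shebang, and :mkdirp header arguments are standard Org Babel.
```org
* Email with notmuch (hypothetical example)
A small script fetches mail and lets notmuch index it.
#+begin_src sh :tangle mail/sync-mail :shebang "#!/bin/sh" :mkdirp yes
  # Assumption: mbsync is the fetcher; substitute whatever you use.
  mbsync -a && notmuch new
#+end_src
New messages get these tags on arrival.
#+begin_src conf :tangle mail/notmuch-config :mkdirp yes
  [new]
  tags=unread;inbox;
#+end_src
```
Running org-babel-tangle (C-c C-v t) in such a buffer writes both files, while each block keeps its own explanation next to it.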

NOTE Disadvantages

00:04:28.720 --> 00:04:34.279
Literate programming may push the boundaries of our

00:04:34.280 --> 00:04:38.119
workflows and reveal some abrasion, but we aren't

00:04:38.120 --> 00:04:41.239
solely working with Org. We have the flexibility of a Lisp

00:04:41.240 --> 00:04:45.119
engine to file down those rough parts. You may have your

00:04:45.120 --> 00:04:48.159
concerns. Perhaps you could reach out to me, and with

00:04:48.160 --> 00:04:54.239
particular issues, maybe we can figure something out.

00:04:54.240 --> 00:04:57.439
Here is my list of frictions, and the rest of my talk

00:04:57.440 --> 00:05:02.159
demonstrates my answers and my hacks. The goal in literate

00:05:02.160 --> 00:05:05.039
programming with Org is that it should not require more

00:05:05.040 --> 00:05:08.679
effort than non-literate programming. For instance, I

00:05:08.680 --> 00:05:12.119
shouldn't have to type much more than regular programming

00:05:12.120 --> 00:05:15.719
to get my code literate. I also shouldn't have to worry about

00:05:15.720 --> 00:05:20.799
the state between my Org file and the source code. I want

00:05:20.800 --> 00:05:24.132
to be able to jump around my code just as easily.

NOTE Ease of typing

00:05:24.133 --> 00:05:28.654
Let me explain more. I've created some templates using

00:05:28.655 --> 00:05:34.679
yasnippet. Since I was used to the old org-tempo feature,

00:05:34.680 --> 00:05:37.145
my habit has all the snippets starting with a

00:05:37.146 --> 00:05:40.759
< character. I'm not sure if I should demonstrate all of them

00:05:40.760 --> 00:05:45.999
as you may be doing something similar. I like to build on top

00:05:46.000 --> 00:05:49.999
of characters to remind me that if I just enter a

00:05:50.000 --> 00:05:53.519
need to put in the language. But if I append a mnemonic, I can

00:05:53.520 --> 00:05:56.839
get a full language. Why not do that with a full function

00:05:56.840 --> 00:06:01.199
definition?
In this case, I'm smooshing one yasnippet

00:06:01.200 --> 00:06:11.679
inside another one in order to save myself some typing.

00:06:11.680 --> 00:06:15.159
My point here is to pay attention to what slows you down or

00:06:15.160 --> 00:06:24.719
hinders you from getting the advantages you want.

NOTE Keep tangled code sync'd

00:06:24.720 --> 00:06:28.399
Do you ever forget to tangle your code? You can append this

00:06:28.400 --> 00:06:31.519
code to the bottom of your Org file so that it gets tangled

00:06:31.520 --> 00:06:36.159
every time you save. I've written a function so I can visit

00:06:36.160 --> 00:06:40.559
that tangled file and then return. I've grouped all my

00:06:40.560 --> 00:06:45.119
functions together. I've taken a cue from Charles Choi, you

00:06:45.120 --> 00:06:48.639
know, kickingvegas, and his Casual feature set. But

00:06:48.640 --> 00:06:52.374
instead of Transient, I've just made a hydra using

00:06:52.375 --> 00:06:57.399
the major-mode-hydra package. Anyway, this allows me to use and

00:06:57.400 --> 00:07:00.136
remember my micro-optimizations.

00:07:00.137 --> 00:07:03.697
If you set the :comments property to link,

00:07:03.698 --> 00:07:06.999
the tangled output is back-connected.

00:07:07.000 --> 00:07:11.479
This allows us to edit the tangled code and have it update the

00:07:11.480 --> 00:07:16.879
Org file. Personally, I don't like this. My source of truth

00:07:16.880 --> 00:07:22.500
is the Org file, and I tangle as a one-way diode.

NOTE Code evaluation

00:07:22.501 --> 00:07:25.603
Often a block of code will reference a variable

00:07:25.604 --> 00:07:29.046
or call a function defined in another block of code.

00:07:29.047 --> 00:07:31.508
In my original Literate DevOps talk,

00:07:31.509 --> 00:07:34.519
I discussed how to use the output from one block in

00:07:34.520 --> 00:07:37.799
another block by naming the first block and referencing it

00:07:37.800 --> 00:07:42.159
with a :var for the second.
However, if all the blocks use the

00:07:42.160 --> 00:07:46.039
same language, you can use sessions, which create a

00:07:46.040 --> 00:07:51.479
persistent REPL behind the scenes. Let's evaluate the

00:07:51.480 --> 00:07:53.199
blocks of Python code in this file.

00:07:53.200 --> 00:08:00.119
The evaluation created a Python REPL. It's available in

00:08:00.120 --> 00:08:04.279
another buffer. This buffer matches the name of the

00:08:04.280 --> 00:08:07.959
session, but with surrounding asterisks. Evaluating a

00:08:07.960 --> 00:08:11.399
code block sends it into the REPL, and now I can work with my

00:08:11.400 --> 00:08:19.959
code blocks interactively. (That's not quite right.)

NOTE Has that block been eval'd?

00:08:19.960 --> 00:08:24.039
I primarily hack on Emacs Lisp, and textual changes to

00:08:24.040 --> 00:08:28.199
variables, functions, or macros--unless you habitually

00:08:28.200 --> 00:08:31.679
type C-c C-c--may not represent the state of your

00:08:31.680 --> 00:08:35.439
machine. A similar effect happens in any language that

00:08:35.440 --> 00:08:39.319
uses sessions. Sure, I can move the point to a block and

00:08:39.320 --> 00:08:42.799
evaluate, but I have three functions that allow me to

00:08:42.800 --> 00:08:44.734
evaluate all blocks in a buffer or all blocks in a subtree,

00:08:44.735 --> 00:08:50.199
or I can, without moving the point, evaluate any block I see.

00:08:50.200 --> 00:08:54.919
Now, this function here evaluates all blocks in a buffer.

00:08:54.920 --> 00:08:58.279
Someone mentioned calling this function when you first

00:08:58.280 --> 00:09:02.359
load a file. I'm not sure that's a good policy. I mean, have

00:09:02.360 --> 00:09:05.238
you not written a bug?

NOTE Evaluating code in a subtree

00:09:05.239 --> 00:09:08.559
Since this function right here

00:09:08.560 --> 00:09:12.039
evaluates only visible blocks, we can limit what Emacs

00:09:12.040 --> 00:09:18.799
evaluates to a single Org mode section.
For instance, with

00:09:18.800 --> 00:09:23.759
the cursor in one section, I can evaluate just the blocks in

00:09:23.760 --> 00:09:26.871
that header section.

NOTE Evaluating code from a distance

00:09:26.872 --> 00:09:29.399
If I can see a block, why clumsily

00:09:29.400 --> 00:09:33.079
navigate to it when I can extend the avy project to just jump to

00:09:33.080 --> 00:09:40.479
it? For instance, let's pull this file up. I can jump to any of

00:09:40.480 --> 00:09:41.639
the four blocks.

00:09:41.640 --> 00:09:50.319
I think that's quite slick. Now why navigate to a code block

00:09:50.320 --> 00:09:55.799
solely to evaluate it? Yes, this is a terrible example, but

00:09:55.800 --> 00:09:59.679
these three blocks set a variable to different values. So

00:09:59.680 --> 00:10:02.599
without moving the point, I can evaluate any one of them.

00:10:02.600 --> 00:10:09.719
To be honest, the reason why I wrote this is because I often

00:10:09.720 --> 00:10:13.999
forget to evaluate a block after editing it. I've moved on,

00:10:14.000 --> 00:10:17.839
and I just don't want to jump back. Now, I can just evaluate

00:10:17.840 --> 00:10:22.359
from a distance. I apologize for the previous terrible

00:10:22.360 --> 00:10:26.019
examples, but I'm quite pleased with this feature.

NOTE Navigating by headers

00:10:26.020 --> 00:10:30.119
As I mentioned earlier, in a large code base, we organize code by

00:10:30.120 --> 00:10:33.839
library or module, and each file contains a class composed

00:10:33.840 --> 00:10:37.119
of methods, functions, variables, fields, et cetera.

00:10:37.120 --> 00:10:39.999
Literate programming in Org files allows me to add a

00:10:40.000 --> 00:10:43.159
semantic organization layer where I can group related

00:10:43.160 --> 00:10:46.919
concepts under headlines.
Now, while this isn't specific

00:10:46.920 --> 00:10:50.799
to literate programming, I wrote a little user interface to

00:10:50.800 --> 00:10:54.296
allow me to jump to any heading in any Org file

00:10:54.297 --> 00:10:57.679
in a particular project.

00:10:57.680 --> 00:11:02.879
These are the headings in my Emacs configuration project.

00:11:02.880 --> 00:11:06.559
Notice the file name beforehand, before the colon

00:11:06.560 --> 00:11:09.759
character. The header name and its parent headers are

00:11:09.760 --> 00:11:14.799
after. Let me search for the LSP sections. Maybe I only want

00:11:14.800 --> 00:11:20.039
the one for Python. Now I use ripgrep to search the files and

00:11:20.040 --> 00:11:24.559
then some Lisp to parse the output. Unless someone has

00:11:24.560 --> 00:11:26.793
already done this, I should package this up on MELPA.

NOTE Navigating by function names

00:11:26.794 --> 00:11:32.199
What about jumping directly to the definition of a function,

00:11:32.200 --> 00:11:36.799
variable, or what have you? We can use Emacs's built-in xref

00:11:36.800 --> 00:11:39.879
library, but these functions don't understand that the

00:11:39.880 --> 00:11:45.319
source code is in Org files. When I started using Emacs

00:11:45.320 --> 00:11:49.479
30-something years ago, I would pre-index my source into

00:11:49.480 --> 00:11:53.799
tag files, but the dumb-jump project uses the newfangled and

00:11:53.800 --> 00:11:58.319
faster text search programs like ripgrep to find a symbol in

00:11:58.320 --> 00:12:02.319
real time. I followed this pattern and wrote an extension

00:12:02.320 --> 00:12:08.119
to the xref API. Now, I want to jump around my code from either a

00:12:08.120 --> 00:12:14.519
code block or the surrounding prose. I'm sure it

00:12:14.520 --> 00:12:18.199
comes as no surprise that my presentation is just an Org

00:12:18.200 --> 00:12:23.919
file. Let's suppose my cursor is on this symbol.
I wrote this

00:12:23.920 --> 00:12:28.079
function for this demonstration. We can jump to the

00:12:28.080 --> 00:12:30.759
definition and I can jump back.

00:12:30.760 --> 00:12:37.639
Notice it jumped into an Org file and back out. References,

00:12:37.640 --> 00:12:42.279
unlike definitions, are where something is defined and

00:12:42.280 --> 00:12:46.919
where it's used. Well, you know how the xref system works.

00:12:46.920 --> 00:12:52.679
Here, I can jump to the definition or where it's

00:12:52.680 --> 00:12:59.519
used. And, of course, jump back. I think this is cool. This

00:12:59.520 --> 00:13:04.319
should be a nifty package on MELPA. But my code is specific to

00:13:04.320 --> 00:13:08.799
Lisp, and I'm not completely sure how to make it general. For

00:13:08.800 --> 00:13:13.399
instance, what is a symbol? If you know the language, this is

00:13:13.400 --> 00:13:17.679
obvious. But what should the language be when your cursor is

00:13:17.680 --> 00:13:22.639
in the prose of an Org file? Python only supports sequences

00:13:22.640 --> 00:13:25.559
of alphanumerics and underscores, but in Lisp, a symbol can

00:13:25.560 --> 00:13:30.399
be almost any character sequence. I've been stewing on how

00:13:30.400 --> 00:13:34.479
to do this. I have ideas like prompting during the first

00:13:34.480 --> 00:13:37.719
query or scanning the language based on the nearest code

00:13:37.720 --> 00:13:40.479
block. I think I'm babbling.

NOTE Why literate programming?

00:13:40.480 --> 00:13:47.199
In true geek fashion, I dived into the details before

00:13:47.200 --> 00:13:52.079
answering some better questions. In my original Literate

00:13:52.080 --> 00:13:55.479
DevOps talk, I explained the advantages of initially

00:13:55.480 --> 00:13:58.959
writing down your thoughts, your plans, goals... the

00:13:58.960 --> 00:14:02.879
user requirements. But what do you do with all that luscious

00:14:02.880 --> 00:14:06.359
prose afterwards?
Well, you do the same thing you do to your

00:14:06.360 --> 00:14:09.279
initial code. You refactor that prose.

00:14:09.280 --> 00:14:14.759
Just because the text surrounding your code is now a

00:14:14.760 --> 00:14:18.799
first-class citizen doesn't excuse bad code. You want

00:14:18.800 --> 00:14:23.165
something more from both your code and your prose.

NOTE LP prose isn't comments

00:14:23.166 --> 00:14:25.586
The prose of your literate program isn't

00:14:25.587 --> 00:14:28.667
just regurgitation of the code in the block.

00:14:28.668 --> 00:14:31.527
You want something more helpful.

00:14:31.528 --> 00:14:35.736
You're really writing a research paper to yourself.

00:14:35.737 --> 00:14:38.577
I know what you're thinking. You've seen my Git repos.

00:14:38.578 --> 00:14:41.858
I'm guilty and not always the best example.

00:14:41.859 --> 00:14:44.559
However, I do get great joy

00:14:44.560 --> 00:14:48.680
when I see someone ask about something in Emacs

00:14:48.681 --> 00:14:51.041
and my response is little more than a link

00:14:51.042 --> 00:14:55.799
to my online repo that I've rendered as a website.

NOTE Summary

00:14:55.800 --> 00:15:01.199
I'm out of time. I hope this has been interesting

00:15:01.200 --> 00:15:04.359
philosophically as well as practically, as I think

00:15:04.360 --> 00:15:08.559
literate programming is the cat's meow. I'm afraid this

00:15:08.560 --> 00:15:11.879
summary slide is about my home-baked solutions that fit my

00:15:11.880 --> 00:15:15.119
needs, but hopefully you can recognize your pain points and

00:15:15.120 --> 00:15:17.839
address them. If you don't need my Literate

00:15:17.840 --> 00:15:21.479
DevOps-specific techniques for connecting code blocks, I

00:15:21.480 --> 00:15:25.799
suggest using sessions by default. I highly recommend

00:15:25.800 --> 00:15:28.399
looking at your workflow and writing snippets to give you

00:15:28.400 --> 00:15:33.159
less typing for Org blocks.
I now jump by headlines in my

00:15:33.160 --> 00:15:37.479
projects, but extending xref to support Org files made

00:15:37.480 --> 00:15:40.159
literate programming as easy as programming the

00:15:40.160 --> 00:15:44.319
old-fashioned way. I do need to make it more general to put up

00:15:44.320 --> 00:15:47.722
on MELPA, though. Thanks for watching.

00:15:47.723 --> 00:15:51.240
Happy hacking, my friends.
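
NOTE
A hedged sketch pulling together two suggestions from the talk: one session shared by same-language blocks, and tangling on every save. The session name demo and the file name demo.py are made up for illustration, and the local-variables footer is one common idiom, not necessarily the code shown on the slide.
```org
#+PROPERTY: header-args:python :session demo :tangle demo.py
* Blocks sharing one Python session (hypothetical example)
#+begin_src python
  total = sum(range(10))
#+end_src
The next block can see total because both blocks run in the same REPL.
#+begin_src python :results output
  print(total)
#+end_src
# Local Variables:
# eval: (add-hook 'after-save-hook #'org-babel-tangle nil t)
# End:
```
Emacs asks for confirmation before honoring an eval: entry unless you have marked it safe. The final t makes the hook buffer-local, so only this file is tangled on save.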