WEBVTT captioned by sachac
NOTE Introduction
00:00:00.000 --> 00:00:09.359
Can you believe it's been a decade since I started
00:00:09.360 --> 00:00:12.358
pontificating on literate programming?
00:00:12.359 --> 00:00:17.542
I am Howard Abrams. In 2015, I spoke at this EmacsConf
00:00:17.543 --> 00:00:21.705
where I described my challenges I called Literate DevOps.
00:00:21.706 --> 00:00:25.634
The conference wasn't completely virtual, even though I was.
00:00:25.635 --> 00:00:29.317
My city of Portland was suffering a citywide electrical outage
00:00:29.318 --> 00:00:33.479
and I was without power, so I gave the talk in a corner of my
00:00:33.480 --> 00:00:37.439
friend's living room. I see people online asking questions and
00:00:37.440 --> 00:00:41.439
wondering about literate programming... I also see comments
00:00:41.440 --> 00:00:44.599
explaining why literate programming hasn't caught on in
00:00:44.600 --> 00:00:49.079
corporate practice. I often don't engage. I mean, are the
00:00:49.080 --> 00:00:51.599
online arguments and chatter over ignorance or
00:00:51.600 --> 00:00:56.719
preference? Sure, we're wired differently. I mean, my
00:00:56.720 --> 00:00:59.559
favorite programming languages put the parentheses
00:00:59.560 --> 00:01:01.939
before the function name.
00:01:01.940 --> 00:01:03.800
Literate programming has come a long way
00:01:03.801 --> 00:01:08.519
since Knuth proposed it in the 20th century. I feel
00:01:08.520 --> 00:01:12.999
it's come a long way just in the last 10 years. Obviously,
00:01:13.000 --> 00:01:16.399
this interest is due to Org. I don't think I would bother if
00:01:16.400 --> 00:01:21.359
all I had was Knuth's original preprocessor. But since I'm
00:01:21.360 --> 00:01:24.839
talking to fellow nerds about an open source project
00:01:24.840 --> 00:01:27.919
without corporate backing, let me change the title of my
00:01:27.920 --> 00:01:32.919
talk and re-pitch Literate Programming in the 24th and a
00:01:32.920 --> 00:01:35.252
Half Century!
NOTE Do I still literate?
00:01:35.253 --> 00:01:36.653
People often ask if I still program that way.
00:01:36.654 --> 00:01:42.759
I guess they want to know if there are any long-term benefits,
00:01:42.760 --> 00:01:45.919
for many of our tools and our workflows, while initially
00:01:45.920 --> 00:01:51.079
tantalizing, often don't last. But yes, when I sit down to
00:01:51.080 --> 00:01:57.759
write a program, I create a file with an extension of .org.
00:01:57.760 --> 00:02:03.799
I guess you can say I program literately.
00:02:03.800 --> 00:02:07.359
Let me be transparent. Do I use literate programming during
00:02:07.360 --> 00:02:12.599
my day job? Yes, but only for personal tools or for initial
00:02:12.600 --> 00:02:16.759
investigation. At the end of the sprint, I tangle the file
00:02:16.760 --> 00:02:21.079
and git commit that. My personal projects, on the other
00:02:21.080 --> 00:02:25.679
hand, are Org files. Since I can't show you the code from
00:02:25.680 --> 00:02:27.839
my day job, I'm afraid my example code will have a lot of
00:02:27.840 --> 00:02:31.159
parentheses.
00:02:31.160 --> 00:02:33.955
I'm sure you won't mind.
00:02:33.956 --> 00:02:37.356
I like having my Emacs configuration in Org.
00:02:37.357 --> 00:02:40.359
It's pretty bling. It has over 8,000
00:02:40.360 --> 00:02:44.559
lines of code. I know, I can hear the screams and gasps over
00:02:44.560 --> 00:02:49.439
the network. However, the surrounding prose in Org adds
00:02:49.440 --> 00:02:53.410
10,000 lines, and those lines are non-wrapped paragraphs.
00:02:53.411 --> 00:02:58.119
I mean, is that large? Sure, we've all worked on
00:02:58.120 --> 00:03:03.639
larger, so I guess it's not huge. Come on, it's still
00:03:03.640 --> 00:03:06.331
significant.
NOTE Advantages
00:03:06.332 --> 00:03:09.799
Advantages? Look who I'm talking to. I'm sure
00:03:09.800 --> 00:03:14.279
you know the advantages, but indulge me. I feel that one
00:03:14.280 --> 00:03:16.799
advantage of literate programming, especially with large
00:03:16.800 --> 00:03:20.279
code bases, is how you can organize and manage the
00:03:20.280 --> 00:03:24.839
complexity. Most programming languages tame large code bases
00:03:24.840 --> 00:03:29.119
by putting code in separate files. While Org can too, with
00:03:29.120 --> 00:03:32.279
Org, we can group related functions together under
00:03:32.280 --> 00:03:35.043
expandable headlines.
00:03:35.044 --> 00:03:37.279
Here's one. You can see that
00:03:37.280 --> 00:03:40.706
I've got different sections grouped together.
00:03:40.707 --> 00:03:43.759
In my original talk, I mentioned how I would attempt to organize
00:03:43.760 --> 00:03:47.839
my thoughts before coding. I appreciate how I can look back
00:03:47.840 --> 00:03:53.599
at my notes. In my Emacs configuration, I review the prose to
00:03:53.600 --> 00:03:57.799
help memorize key bindings.
00:03:57.800 --> 00:04:01.039
My section on getting email working with Emacs using
00:04:01.040 --> 00:04:04.079
notmuch means creating small collections of scripts and
00:04:04.080 --> 00:04:08.199
configuration files. I can tangle them all from one Org
00:04:08.200 --> 00:04:16.799
file. I like that I can explain each part separately.
00:04:16.800 --> 00:04:20.879
You just can't beat having links back to Stack Overflow or
00:04:20.880 --> 00:04:25.519
that GitHub repo where you stole, I mean, became inspired to
00:04:25.520 --> 00:04:28.719
write your code.
NOTE Disadvantages
00:04:28.720 --> 00:04:34.279
Literate programming may push the boundaries of our
00:04:34.280 --> 00:04:38.119
workflows and reveal some abrasion, but we aren't
00:04:38.120 --> 00:04:41.239
solely working with Org. We have the flexibility of a Lisp
00:04:41.240 --> 00:04:45.119
engine to file down those rough parts. You may have your
00:04:45.120 --> 00:04:48.159
concerns. Perhaps you could reach out to me, and with
00:04:48.160 --> 00:04:54.239
particular issues, maybe we can figure something out.
00:04:54.240 --> 00:04:57.439
Here is my list of frictions, and the rest of my talk
00:04:57.440 --> 00:05:02.159
demonstrates my answers and my hacks. The goal in literate
00:05:02.160 --> 00:05:05.039
programming with Org is that it should not require more
00:05:05.040 --> 00:05:08.679
effort than non-literate programming. For instance, I
00:05:08.680 --> 00:05:12.119
shouldn't have to type much more than regular programming
00:05:12.120 --> 00:05:15.719
to get my code literate. I also shouldn't have to worry about
00:05:15.720 --> 00:05:20.799
the state between my Org file and the source code. I want
00:05:20.800 --> 00:05:24.132
to be able to jump around my code just as easily.
NOTE Ease of typing
00:05:24.133 --> 00:05:28.654
Let me explain more. I've created some templates using
00:05:28.655 --> 00:05:34.679
yasnippet. Since I was used to the old org-tempo feature,
00:05:34.680 --> 00:05:37.145
my habit has all the snippets starting with a
00:05:37.146 --> 00:05:40.759
< character. I'm not sure if I should demonstrate all of them
00:05:40.760 --> 00:05:45.999
as you may be doing something similar. I like to build on top
00:05:46.000 --> 00:05:49.999
of characters to remind me that if I just enter a <s, I
00:05:50.000 --> 00:05:53.519
need to put in the language. But if I append a mnemonic, I can
00:05:53.520 --> 00:05:56.839
get a full language. Why not do that with a full function
00:05:56.840 --> 00:06:01.199
definition? In this case, I'm smooshing one yasnippet
00:06:01.200 --> 00:06:11.679
inside another one in order to save myself some typing.
00:06:11.680 --> 00:06:15.159
My point here is to pay attention to what slows you down or
00:06:15.160 --> 00:06:24.719
hinders you from getting the advantages you want.
NOTE Keep tangled code sync'd
00:06:24.720 --> 00:06:28.399
Do you ever forget to tangle your code? You can append this
00:06:28.400 --> 00:06:31.519
code to the bottom of your Org file so that it gets tangled
00:06:31.520 --> 00:06:36.159
every time you save. I've written a function so I can visit
00:06:36.160 --> 00:06:40.559
that tangled file and then return. I've grouped all my
00:06:40.560 --> 00:06:45.119
functions together. I've taken a cue from Charles Choi, you
00:06:45.120 --> 00:06:48.639
know, kickingvegas, and his Casual feature set. But
00:06:48.640 --> 00:06:52.374
instead of Transient, I've just made a hydra using
00:06:52.375 --> 00:06:57.399
the major-mode-hydra package. Anyway, this allows me to use and
00:06:57.400 --> 00:07:00.136
remember my micro-optimizations.
00:07:00.137 --> 00:07:03.697
If you set the :comments property to link,
00:07:03.698 --> 00:07:06.999
the tangled output is back-connected.
00:07:07.000 --> 00:07:11.479
This allows us to edit the tangled code and have it update the
00:07:11.480 --> 00:07:16.879
Org file. Personally, I don't like this. My source of truth
00:07:16.880 --> 00:07:22.500
is the Org file, and I tangle as a one-way diode.
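NOTE
The code shown on screen is not part of this transcript, so what follows is only a minimal sketch of tangle-on-save, assuming a buffer-local after-save-hook. The name my/org-tangle-on-save is hypothetical; add-hook and org-babel-tangle are standard Emacs and Org functions.
```elisp
;; Sketch only: `my/org-tangle-on-save' is a hypothetical name, not the speaker's code.
(require 'ob-tangle)
(defun my/org-tangle-on-save ()
  "Arrange for the current Org buffer to be tangled each time it is saved."
  (when (derived-mode-p 'org-mode)
    ;; The final `t' makes the hook buffer-local, so other buffers are unaffected.
    (add-hook 'after-save-hook #'org-babel-tangle nil t)))
(add-hook 'org-mode-hook #'my/org-tangle-on-save)
;; Per-file alternative, placed at the bottom of a single Org file:
;;   # Local Variables:
;;   # eval: (add-hook 'after-save-hook #'org-babel-tangle nil t)
;;   # End:
```
The hook version tangles every Org file you save; the file-local form in the comment is probably closer to appending something to the bottom of one Org file.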
NOTE Code evaluation
00:07:22.501 --> 00:07:25.603
Often a block of code will reference a variable
00:07:25.604 --> 00:07:29.046
or call a function defined in another block of code.
00:07:29.047 --> 00:07:31.508
In my original literate DevOps talk,
00:07:31.509 --> 00:07:34.519
I discussed how to use the output from one block in
00:07:34.520 --> 00:07:37.799
another block by naming the first block and referencing it
00:07:37.800 --> 00:07:42.159
with a :var for the second. However, if all the blocks use the
00:07:42.160 --> 00:07:46.039
same language, you can use sessions, which create a
00:07:46.040 --> 00:07:51.479
persistent REPL behind the scenes. Let's evaluate the
00:07:51.480 --> 00:07:53.199
blocks of Python code in this file.
00:07:53.200 --> 00:08:00.119
The evaluation created a Python REPL. It's available in
00:08:00.120 --> 00:08:04.279
another buffer. This buffer matches the name of the
00:08:04.280 --> 00:08:07.959
session, but with surrounding asterisks. Evaluating a
00:08:07.960 --> 00:08:11.399
code block sends it into the REPL, and now I can work with my
00:08:11.400 --> 00:08:19.959
code blocks interactively. (That's not quite right.)
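NOTE
A small sketch of turning sessions on by default for Python blocks, assuming you want every such block to share one REPL. The session name "demo" is an arbitrary placeholder; org-babel-default-header-args:python is the variable that ob-python consults for default header arguments.
```elisp
;; Sketch only: "demo" is a hypothetical session name.
(require 'ob-python)
;; Every Python block without its own :session header now shares one REPL,
;; which, as described in the talk, lives in a buffer named *demo*.
(setq org-babel-default-header-args:python '((:session . "demo")))
;; In-file equivalent, as a line near the top of the Org file:
;;   #+PROPERTY: header-args:python :session demo
```
Blocks that name a different :session still get their own REPL, so the default only covers the common case.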
NOTE Has that block been eval'd?
00:08:19.960 --> 00:08:24.039
I primarily hack on Emacs Lisp, and textual changes to
00:08:24.040 --> 00:08:28.199
variables, functions, or macros--unless you habitually
00:08:28.200 --> 00:08:31.679
type C-c C-c--may not represent the state of your
00:08:31.680 --> 00:08:35.439
machine. A similar effect happens in any language that
00:08:35.440 --> 00:08:39.319
uses sessions. Sure, I can move the point to a block and
00:08:39.320 --> 00:08:42.799
evaluate, but I have three functions that allow me to
00:08:42.800 --> 00:08:44.734
evaluate all blocks in a buffer or all blocks in a subtree,
00:08:44.735 --> 00:08:50.199
or I can, without moving the point, evaluate any block I see.
00:08:50.200 --> 00:08:54.919
Now, this function here evaluates all blocks in a buffer.
00:08:54.920 --> 00:08:58.279
Someone mentioned calling this function when you first
00:08:58.280 --> 00:09:02.359
load a file. I'm not sure that's a good policy. I mean, have
00:09:02.360 --> 00:09:05.238
you not written a bug?
NOTE Evaluating code in a subtree
00:09:05.239 --> 00:09:08.559
Since this function right here
00:09:08.560 --> 00:09:12.039
evaluates only visible blocks, we can limit what Emacs
00:09:12.040 --> 00:09:18.799
evaluates to a single Org mode section. For instance, with
00:09:18.800 --> 00:09:23.759
the cursor in one section, I can evaluate just the blocks in
00:09:23.760 --> 00:09:26.871
that header section.
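NOTE
The speaker's own functions are not reproduced in this transcript. As a rough sketch of the same idea, assuming narrowing is what limits evaluation to one section, the two commands below wrap standard Org functions. The my/ names are hypothetical.
```elisp
;; Sketch only: the `my/' command names are hypothetical.
(require 'org)
(defun my/org-eval-buffer-blocks ()
  "Evaluate every source block in the accessible part of the current Org buffer."
  (interactive)
  (org-babel-execute-buffer))
(defun my/org-eval-subtree-blocks ()
  "Evaluate only the source blocks under the heading at point."
  (interactive)
  (save-restriction
    ;; Narrowing hides everything outside the subtree from the call below.
    (org-narrow-to-subtree)
    (org-babel-execute-buffer)))
```
Org also ships org-babel-execute-subtree, which does much the same thing. Either way, org-confirm-babel-evaluate still decides whether you are prompted before each block runs.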
NOTE Evaluating code from a distance
00:09:26.872 --> 00:09:29.399
If I can see a block, why clumsily
00:09:29.400 --> 00:09:33.079
navigate to it when I can extend the avy project to just jump to
00:09:33.080 --> 00:09:40.479
it? For instance, let's pull this file up. I can jump to any of
00:09:40.480 --> 00:09:41.639
the four blocks.
00:09:41.640 --> 00:09:50.319
I think that's quite slick. Now why navigate to a code block
00:09:50.320 --> 00:09:55.799
solely to evaluate it? Yes, this is a terrible example, but
00:09:55.800 --> 00:09:59.679
these three blocks set a variable to different values. So
00:09:59.680 --> 00:10:02.599
without moving the point, I can evaluate any one of them.
00:10:02.600 --> 00:10:09.719
To be honest, the reason why I wrote this is because I often
00:10:09.720 --> 00:10:13.999
forget to evaluate a block after editing it. I've moved on,
00:10:14.000 --> 00:10:17.839
and I just don't want to jump back. Now, I can just evaluate
00:10:17.840 --> 00:10:22.359
from a distance. I apologize for the previous terrible
00:10:22.360 --> 00:10:26.019
examples, but I'm quite pleased with this feature.
NOTE Navigating by headers
00:10:26.020 --> 00:10:30.119
As I mentioned earlier, in a large code base, we organize code by
00:10:30.120 --> 00:10:33.839
library or module, and each file contains a class composed
00:10:33.840 --> 00:10:37.119
of methods, functions, variables, fields, et cetera.
00:10:37.120 --> 00:10:39.999
Literate programming in Org files allows me to add a
00:10:40.000 --> 00:10:43.159
semantic organization layer where I can group related
00:10:43.160 --> 00:10:46.919
concepts under headlines. Now, while this isn't specific
00:10:46.920 --> 00:10:50.799
to literate programming, I wrote a little user interface to
00:10:50.800 --> 00:10:54.296
allow me to jump to any heading in any Org file
00:10:54.297 --> 00:10:57.679
in a particular project.
00:10:57.680 --> 00:11:02.879
These are the headings in my Emacs configuration project.
00:11:02.880 --> 00:11:06.559
Notice the file name beforehand, before the colon
00:11:06.560 --> 00:11:09.759
character. The header name and its parent headers are
00:11:09.760 --> 00:11:14.799
after. Let me search for the LSP sections. Maybe I only want
00:11:14.800 --> 00:11:20.039
the one for Python. Now I use ripgrep to search the files and
00:11:20.040 --> 00:11:24.559
then some Lisp to parse the output. Unless someone has
00:11:24.560 --> 00:11:26.793
already done this, I should package this up on MELPA.
NOTE Navigating by function names
00:11:26.794 --> 00:11:32.199
What about jumping directly to the definition of a function,
00:11:32.200 --> 00:11:36.799
variable, or what have you? We can use Emacs's built-in xref
00:11:36.800 --> 00:11:39.879
library, but these functions don't understand that the
00:11:39.880 --> 00:11:45.319
source code is in Org files. When I started using Emacs
00:11:45.320 --> 00:11:49.479
30-something years ago, I would pre-index my source into
00:11:49.480 --> 00:11:53.799
tag files, but the dumb-jump project uses the newfangled and
00:11:53.800 --> 00:11:58.319
faster text search programs like ripgrep to find a symbol in
00:11:58.320 --> 00:12:02.319
real time. I followed this pattern and wrote an extension
00:12:02.320 --> 00:12:08.119
to the xref API. Now, I want to jump around my code from both
00:12:08.120 --> 00:12:14.519
code blocks and the surrounding prose. I'm sure it
00:12:14.520 --> 00:12:18.199
comes as no surprise that my presentation is just an Org
00:12:18.200 --> 00:12:23.919
file. Let's suppose my cursor is on this symbol. I wrote this
00:12:23.920 --> 00:12:28.079
function for this demonstration. We can jump to the
00:12:28.080 --> 00:12:30.759
definition and I can jump back.
00:12:30.760 --> 00:12:37.639
Notice it jumped into an Org file and back out. References,
00:12:37.640 --> 00:12:42.279
unlike definitions, are where something is defined and
00:12:42.280 --> 00:12:46.919
where it's used. Well, you know how the xref system works.
00:12:46.920 --> 00:12:52.679
Here, I can jump to the definition or where it's
00:12:52.680 --> 00:12:59.519
used. And, of course, I can jump back. I think this is cool. This
00:12:59.520 --> 00:13:04.319
should be a nifty package on MELPA. But my code is specific to
00:13:04.320 --> 00:13:08.799
Lisp, and I'm not completely sure how to make it general. For
00:13:08.800 --> 00:13:13.399
instance, what is a symbol? If you know the language, this is
00:13:13.400 --> 00:13:17.679
obvious. But what should the language be when your cursor is
00:13:17.680 --> 00:13:22.639
in the prose of an Org file? Python only supports sequences
00:13:22.640 --> 00:13:25.559
of alphanumerics and underscores, but in Lisp, a symbol can
00:13:25.560 --> 00:13:30.399
be almost any character sequence. I've been stewing on how
00:13:30.400 --> 00:13:34.479
to do this. I have ideas like prompting during the first
00:13:34.480 --> 00:13:37.719
query or scanning the language based on the nearest code
00:13:37.720 --> 00:13:40.479
block. I think I'm babbling.
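NOTE
One of the ideas mentioned above, picking the language from the nearest code block, could be sketched as follows. This is an illustration under stated assumptions, not the speaker's implementation: every my/ name is hypothetical, only the block at or before point is considered, and both regexps are deliberate simplifications of what each language accepts as a symbol.
```elisp
;; Sketch only: all `my/' names and both regexps are illustrative assumptions.
(defvar my/org-symbol-regexps
  '(("python" . "[A-Za-z_][A-Za-z0-9_]*")
    ("emacs-lisp" . "[^][()\"';`, \t\n]+"))
  "Hypothetical map from a source block language to a regexp matching its symbols.")
(defun my/org-nearest-block-language ()
  "Return the language of the closest source block at or before point, or nil."
  (save-excursion
    (let ((case-fold-search t))
      ;; Start from the end of the line so a #+begin_src line at point is found.
      (end-of-line)
      (when (re-search-backward "^[ \t]*#\\+begin_src[ \t]+\\([^ \t\n]+\\)" nil t)
        (match-string-no-properties 1)))))
(defun my/org-symbol-regexp ()
  "Return the symbol regexp for the nearest block's language, defaulting to Lisp."
  (or (cdr (assoc (my/org-nearest-block-language) my/org-symbol-regexps))
      (cdr (assoc "emacs-lisp" my/org-symbol-regexps))))
```
With the cursor in prose below a Python block, my/org-symbol-regexp would return the Python pattern. In a file with no blocks before point, it falls back to the Lisp one.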
NOTE Why literate programming?
00:13:40.480 --> 00:13:47.199
In true geek fashion, I dived into the details before
00:13:47.200 --> 00:13:52.079
answering some better questions. In my original Literate
00:13:52.080 --> 00:13:55.479
DevOps talk, I explained the advantages of initially
00:13:55.480 --> 00:13:58.959
writing down your thoughts, your plans, goals... the
00:13:58.960 --> 00:14:02.879
user requirements. But what do you do with all that luscious
00:14:02.880 --> 00:14:06.359
prose afterwards? Well, you do the same thing you do to your
00:14:06.360 --> 00:14:09.279
initial code. You refactor that prose.
00:14:09.280 --> 00:14:14.759
Just because the text surrounding your code is now a
00:14:14.760 --> 00:14:18.799
first-class citizen doesn't excuse bad code. You want
00:14:18.800 --> 00:14:23.165
something more from both your code and your prose.
NOTE LP prose isn't comments
00:14:23.166 --> 00:14:25.586
The prose of your literate program isn't
00:14:25.587 --> 00:14:28.667
just regurgitation of the code in the block.
00:14:28.668 --> 00:14:31.527
You want something more helpful.
00:14:31.528 --> 00:14:35.736
You're really writing a research paper to yourself.
00:14:35.737 --> 00:14:38.577
I know what you're thinking. You've seen my Git repos.
00:14:38.578 --> 00:14:41.858
I'm guilty and not always the best example.
00:14:41.859 --> 00:14:44.559
However, I do get great joy
00:14:44.560 --> 00:14:48.680
when I see someone ask about something in Emacs
00:14:48.681 --> 00:14:51.041
and my response is little more than a link
00:14:51.042 --> 00:14:55.799
to my online repo that I've rendered as a website.
NOTE Summary
00:14:55.800 --> 00:15:01.199
I'm out of time. I hope this has been interesting
00:15:01.200 --> 00:15:04.359
philosophically as well as practically, as I think
00:15:04.360 --> 00:15:08.559
literate programming is the cat's meow. I'm afraid this
00:15:08.560 --> 00:15:11.879
summary slide is about my home-baked solutions that fit my
00:15:11.880 --> 00:15:15.119
needs, but hopefully you can recognize your pain points and
00:15:15.120 --> 00:15:17.839
address them. If you don't need my Literate
00:15:17.840 --> 00:15:21.479
DevOps-specific techniques for connecting code blocks, I
00:15:21.480 --> 00:15:25.799
suggest using sessions by default. I highly recommend
00:15:25.800 --> 00:15:28.399
looking at your workflow and writing snippets to give you
00:15:28.400 --> 00:15:33.159
less typing for Org blocks. I now jump by headlines in my
00:15:33.160 --> 00:15:37.479
projects, but extending xref to support Org files made
00:15:37.480 --> 00:15:40.159
literate programming as easy as programming the
00:15:40.160 --> 00:15:44.319
old-fashioned way. I do need to make it more general to put up
00:15:44.320 --> 00:15:47.722
on MELPA, though. Thanks for watching.
00:15:47.723 --> 00:15:51.240
Happy hacking, my friends.