WEBVTT 00:00.000 --> 00:00:01.520 My name is Greg Coladonato, 00:00:01.520 --> 00:00:03.199 and this is my presentation named 00:00:03.199 --> 00:00:04.560 One Effective Computer Science 00:00:04.560 --> 00:00:06.480 Grad Student Workflow. 00:06.480 --> 00:00:07.680 For self-introduction, 00:00:07.680 --> 00:00:09.599 I've been an Emacs user since 1989 00:00:09.599 --> 00:00:11.599 when I was an undergrad in computer science, 00:00:11.599 --> 00:00:13.040 and I'm still an Emacs user 00:00:13.040 --> 00:00:15.280 now I'm getting a master's of science 00:00:15.280 --> 00:00:16.880 in computer science. 00:16.880 --> 00:00:17.760 In my day job, 00:00:17.760 --> 00:00:19.199 I work in product management 00:00:19.199 --> 00:00:20.640 in a Silicon Valley 00:00:20.640 --> 00:00:21.840 computer vision startup, 00:00:21.840 --> 00:00:22.880 and I'm proud to say 00:00:22.880 --> 00:00:25.039 I've been submitting my first PRs 00:00:25.039 --> 00:00:27.038 to open source projects this year. 00:27.039 --> 00:00:29.199 The goals of my workflow are first 00:00:29.199 --> 00:00:30.800 to make my notes easily accessible 00:00:30.800 --> 00:00:33.280 and searchable. 00:33.280 --> 00:00:34.800 Second goal, provide a way for me 00:00:34.800 --> 00:00:36.480 to permanently remember what I learned, 00:00:36.480 --> 00:00:38.879 and thirdly, to enable conceptual linking 00:38.879 --> 00:00:40.480 between related topics and entities. 00:00:40.480 --> 00:00:41.920 I'll give examples of each of these 00:00:41.920 --> 00:00:43.119 as we go along. 
00:00:43.120 --> 00:00:45.120 The requirements of my workflow: 00:45.120 --> 00:00:47.920 it needs to be tightly integrated with PDFs, 00:00:47.920 --> 00:00:50.399 as most of the documents I get from grad school 00:00:50.399 --> 00:00:51.440 are in PDF form, 00:00:51.440 --> 00:00:53.760 most of my submissions of work 00:00:53.760 --> 00:00:54.719 are in PDF form, 00:00:54.719 --> 00:00:56.640 and most research papers I have access to 00:00:56.640 --> 00:00:58.399 are in PDF form as well. 00:58.399 --> 00:01:00.320 I want my workflow to be subscription-free. 00:01:00.320 --> 00:01:01.840 I do not want to be locked into 01:01.840 --> 00:01:03.120 paying a subscription 00:01:03.120 --> 00:01:04.799 just to read my own notes. 00:01:04.799 --> 00:01:06.720 It must be future proof. 00:01:06.720 --> 00:01:09.600 I have used note-taking systems in the past 01:09.600 --> 00:01:12.960 that I now no longer have a way to decode, 00:01:12.960 --> 00:01:14.640 so they're locked into some format 00:01:14.640 --> 00:01:17.200 that I can no longer use. 01:17.200 --> 00:01:19.119 I want my notes to be version-controlled, 01:19.119 --> 00:01:20.479 so that if I make a big mistake, 00:01:20.479 --> 00:01:22.080 I can undo and revert 00:01:22.080 --> 00:01:23.840 to a prior good version, 01:23.840 --> 00:01:27.680 and I want my system to use spaced repetition, 00:01:27.680 --> 00:01:29.520 which is an advanced method 00:01:29.520 --> 00:01:31.840 of learning things over time 00:01:31.840 --> 00:01:33.999 so that you don't forget them. 01:34.000 --> 00:01:36.799 The package dependencies, in brief. 00:01:36.799 --> 00:01:38.960 org-mode, org-roam, org-roam-bibtex, 00:01:38.960 --> 00:01:42.719 pdf-tools, org-noter and org-ref. 01:42.720 --> 00:01:45.119 And now let's get on to some demos. 01:45.119 --> 00:01:47.520 Here in my browser window here 00:01:47.520 --> 00:01:49.680 is a lecture in the course 00:01:49.680 --> 00:01:51.840 I'm currently taking on deep learning. 
01:51.840 --> 00:01:54.240 It's very nice that the professor 01:54.240 --> 00:01:55.759 provides slides. So this is 00:01:55.759 --> 00:02:00.000 the 54-page PDF file of the slides 00:02:00.000 --> 00:02:02.079 for the lecture. The problem is, 00:02:02.079 --> 00:02:03.200 it's hard to take notes on them. 00:02:03.200 --> 00:02:04.560 It's impossible to take notes on them 00:02:04.560 --> 00:02:05.840 here in this browser, 00:02:05.840 --> 00:02:07.840 as far as I know. So what I've done is 00:02:07.840 --> 00:02:11.440 I've incorporated these slides as a PDF 02:11.440 --> 00:02:12.959 in org-roam, which... 00:02:12.959 --> 00:02:16.640 I will now visit this file 00:02:16.640 --> 00:02:19.120 and you can bring it up alongside the PDF 00:02:19.120 --> 00:02:20.560 I was just looking at here. 00:02:20.560 --> 00:02:23.200 So what I like about this system is, 02:23.200 --> 00:02:24.800 as I'm going through and reading 02:24.800 --> 00:02:26.720 watching the video of the lecture, 00:02:26.720 --> 00:02:29.599 I'm following along in the PDF notes here, 02:29.599 --> 00:02:31.680 and I'm taking my notes alongside them. 02:31.680 --> 00:02:34.400 So here's the first part of that lecture. 02:34.400 --> 00:02:36.319 You can't see at the bottom right now, 00:02:36.319 --> 00:02:38.800 but this is one of the earlier pages. 00:02:38.800 --> 00:02:42.400 I go to the second section here 00:02:42.400 --> 00:02:45.040 and you see that my notes 00:02:45.040 --> 00:02:46.640 for this part of the lecture, 02:46.640 --> 00:02:48.480 here, my notes here... 00:02:48.480 --> 00:02:49.599 I love how the notes 00:02:49.599 --> 00:02:50.959 for different parts of the lecture 00:02:50.959 --> 00:02:52.560 are coordinated with the different parts 02:52.560 --> 00:02:55.200 of the PDF that go along with the lecture. 02:55.200 --> 00:02:57.519 Now let's go back to the top of this 02:57.519 --> 00:03:01.840 and you'll see... First, you'll see my notes 03:01.840 --> 00:03:03.920 down here. 
I'll go into these 00:03:03.920 --> 00:03:06.319 a little bit more shortly, 00:03:06.319 --> 00:03:07.200 but one of the things 00:03:07.200 --> 00:03:08.959 that goes along with a lecture 00:03:08.959 --> 00:03:11.519 in a grad school class is these days 00:03:11.519 --> 00:03:13.680 in computer science citations 00:03:13.680 --> 00:03:14.640 for research papers 00:03:14.640 --> 00:03:16.480 that we're expected to read. 03:16.480 --> 00:03:20.080 So here's one entitled MixMatch. 03:20.080 --> 00:03:22.319 I haven't downloaded this paper yet, 03:22.319 --> 00:03:24.238 so let's go take a look at that. 00:03:24.239 --> 00:03:26.319 So I use a keystroke to select 00:03:26.319 --> 00:03:28.480 the title of the paper 00:03:28.480 --> 00:03:30.239 and another keybinding 00:03:30.239 --> 00:03:31.440 to search for that paper 00:03:31.440 --> 00:03:33.519 on a website called arXiv. 03:33.519 --> 00:03:35.280 arXiv, if you're not familiar-- 00:03:35.280 --> 00:03:36.400 and here's a match-- 00:03:36.400 --> 00:03:37.680 arXiv, if you're not familiar, 00:03:37.680 --> 00:03:42.000 is an open research server 03:42.000 --> 00:03:43.760 where researchers publish papers 00:03:43.760 --> 00:03:45.040 before they're published in journals 00:03:45.040 --> 00:03:47.920 or conferences, and they are copyright-free 03:47.920 --> 00:03:50.159 and open to anyone to read. 00:03:50.159 --> 00:03:52.799 So here is the paper I was looking for. 03:52.799 --> 00:03:58.560 I copy this link into an Org mode link, 00:03:58.560 --> 00:03:59.840 and I come back to Emacs, 00:03:59.840 --> 00:04:02.400 and now another keystroke 04:02.400 --> 00:04:04.879 will revisit that website, 00:04:04.879 --> 00:04:06.400 pull down the PDF, and pull down 00:04:06.400 --> 00:04:08.400 all the information in the bibliography 00:04:08.400 --> 00:04:11.040 and put it into a bibliography here, 04:11.040 --> 00:04:13.599 inside my local bibliography. 
00:04:13.599 --> 00:04:15.840 So here's the paper I was just looking at. 04:15.840 --> 00:04:17.840 Another great thing about a lot of PDFs 04:17.840 --> 00:04:20.320 is that they have an embedded outline 00:04:20.320 --> 00:04:24.160 that you can extract via the pdf-tools package. 04:24.160 --> 00:04:25.680 So now you see on the right here: 04:25.680 --> 00:04:27.360 introduction, related work, MixMatch, 04:27.360 --> 00:04:30.479 experiments. I can go right to that section, 04:30.479 --> 00:04:32.639 and this outline knows exactly 00:04:32.639 --> 00:04:33.759 which part of the PDF 00:04:33.759 --> 00:04:35.919 corresponds to each of the parts 00:04:35.919 --> 00:04:37.680 of this outline in the paper. 04:37.680 --> 00:04:40.240 So then, when I go take notes in here, 04:40.240 --> 00:04:41.280 just like in my other notes, 00:04:41.280 --> 00:04:43.040 it'll be coordinated with the PDF 00:04:43.040 --> 00:04:44.639 that goes along with it. 04:44.639 --> 00:04:48.080 So let's quit out of here. 00:04:48.080 --> 00:04:50.160 So now that I've captured that... 00:04:50.160 --> 00:04:53.199 Uh oh, this is the same paper. 04:53.199 --> 00:04:56.000 So now here I am back in my notes. 00:04:56.000 --> 00:04:58.000 Now that I've captured this paper, 04:58.000 --> 00:05:02.400 what I'm going to do is make it a link, 05:02.400 --> 00:05:07.520 so the org-roam node that I just took 00:05:07.520 --> 00:05:09.600 will be here at the top. MixMatch. 05:09.600 --> 00:05:10.639 There's a little difference. 00:05:10.639 --> 00:05:13.120 You'll see here, this m is a different case 00:05:13.120 --> 00:05:16.240 than this m, and that's one of my to-do list. 00:05:16.240 --> 00:05:18.720 I'd like to make it so that this search 00:05:18.720 --> 00:05:20.320 is a little less case-sensitive. 00:05:20.320 --> 00:05:23.520 So now I've linked this link to this paper 00:05:23.520 --> 00:05:25.680 into these notes, and now these are... 
00:05:25.680 --> 00:05:26.639 you'll see a little bit later 00:05:26.639 --> 00:05:29.360 how these links can be graphed and followed 00:05:29.360 --> 00:05:32.960 and so forth. While I'm in this document, 00:05:32.960 --> 00:05:33.680 I'd like to show you 00:05:33.680 --> 00:05:36.639 that when I'm learning something 05:36.639 --> 00:05:38.400 and I learn a new fact, 05:38.400 --> 00:05:40.320 I write down what I learned 00:05:40.320 --> 00:05:42.400 in the form of a question and an answer. 00:05:42.400 --> 00:05:45.039 So you can see here, there's a question 00:05:45.039 --> 00:05:46.800 that begins with who, what, where. 00:05:46.800 --> 00:05:49.360 It begins with a w word, or how, 05:49.360 --> 00:05:53.039 or if or is, and it ends in a question mark, 00:05:53.039 --> 00:05:54.960 and then following that is another string 00:05:54.960 --> 00:05:56.560 that ends in a period. 05:56.560 --> 00:05:58.240 So I have a... I'd like to do this 00:05:58.240 --> 00:05:59.280 in Emacs as well, but I haven't 00:05:59.280 --> 00:06:00.319 worked that out yet. 00:06:00.319 --> 00:06:04.639 I have a script that will... 06:04.639 --> 00:06:07.680 Let's find a-n-k-i-f. 06:07.680 --> 00:06:09.680 Okay, I have a script that will go through 00:06:09.680 --> 00:06:13.680 all the notes in my org-roam directory 06:13.680 --> 00:06:16.880 and find all the questions. 00:06:16.880 --> 00:06:20.720 Now let's pull up the most... 00:06:20.720 --> 00:06:24.319 No, don't edit the buffer. 06:24.319 --> 00:06:29.039 Save that. Come back to here. 06:29.039 --> 00:06:31.680 So now you can see that all the questions 00:06:31.680 --> 00:06:32.560 that I've written in my notes 06:32.560 --> 00:06:33.759 have now been ANKIFIED. 00:06:33.759 --> 00:06:34.880 Now what's that mean? 00:06:34.880 --> 00:06:40.960 Anki is this program here 06:40.960 --> 00:06:43.199 which is a flashcard system 00:06:43.199 --> 00:06:44.560 based on the idea... 
00:06:44.560 --> 00:06:48.000 No, let's not download that right now. 00:06:48.000 --> 00:06:50.720 This is a system that enables 00:06:50.720 --> 00:06:53.120 the easy creation of flash cards 06:53.120 --> 00:06:54.479 that show you the front, 00:06:54.479 --> 00:06:55.360 show you the back, 00:06:55.360 --> 00:06:56.160 and then you decide 00:06:56.160 --> 00:07:00.000 if you knew that question or not. 07:00.000 --> 00:07:02.639 So I don't want to spend much time on this, 00:07:02.639 --> 00:07:04.639 but everything I'm learning in a class, 00:07:04.639 --> 00:07:06.800 I write into my notes as a question 00:07:06.800 --> 00:07:08.800 that I load into this flashcard system 00:07:08.800 --> 00:07:10.880 that then I can review on a walk, 00:07:10.880 --> 00:07:13.680 or on a bus ride, or whatever, 07:13.680 --> 00:07:16.400 and stay on top of indefinitely. 00:07:16.400 --> 00:07:17.440 As long as I can continue 07:17.440 --> 00:07:18.400 to keep reviewing that, 00:07:18.400 --> 00:07:20.639 I will keep that information 00:07:20.639 --> 00:07:22.319 fresh in my mind. 07:22.319 --> 00:07:24.479 So now let's come out of these files 07:24.479 --> 00:07:25.039 back to here. 00:07:25.039 --> 00:07:27.440 So I've demoed class note PDFs, 00:07:27.440 --> 00:07:29.440 grabbing papers from arXiv, 07:29.440 --> 00:07:31.199 autogenerating the skeletons 00:07:31.199 --> 00:07:32.720 and the flashcards, 00:07:32.720 --> 00:07:35.280 and now let's see what it looks like. 07:35.280 --> 00:07:40.160 Let's visualize the connections 07:40.160 --> 00:07:42.000 between these nodes. 07:42.000 --> 00:07:45.199 So here is a graph for the file 00:07:45.199 --> 00:07:46.319 I'm reading right now: 00:07:46.319 --> 00:07:49.520 One Effective Grad Student Workflow. 07:49.520 --> 00:07:53.599 Here is the node I have a link to 00:07:53.599 --> 00:07:54.639 in my Org mode document 07:54.639 --> 00:07:57.199 on spaced repetition. 
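The question-and-answer convention described in the talk (a question that opens with a "w word", "how", "if" or "is" and ends in a question mark, followed by an answer that ends in a period) could be detected roughly as follows. This is a minimal Python sketch and not the speaker's script, which is not shown in the talk. The names `find_question_answer_pairs` and `collect_pairs`, the exact list of opening words, the `*.org` file pattern and the one-pair-per-line layout are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed question openers, based on the talk's description
# ("a w word, or how, or if or is"); the real script may use a different list.
QUESTION_STARTERS = ("who", "what", "when", "where", "which", "why", "how", "if", "is")

# One question ending in "?" followed on the same line by one answer ending in ".".
QA_PATTERN = re.compile(
    r"\b((?:%s)\b[^?\n]*\?)\s+([^?\n]*?\.)(?=\s|$)" % "|".join(QUESTION_STARTERS),
    re.IGNORECASE,
)


def find_question_answer_pairs(text):
    """Return (question, answer) tuples found in a block of note text."""
    return [(q.strip(), a.strip()) for q, a in QA_PATTERN.findall(text)]


def collect_pairs(notes_dir):
    """Scan every .org file directly inside notes_dir (hypothetical layout)."""
    pairs = []
    for path in sorted(Path(notes_dir).glob("*.org")):
        pairs.extend(find_question_answer_pairs(path.read_text(encoding="utf-8")))
    return pairs


sample = (
    "* Notes\n"
    "What is MixMatch? A semi-supervised learning method.\n"
    "This line is plain prose.\n"
    "How many pages are in the slides? 54.\n"
)
for question, answer in find_question_answer_pairs(sample):
    print(question, "->", answer)
```

Each pair could then be exported as the front and back of a flashcard. A pattern this simple will also pick up questions that begin in the middle of a sentence, so a real script would likely need stricter rules.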
We can open that 00:07:57.199 --> 00:07:59.280 and come right back to Emacs, 07:59.280 --> 00:08:01.680 and I just love that. 08:01.680 --> 00:08:03.919 For the more complicated topics, 00:08:03.919 --> 00:08:05.520 you can see connections between things 00:08:05.520 --> 00:08:07.520 that you maybe didn't realize you had, 00:08:07.520 --> 00:08:10.240 and some of the notes you've taken. 00:08:10.240 --> 00:08:12.638 And so I'm getting near the end. 00:08:12.639 --> 00:08:15.120 I just want to show some small customizations. 08:15.120 --> 00:08:17.120 I save my Org mode files 00:08:17.120 --> 00:08:18.479 that are in org-roam 00:08:18.479 --> 00:08:21.520 with a year year month month date prefix, 00:08:21.520 --> 00:08:24.639 so that I can tell when the node was created. 00:08:24.639 --> 00:08:26.560 I also truncate them at 30 characters, 00:08:26.560 --> 00:08:27.919 so that when I do an ls, 00:08:27.919 --> 00:08:29.280 they don't word wrap. 00:08:29.280 --> 00:08:32.800 Maybe that's OCD. 08:32.800 --> 00:08:38.159 I also use an ID format that is year month 00:08:38.159 --> 00:08:40.479 day hour month hour minute second 08:40.479 --> 00:08:43.279 rather than the full UUID format 00:08:43.279 --> 00:08:44.720 because that number up there, 00:08:44.720 --> 00:08:46.160 that ID makes sense to me 00:08:46.160 --> 00:08:50.160 and it gives me an idea of when that node-- 08:50.160 --> 00:08:51.040 which you can, by the way, 00:08:51.040 --> 00:08:55.040 you can have--even one of these subheadings 00:08:55.040 --> 00:08:56.240 can be a node in org-roam. 00:08:56.240 --> 00:08:57.120 So now that you can see 08:57.120 --> 00:08:59.439 that was created right now. 08:59.440 --> 00:09:00.640 Some of the TODOs I still have 00:09:00.640 --> 00:09:02.720 in this system... 
We don't have to go 00:09:02.720 --> 00:09:04.000 too much into them, but I mentioned 00:09:04.000 --> 00:09:07.600 case insensitivity, and I'd like 00:09:07.600 --> 00:09:10.080 to make some improvements to org-noter. 00:09:10.080 --> 00:09:12.240 At this point, I'd just like to... 09:12.240 --> 00:09:14.959 I have a list of people I'd like to thank. 00:09:14.959 --> 00:09:16.240 I'm not gonna read the whole list out, 00:09:16.240 --> 00:09:17.680 but they're a bunch of software engineers 00:09:17.680 --> 00:09:20.399 that created great free software 00:09:20.399 --> 00:09:21.519 that's very useful to me 00:09:21.519 --> 00:09:23.839 and I use every day, so thank you to them, 00:09:23.839 --> 00:09:27.080 and thank you all for listening to my talk. 00:09:27.080 --> 00:09:28.080 [captions by sachac]
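The two naming customizations mentioned in the talk, a date-prefixed file name truncated at 30 characters and a timestamp used as the node ID in place of a UUID, can be illustrated with a short Python sketch. The speaker's actual Emacs configuration is not shown, so everything here is an assumption: the function names `timestamp_id` and `note_filename`, the underscore slug rule, the hyphen after the date, and the choice to apply the 30-character limit before the `.org` extension. The date used in the example is arbitrary.

```python
import re
from datetime import datetime


def timestamp_id(moment):
    """Year, month, day, hour, minute, second as one readable ID string."""
    return moment.strftime("%Y%m%d%H%M%S")


def note_filename(title, moment, max_length=30):
    """Date-prefixed file name, cut to max_length characters before the extension.

    The slug rule (lowercase, underscores) is a hypothetical choice.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    stem = f"{moment.strftime('%Y%m%d')}-{slug}"
    return stem[:max_length] + ".org"


moment = datetime(2021, 11, 27, 9, 30, 5)  # arbitrary example date
node_id = timestamp_id(moment)
filename = note_filename("One Effective Computer Science Grad Student Workflow", moment)
print(node_id)
print(filename)
```

An ID like this sorts chronologically and can be read at a glance, which matches the stated reason for preferring it over a UUID. The trade-off is that two nodes created within the same second would collide.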