author EmacsConf <emacsconf-org@gnu.org> 2023-12-02 13:50:17 -0500
committer EmacsConf <emacsconf-org@gnu.org> 2023-12-02 13:50:17 -0500
commit 5e5b46d8db74f12f8639c684082d4691eaffc030 (patch)
tree 38d7e01d67ddc3390a43d0dd99d67ee8fa998802
parent 8687f40e1d5af25092655bf0afacb5bdc1f247b9 (diff)
Automated commit
-rw-r--r-- 2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main--chapters.vtt 23
-rw-r--r-- 2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.vtt 1176
-rw-r--r-- 2023/info/collab-after.md 395
-rw-r--r-- 2023/info/collab-before.md 11
4 files changed, 1604 insertions, 1 deletions
diff --git a/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main--chapters.vtt b/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main--chapters.vtt
new file mode 100644
index 00000000..dca4982e
--- /dev/null
+++ b/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main--chapters.vtt
@@ -0,0 +1,23 @@
+WEBVTT
+
+
+00:00:00.000 --> 00:01:16.079
+Introduction
+
+00:01:16.080 --> 00:02:18.959
+Org Mode
+
+00:02:18.960 --> 00:06:27.839
+Working together
+
+00:06:27.840 --> 00:08:04.039
+Data cleaning
+
+00:08:04.040 --> 00:12:36.039
+Processing
+
+00:12:36.040 --> 00:14:01.759
+Visualization
+
+00:14:01.760 --> 00:19:07.280
+Preserve
diff --git a/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.vtt b/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.vtt
new file mode 100644
index 00000000..1dcc0b22
--- /dev/null
+++ b/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.vtt
@@ -0,0 +1,1176 @@
+WEBVTT captioned by amine, checked by sachac
+
+NOTE Introduction
+
+00:00.000 --> 00:00:01.874
+[Lukas]: Welcome to our presentation,
+
+00:00:01.875 --> 00:00:03.599
+Collaborative Data Processing
+
+00:03.600 --> 00:06.039
+and Documenting using org-babel.
+
+00:06.040 --> 00:07.759
+My name is Lukas Bossert, and I'm
+
+00:07.760 --> 00:00:09.740
+from the RWTH Aachen University
+
+00:00:09.741 --> 00:00:12.519
+in the city of Aachen, Germany.
+
+00:12.520 --> 00:14.839
+[Jonathan]: And my name is Jonathan Hartman.
+
+00:14.840 --> 00:18.719
+I'm also from the IT Center here at RWTH Aachen.
+
+00:18.720 --> 00:19.239
+[Lukas]: Great.
+
+00:19.240 --> 00:21.679
+And we will show you today how you
+
+00:21.680 --> 00:25.399
+can use Org Mode for data processing.
+
+00:25.400 --> 00:27.999
+So you see a little workflow what we are going to do.
+
+00:28.000 --> 00:31.199
+First, we will give you a slight introduction to Org Mode.
+
+00:31.200 --> 00:34.639
+Then we will dive into the part of data preparing.
+
+00:34.640 --> 00:38.679
+First, we're going to query the data using the language SPARQL.
+
+00:38.680 --> 00:41.759
+Then we're going to clean it using a different language.
+
+00:41.760 --> 00:44.279
+And in the main part of our presentation,
+
+00:44.280 --> 00:48.119
+we're going to do the data processing, first aggregating
+
+00:48.120 --> 00:52.519
+using Python, later on counting items using awk,
+
+00:52.520 --> 00:56.360
+and even visualizing it using R. At the end,
+
+00:56.400 --> 00:58.959
+we're going to show you how to preserve
+
+00:58.960 --> 01:01.759
+the data and the document and its documentation,
+
+01:01.760 --> 01:06.599
+first doing in plain exporting, then adding some metadata,
+
+01:06.600 --> 01:09.759
+and showing you two different ways, first a manual export,
+
+01:09.760 --> 01:13.359
+and also then a batch-processed export.
+
+01:13.360 --> 01:14.239
+All right.
+
+01:14.240 --> 01:16.079
+Let's dive in to that.
+
+NOTE Org Mode
+
+01:16.080 --> 01:19.919
+Jonathan, can you give us an introduction about Org Mode?
+
+01:19.920 --> 01:20.439
+[Jonathan]: Of course.
+
+01:20.440 --> 01:23.079
+So in case anyone isn't familiar with it,
+
+01:23.080 --> 01:25.879
+Org Mode, in the words of Carsten Dominik,
+
+01:25.880 --> 01:28.559
+is back to the future for plain text.
+
+01:28.560 --> 01:31.439
+So this is just a module available for Emacs,
+
+01:31.440 --> 01:32.519
+plain-text based.
+
+01:32.520 --> 01:34.919
+It's been around since 2003, which
+
+01:34.920 --> 01:36.799
+makes it about 20 years old.
+
+01:36.800 --> 01:40.159
+And it's extensible and fully customizable.
+
+01:40.160 --> 01:43.999
+And especially, it's very convenient, very good
+
+01:44.000 --> 01:46.719
+for scientific text production and organization.
+
+01:46.720 --> 01:49.439
+So for example, you can do project management, agenda,
+
+01:49.440 --> 01:52.559
+diary, journaling, personal knowledge management,
+
+01:52.560 --> 01:53.359
+presentation.
+
+01:53.360 --> 01:55.520
+Even this is written in Org Mode.
+
+01:55.560 --> 01:57.439
+It's an Org Mode presentation.
+
+01:57.440 --> 01:59.199
+You can do single source publishing,
+
+01:59.200 --> 02:01.679
+which we will do later on, and also
+
+02:01.680 --> 02:06.479
+literate programming, which is the core of our talk.
+
+02:06.480 --> 02:06.999
+OK.
+
+02:07.000 --> 02:10.799
+[Lukas]: So let me stop this presentation here.
+
+02:10.800 --> 02:14.719
+So what you see here is the plain text underneath it.
+
+02:14.720 --> 02:18.959
+So this is Org Mode.
+
+NOTE Working together
+
+02:18.960 --> 02:21.919
+And Jonathan, since we kind of already
+
+02:21.920 --> 02:25.320
+did the introduction together, should we
+
+02:26.120 --> 00:02:28.760
+also do the working part together?
+
+00:02:28.761 --> 00:02:29.700
+[Jonathan]: Of course.
+
+00:02:29.701 --> 00:02:33.119
+So you see on the screen there on the right,
+
+00:02:33.120 --> 00:02:35.060
+that's my screen in Emacs.
+
+00:02:35.061 --> 00:02:39.520
+And Lukas, why don't you host a session using CRDT,
+
+00:02:39.521 --> 00:02:41.200
+and I'll connect to your buffer.
+
+00:02:41.201 --> 00:02:42.560
+[Lukas]: OK. Great.
+
+00:02:42.561 --> 00:02:43.280
+I do that.
+
+00:02:43.281 --> 00:02:46.180
+So what I do, I'm using Doom Emacs.
+
+00:02:46.181 --> 00:02:49.307
+And I can use the `SPC` and then the `l`
+
+00:02:49.308 --> 00:02:52.140
+for the live share/collab part.
+
+00:02:52.141 --> 02:57.999
+I can use the `s` for share current buffer.
+
+02:58.000 --> 00:03:01.559
+So when I do this, I'm getting asked for some settings.
+
+00:03:01.560 --> 00:03:04.439
+I'm going with the default settings here.
+
+00:03:04.440 --> 00:03:08.340
+So default port, no password, and my display name.
+
+00:03:08.341 --> 00:03:11.940
+And now Emacs is connecting.
+
+00:03:11.941 --> 00:03:15.179
+And once it's connected, which just takes a couple of seconds,
+
+00:03:15.180 --> 00:03:17.239
+I can get the URL.
+
+00:03:17.240 --> 03:20.800
+So I'm going back to this menu and using `y`
+
+03:21.160 --> 03:23.999
+for copying the URL of the current session.
+
+03:24.000 --> 03:27.799
+And this is the URL I'm going to send over to you, Jonathan,
+
+03:27.800 --> 03:29.079
+to pick that up.
+
+03:29.080 --> 03:29.599
+[Jonathan]: Right.
+
+03:29.600 --> 03:30.079
+OK.
+
+03:30.080 --> 00:03:36.999
+And now on my screen, I'm going to do a `SPC l c` for connect.
+
+00:03:37.000 --> 00:03:38.740
+And I'm going to paste the URL
+
+00:03:38.741 --> 00:03:40.040
+that Lukas just sent me in here.
+
+00:03:40.980 --> 03:43.719
+Default port, no password.
+
+03:43.720 --> 00:03:45.440
+And we're connecting now.
+
+00:03:45.700 --> 03:48.600
+So this takes a second just to get us synced up.
+
+03:51.600 --> 00:03:54.160
+So we can work on the same document at the same time.
+
+00:03:54.161 --> 03:56.639
+We can follow each other's cursors around.
+
+03:56.640 --> 03:58.839
+We can have multiple buffers open and work on them
+
+03:58.840 --> 04:00.999
+at the same time.
+
+04:01.000 --> 04:04.719
+And so here you see that we are both in the same document.
+
+04:04.720 --> 04:06.280
+You can see my cursor popping around.
+
+04:09.040 --> 04:13.279
+And you can see we're both editing the same item.
+
+04:13.280 --> 04:14.039
+Great.
+
+04:14.040 --> 04:18.039
+[Lukas]: So we also see who else is currently in our buffer
+
+04:18.040 --> 04:20.199
+with the user overview.
+
+04:20.200 --> 04:23.559
+So let me just delete that window.
+
+04:23.560 --> 04:26.079
+And that's going to work in our main one.
+
+04:26.080 --> 04:29.599
+So we said first part is about data retrieval.
+
+04:29.600 --> 04:32.720
+So we should give it a headline.
+
+04:37.080 --> 04:39.239
+We said prepare stage.
+
+04:39.240 --> 04:42.319
+So what are we going to do first, Jonathan?
+
+04:42.320 --> 00:04:43.940
+[Jonathan]: So what we're going to do,
+
+00:04:43.941 --> 00:04:45.399
+what this whole document is based upon,
+
+04:45.400 --> 04:50.119
+is we're going to pull data from Wikidata using a SPARQL query.
+
+04:50.120 --> 04:53.519
+The data we're going to pull is related to the NFDIs,
+
+04:53.520 --> 04:55.639
+which here in Germany is the Nationale
+
+04:55.640 --> 05:00.679
+Forschungsdateninfrastruktur, which is a sort of collection of universities
+
+05:00.680 --> 05:03.399
+that work together on various research projects.
+
+05:03.400 --> 05:05.599
+And this is emblematic of the kind of data
+
+05:05.600 --> 05:09.239
+that we would be interested in working with here.
+
+05:09.240 --> 05:13.359
+So I'm going to paste a--forgive the pre-written code--
+
+05:13.360 --> 05:19.840
+I'm going to paste some text in here.
+
+05:20.040 --> 00:05:21.407
+[Lukas]: And while you are talking, I just
+
+00:05:21.408 --> 00:05:23.359
+keep on documenting what we do
+
+00:05:23.360 --> 00:05:25.880
+so we can split the work.
+
+05:27.360 --> 05:29.679
+[Jonathan]: In here, after a minor technical upset,
+
+05:29.680 --> 05:32.559
+is the raw dataset cell.
+
+05:32.560 --> 00:05:34.740
+And it's going to use SPARQL,
+
+00:05:34.741 --> 00:05:37.174
+which is how we have the syntax highlighting
+
+00:05:37.175 --> 00:05:37.940
+in our code here.
+
+00:05:37.941 --> 05:40.639
+It's going to go to the URL endpoint
+
+05:40.640 --> 05:43.639
+query.wikidata.org/sparql ,
+
+05:43.640 --> 05:46.799
+and it's going to return the data as a text CSV,
+
+05:46.800 --> 05:49.279
+and it's going to cache that data
+
+05:49.280 --> 05:51.439
+so that we don't constantly hammer the API every time
+
+05:51.440 --> 05:54.239
+we run this notebook.
+
+05:54.240 --> 00:05:57.360
+So I'm going to run that there.
+
+00:05:57.361 --> 05:58.799
+You can see down at the bottom of my screen,
+
+05:58.800 --> 06:00.840
+we're contacting the host query.wikidata.org .
+
+06:05.720 --> 06:07.319
+[Lukas]: And there's the result.
+
+06:07.320 --> 06:11.799
+[Jonathan]: Yeah, except I think that for our purposes here,
+
+06:11.800 --> 06:15.279
+we're just going to limit this to 50 results.
+
+06:15.280 --> 06:16.279
+[Lukas]: Oh, yeah.
+
+06:16.280 --> 06:18.679
+[Jonathan]: Just so it's a little easier for us to manage.
+
+06:18.680 --> 06:20.719
+I'm going to run that again.
+
+06:20.720 --> 06:21.519
+There we go.
+
+06:21.520 --> 00:06:22.319
+That looks a little better.
+
+00:06:22.320 --> 00:06:23.159
+[Lukas]: I think that's fine.
+
+00:06:23.160 --> 00:06:25.359
+50 items is fine.
+
+00:06:25.360 --> 06:27.839
+So what do we see here, Jonathan?
+
+NOTE Data cleaning
+
+06:27.840 --> 06:28.319
+[Jonathan]: Right.
+
+06:28.320 --> 06:31.239
+So the first thing we see when we look at this
+
+06:31.240 --> 00:06:33.307
+is a couple of Q codes at the top,
+
+00:06:33.308 --> 00:06:36.079
+which are an artifact of Wikidata.
+
+06:36.080 --> 06:39.519
+So these are pages which don't have
+
+06:39.520 --> 06:42.519
+the label for whichever institution they happen to be.
+
+06:42.520 --> 06:45.919
+For our purposes here, we're just going to exclude them.
+
+06:45.920 --> 06:48.199
+We could just go on Wikidata and edit them ourselves.
+
+06:48.200 --> 06:50.399
+But for now, it's a little more interesting
+
+06:50.400 --> 06:52.519
+if we go and remove them.
+
+06:52.520 --> 06:55.159
+So I'm going to create a new cell.
+
+06:55.160 --> 06:58.279
+Lukas, if you don't mind starting one for data cleaning.
+
+06:58.280 --> 06:58.879
+[Lukas]: Oh, yeah.
+
+06:58.880 --> 06:59.479
+Good point.
+
+06:59.480 --> 07:02.039
+Yeah, data cleaning.
+
+07:02.040 --> 07:03.439
+OK.
+
+07:03.440 --> 00:07:05.499
+How do you want to do that, Jonathan?
+
+00:07:05.500 --> 07:09.759
+[Jonathan]: I'm going to use a shell command.
+
+07:09.760 --> 07:11.119
+So let's see.
+
+07:11.120 --> 07:12.999
+There we go.
+
+07:13.000 --> 07:15.159
+And so you can see, here is another cell,
+
+07:15.160 --> 07:20.039
+that the cell is now using a shell,
+
+07:20.040 --> 00:07:23.799
+and that we have this thing `:var input=raw-dataset`,
+
+00:07:23.800 --> 00:07:25.840
+which is the name of the cell above
+
+00:07:25.841 --> 00:07:28.439
+where we got our data from Wikidata.
+
+07:28.440 --> 07:31.679
+This is going to run just a simple shell command.
+
+07:31.680 --> 07:33.959
+It's going to take the input and then run `sed` on it
+
+07:33.960 --> 00:07:37.039
+and exclude any records which have a Q
+
+00:07:37.040 --> 00:07:41.279
+followed by one or more digits afterwards.
+
+07:41.280 --> 07:43.960
+That should remove those from our data set.
+
+07:44.000 --> 07:45.400
+So I'm going to run that.
+
+07:48.640 --> 07:51.039
+That seems to have done the trick.
+
+07:51.040 --> 07:51.879
+[Lukas]: Great, yeah.
+
+07:51.880 --> 07:52.919
+That's really good.
+
+07:52.920 --> 07:55.399
+We got rid of all the Q items.
+
+07:55.400 --> 07:55.919
+Very good.
+
+07:55.920 --> 07:59.959
+So we just have two-column table: institutions
+
+07:59.960 --> 08:02.759
+and consortia.
+
+08:02.760 --> 08:04.039
+Very nice.
+
+NOTE Processing
+
+08:04.040 --> 08:08.719
+So let's come to our main part, doing some processing.
+
+08:08.720 --> 08:13.560
+Let me give you a headline here, process the data.
+
+08:13.640 --> 08:15.519
+What do you want to do first?
+
+08:15.520 --> 08:17.599
+[Jonathan]: This is not a very complicated data set,
+
+08:17.600 --> 08:19.439
+but let's just do some simple counts first.
+
+08:19.440 --> 08:22.199
+I'm going to start with Python,
+
+08:22.200 --> 08:25.239
+and we're just going to do some aggregation with Python.
+
+08:25.240 --> 08:30.039
+Again, I've got some pre-written code here.
+
+08:30.040 --> 08:34.999
+You can see that we've started a cell using Python.
+
+08:35.000 --> 08:37.879
+The variable `clean_df` now is equal to `clean-dataset`.
+
+08:37.880 --> 00:08:39.707
+So we're going to take that data
+
+00:08:39.708 --> 00:08:41.039
+that we retrieved from the SPARQL query,
+
+08:41.040 --> 08:42.680
+we're going to run it through the cleaning cell,
+
+08:42.720 --> 08:45.239
+and then we're going to import it into this cell.
+
+08:45.240 --> 08:47.839
+This is just going to do some simple Python aggregation.
+
+08:47.840 --> 00:08:49.007
+We're going to import `pandas`,
+
+00:08:49.008 --> 00:08:51.307
+which is the Python data science library,
+
+00:08:51.308 --> 00:08:54.839
+create a data frame out of our input,
+
+08:54.840 --> 08:57.479
+and then aggregate it, grouping on `wLabel`,
+
+08:57.480 --> 08:59.959
+and getting a count from that and returning it.
+
+08:59.960 --> 09:01.640
+So if we execute that cell...
+
+09:05.040 --> 09:08.879
+[Lukas]: Nice, we get institutions and a count.
+
+09:08.880 --> 09:14.119
+But what about not ordering it by the alphabet,
+
+09:14.120 --> 09:17.079
+but more like ordering by counts?
+
+09:17.080 --> 09:18.439
+[Jonathan]: Sure.
+
+09:18.440 --> 09:22.839
+So let's do this... `sort_values()`, I think, as the Python.
+
+09:22.840 --> 09:24.919
+How does that look?
+
+09:24.920 --> 00:09:27.640
+[Lukas]: Better, but I would like to
+
+00:09:27.641 --> 00:09:29.239
+have the highest number first
+
+09:29.240 --> 09:32.239
+and then ascending.
+
+09:32.240 --> 09:34.719
+Well, not ascending, descending.
+
+09:34.720 --> 09:37.600
+[Jonathan]: Right, so we can do `ascending=False`.
+
+09:39.880 --> 09:42.559
+[Lukas]: This is perfect, I'd say.
+
+09:42.560 --> 09:43.079
+[Jonathan]: Great.
+
+09:43.080 --> 09:44.079
+[Lukas]: Very good.
+
+09:44.080 --> 00:09:46.799
+OK, that's nice.
+
+00:09:46.800 --> 09:47.999
+We get a good overview here.
+
+09:48.000 --> 09:50.079
+But can we also do something else,
+
+09:50.080 --> 09:56.079
+like counting how many institutions are
+
+09:56.080 --> 09:57.799
+involved in one consortium?
+
+09:57.800 --> 10:00.879
+And also using this later on in the text?
+
+10:00.880 --> 00:10:00.880
+[Jonathan]: Sure, so I'm going to put a new...
+
+00:10:00.881 --> 00:10:05.040
+If you give me another heading down here
+
+00:10:05.041 --> 00:10:08.320
+for institutions per consortium...
+
+10:12.080 --> 10:16.799
+And here we're going to use awk code just to spice things up
+
+10:16.800 --> 10:18.959
+and add yet another language in here.
+
+10:18.960 --> 10:22.439
+So you can see this is awk.
+
+10:22.440 --> 10:26.279
+We're using standard input instead of defining a variable.
+
+10:26.280 --> 10:28.359
+But the really interesting thing about this cell
+
+10:28.360 --> 00:10:33.399
+is that we have this `:var consortium="NFDI4Memory"`.
+
+10:33.400 --> 00:10:35.640
+And what this code is doing is
+
+00:10:35.641 --> 00:10:38.040
+it's counting any time it sees
+
+00:10:38.041 --> 00:10:40.279
+that particular consortium name
+
+10:40.280 --> 10:41.759
+and keeping track of that.
+
+10:41.760 --> 00:10:43.907
+So if we execute this,
+
+00:10:43.908 --> 00:10:45.919
+Lukas, why don't you execute this one?
+
+10:45.920 --> 10:49.399
+[Lukas]: OK, I'm going to enter it.
+
+10:49.400 --> 10:52.439
+And I get a result, NFDI4Memory,
+
+10:52.440 --> 10:58.239
+because this is our default value for this variable.
+
+10:58.240 --> 10:59.439
+And we get the count.
+
+10:59.440 --> 00:11:01.640
+So it's five institutions are involved
+
+00:11:01.641 --> 00:11:04.639
+in the NFDI4Memory consortium.
+
+11:04.640 --> 11:07.839
+Great, but the very nice thing, what I think,
+
+11:07.840 --> 11:12.519
+is here that we can use this code snippet within our text.
+
+11:12.520 --> 11:14.279
+So, blended in seamlessly.
+
+11:14.280 --> 11:16.199
+Let me give you an example.
+
+11:16.200 --> 11:18.919
+I'm writing out the text.
+
+11:18.920 --> 11:27.599
+Now we know how many institutions are in...
+
+11:27.600 --> 11:29.239
+Give me an example.
+
+11:29.240 --> 11:31.480
+I would like to know how many institutions are
+
+11:31.560 --> 11:35.079
+involved in NFDI4Objects, which is a consortium.
+
+11:35.080 --> 11:39.239
+So I'm writing `call_` and using
+
+11:39.240 --> 00:11:42.607
+the name of this snippet here, of this cell,
+
+00:11:42.608 --> 00:11:46.607
+which is `inst-count(`,
+
+00:11:46.608 --> 00:11:51.719
+and writing my value, `NFDI4Objects`.
+
+11:51.720 --> 11:57.999
+As soon as I evaluate this using `C-c C-c`,
+
+11:58.000 --> 12:00.279
+I get the result back here.
+
+12:00.280 --> 12:05.159
+I can do this even for more.
+
+12:05.160 --> 12:14.039
+Or in writing, `call_inst-count`, go with `NFDI4Earth`,
+
+12:14.040 --> 12:16.799
+which is another consortium.
+
+12:16.800 --> 12:20.559
+`C-c C-c`, it's three institutions.
+
+12:20.560 --> 12:23.439
+This can be used throughout your text,
+
+12:23.440 --> 12:26.639
+and as soon as the data set changes from in the beginning,
+
+12:26.640 --> 12:30.399
+maybe different results querying Wikidata,
+
+12:30.400 --> 12:35.079
+this also will be updated once it's exported.
+
+12:35.080 --> 12:36.039
+Very nice, Jonathan.
+
+NOTE Visualization
+
+12:36.040 --> 00:12:38.974
+But I think we did a lot of analysis
+
+00:12:38.975 --> 00:12:41.079
+on text and counting things.
+
+12:41.080 --> 12:43.679
+Can we also do something more visual?
+
+12:43.680 --> 12:45.199
+Show me something.
+
+12:45.200 --> 12:45.759
+[Jonathan]: Sure.
+
+12:45.760 --> 12:48.639
+So what we can do with this, because we just
+
+12:48.640 --> 12:51.399
+have two columns here that are sort of related,
+
+12:51.400 --> 12:53.759
+we can build a little network plot out of it.
+
+12:53.760 --> 12:56.999
+So let's make a network visualization.
+
+12:57.000 --> 12:59.599
+We're going to use the `igraph` library from R
+
+12:59.600 --> 13:02.559
+and just plot the edges that we see here.
+
+13:02.560 --> 13:04.239
+There we go.
+
+13:04.240 --> 13:11.879
+There's my little heading and space.
+
+13:11.880 --> 13:13.479
+Here is our code.
+
+13:13.480 --> 13:16.039
+Again, just to be fancy and keep using
+
+13:16.040 --> 13:19.719
+different languages in here, we set a variable called
+
+13:19.720 --> 13:21.560
+`NFDI_edges` equal to `clean-dataset`.
+
+13:21.600 --> 13:23.399
+So this, again, is sort of cascading
+
+13:23.400 --> 00:13:25.740
+through the original data
+
+00:13:25.741 --> 00:13:28.807
+that we pulled from the Wikidata endpoint,
+
+00:13:28.808 --> 00:13:30.959
+cleaning that data, and now it's being inserted
+
+13:30.960 --> 13:32.959
+into this cell as well.
+
+13:32.960 --> 13:34.239
+But you see the difference here.
+
+13:34.240 --> 13:36.839
+Instead of exporting a table, what we're saying
+
+13:36.840 --> 13:39.239
+is that there will be a graphics file,
+
+13:39.240 --> 13:44.639
+and it will be called network-plot.png.
+
+13:44.640 --> 13:45.119
+All right.
+
+13:45.120 --> 13:47.959
+And so Lukas, why don't you execute this one?
+
+13:47.960 --> 13:48.759
+[Lukas]: There you go.
+
+13:48.760 --> 13:52.919
+I can click `C-c C-c`
+
+13:52.920 --> 13:59.159
+and I get a nice plot of the network below our cell.
+
+13:59.160 --> 14:01.759
+So this is very nice indeed.
+
+NOTE Preserve
+
+14:01.760 --> 14:05.199
+So I think it's about time to wrap it up and to export
+
+14:05.200 --> 14:07.959
+and to preserve the data and the documentation
+
+14:07.960 --> 14:13.079
+that we have in our very last step, calling preserve.
+
+14:13.080 --> 14:16.239
+So I would like to do it in two steps.
+
+14:16.240 --> 14:18.600
+First, maybe manually exporting it,
+
+14:18.800 --> 14:22.239
+but then also doing it in a batch process.
+
+14:22.240 --> 14:27.119
+Giving you some insights how to do that manual export.
+
+14:27.120 --> 14:30.559
+For example, you can do a LaTeX export.
+
+14:30.560 --> 14:34.279
+Let me write down the key combination to do that here.
+
+14:34.280 --> 14:44.560
+So you press `SPC m e l o`.
+
+14:44.600 --> 14:49.159
+Let me show you how this is done.
+
+14:49.160 --> 14:51.439
+So I'm pressing `SPC`.
+
+14:51.440 --> 14:55.679
+I'm pressing `m`, which is my local leader.
+
+14:55.680 --> 15:01.279
+I'm pressing `e`, which is now the `org-export-dispatch`.
+
+15:01.280 --> 15:03.519
+And now I have different options I can choose from.
+
+15:03.520 --> 15:07.119
+I want to do a LaTeX export because I want to get in PDF.
+
+15:07.120 --> 00:15:08.674
+So I'm pressing `l`.
+
+00:15:08.675 --> 00:15:11.479
+Now I've got different options available.
+
+15:11.480 --> 15:17.399
+So I'm pressing `o` for a PDF file and open that.
+
+15:17.400 --> 15:21.119
+Let's see now the code.
+
+15:21.120 --> 15:25.639
+Now this is exporting document.
+
+15:25.640 --> 00:15:29.674
+And what we have here is PDF,
+
+00:15:29.675 --> 00:15:31.974
+which contains our workflow in the beginning,
+
+00:15:31.975 --> 00:15:35.707
+our bullet points we have here,
+
+00:15:35.708 --> 00:15:37.919
+and also the code snippet
+
+15:37.920 --> 15:41.120
+that we use for querying the data.
+
+15:41.280 --> 15:43.599
+And we have the result below that.
+
+15:43.600 --> 15:46.999
+So this is our table with all the data sets.
+
+15:47.000 --> 15:51.879
+But as you can see, this is running out of the page.
+
+15:51.880 --> 15:55.679
+So this is not very nice using the default settings.
+
+15:55.680 --> 16:00.239
+But everything is in this PDF.
+
+16:00.240 --> 16:02.759
+I guess we can now show you a way
+
+16:02.760 --> 16:06.519
+how to improve this result.
+
+16:06.520 --> 16:07.039
+[Jonathan]: Right.
+
+16:07.040 --> 16:09.399
+So we have, of course, a version of this
+
+16:09.400 --> 00:16:10.774
+that we prepared ahead of time,
+
+00:16:10.775 --> 00:16:14.279
+which is more or less identical to the one we just made,
+
+16:14.280 --> 16:17.839
+but it has a little more text, a little more explanation,
+
+16:17.840 --> 16:20.559
+a little more documentation along with the code.
+
+16:20.560 --> 16:23.879
+You can see we have some metadata up at the top,
+
+16:23.880 --> 16:26.879
+the title, the authors, a bibliography,
+
+16:26.880 --> 16:31.679
+and most importantly, the `custom-export.setup` file,
+
+16:31.680 --> 16:36.879
+which lists specifically the sort of LaTeX commands
+
+16:36.880 --> 16:43.599
+that we're using and the HTML styles that we're going to use.
+
+16:43.600 --> 16:45.919
+And then down at the bottom of this file,
+
+16:45.920 --> 16:49.119
+we have our automatic batch process.
+
+16:49.120 --> 16:51.719
+Here is one more language we're including in here.
+
+16:51.720 --> 16:53.439
+So this is Lisp.
+
+16:53.440 --> 16:57.359
+And you can see here we are exporting to HTML, ASCII,
+
+16:57.360 --> 16:58.079
+and PDF.
+
+16:58.080 --> 17:01.359
+The nice thing about this is that this is a document.
+
+17:01.360 --> 00:17:03.307
+It's a sort of document that we have a couple of
+
+00:17:03.308 --> 00:17:08.639
+that we can have running automatically and building.
+
+17:08.640 --> 17:12.919
+It will export a HTML, an ASCII file, and a PDF file
+
+17:12.920 --> 00:17:14.674
+every time it's run based off of
+
+00:17:14.675 --> 00:17:17.319
+the most recent data available on Wikidata.
+
+17:17.320 --> 17:19.719
+So it's self-documenting.
+
+17:19.720 --> 00:17:22.440
+We have, of course, our data retrieval steps,
+
+00:17:22.441 --> 00:17:25.159
+our data cleaning steps, our data preparation steps,
+
+17:25.160 --> 17:28.359
+and our preservation steps all listed at the same time.
+
+17:28.360 --> 17:30.239
+And then you can see over on the right,
+
+17:30.240 --> 17:34.320
+there's an example of the HTML file that we get out of this.
+
+17:34.360 --> 17:37.639
+We also get a very nicely formatted PDF file,
+
+17:37.640 --> 17:39.239
+which doesn't have that little issue
+
+17:39.240 --> 17:41.719
+with the overflow of the table.
+
+17:41.720 --> 17:43.559
+It's very nicely put together.
+
+17:43.560 --> 17:46.199
+And we even have an ASCII file.
+
+17:46.200 --> 17:47.879
+And I should also point out very quickly,
+
+17:47.880 --> 17:51.799
+while you have this one up, Lukas, after the awk code,
+
+17:51.800 --> 17:56.079
+you can see the text for the number of consortia,
+
+17:56.080 --> 17:57.839
+or the number of institutions per consortia
+
+17:57.840 --> 18:00.519
+is actually printed inline.
+
+18:00.520 --> 18:01.799
+[Lukas]: Yeah, you're very right.
+
+18:01.800 --> 18:06.119
+So this is what we had as code,
+
+18:06.120 --> 18:10.719
+and now this is nicely integrated into our text.
+
+18:10.720 --> 18:15.279
+So we got the consortium and number of institutions.
+
+18:15.280 --> 18:19.199
+You can't tell a difference between code and text.
+
+18:19.200 --> 18:20.719
+[Jonathan]: And those are automatically updated.
+
+18:20.720 --> 18:23.879
+So if another institution joins NFDI4Earth,
+
+18:23.880 --> 18:26.319
+then the next time this runs, we update the text right here.
+
+18:26.320 --> 18:28.519
+It's nothing we have to worry about.
+
+18:28.520 --> 18:30.400
+We just pull it directly out of Wikidata.
+
+18:31.840 --> 18:34.679
+[Lukas]: And for the sake of completeness,
+
+18:34.680 --> 18:37.879
+this is the ASCII file.
+
+18:37.880 --> 18:39.320
+That's in the export format.
+
+18:42.760 --> 18:46.440
+It contains also everything, code and data.
+
+18:48.360 --> 18:51.680
+Yeah, so this is what we wanted to show you,
+
+18:53.240 --> 18:56.639
+how to do some data processing,
+
+18:56.640 --> 18:58.679
+some collaborative work,
+
+18:58.680 --> 19:01.119
+documenting using org-babel.
+
+19:01.120 --> 19:03.960
+Thanks for listening.
+
+19:05.720 --> 19:07.280
+[Jonathan]: Thank you all, have a good day.
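
The cleaning, aggregation, and per-consortium counting steps described in the captions above can be sketched in a single language. This is a minimal sketch, not the speakers' code: it uses only the Python standard library as a stand-in for the `sed`, `pandas`, and awk cells from the talk. The sample rows and the `consortiumLabel` column name are hypothetical; only `wLabel`, the `inst-count` name, and the `NFDI4Memory` default come from the talk. It also assumes that an unlabeled record is one whose institution field is nothing but a Q code.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample in the two-column shape shown in the talk
# (institution, consortium); this is not real Wikidata output.
RAW_CSV = """wLabel,consortiumLabel
Q1234567,NFDI4Memory
RWTH Aachen University,NFDI4Earth
RWTH Aachen University,NFDI4Memory
Example University,NFDI4Memory
Q7654321,NFDI4Earth
Example Institute,NFDI4Earth
"""

# Assumption: an unlabeled record has a bare Q code (Q plus digits) as its label.
Q_CODE = re.compile(r"^Q[0-9]+$")


def clean_rows(raw_csv):
    """Drop records whose institution label is only a Q code."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [row for row in reader if not Q_CODE.match(row["wLabel"])]


def consortia_per_institution(rows):
    """Count rows per institution, highest count first."""
    return Counter(row["wLabel"] for row in rows).most_common()


def inst_count(rows, consortium="NFDI4Memory"):
    """Count the institutions listed for one consortium."""
    return sum(1 for row in rows if row["consortiumLabel"] == consortium)


rows = clean_rows(RAW_CSV)
print(consortia_per_institution(rows))
print(inst_count(rows), inst_count(rows, "NFDI4Earth"))
```

Giving `inst_count` a default argument mirrors the `:var consortium="NFDI4Memory"` header in the talk: the cell runs on its own with the default, and inline calls such as `call_inst-count` override it with another consortium name.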
diff --git a/2023/info/collab-after.md b/2023/info/collab-after.md
index 1e42d375..82cfe17e 100644
--- a/2023/info/collab-after.md
+++ b/2023/info/collab-after.md
@@ -1,6 +1,401 @@
<!-- Automatically generated by emacsconf-publish-after-page -->
+<a name="collab-mainVideo-transcript"></a>
+# Transcript
+
+[[!template new="1" text="""[Lukas]: Welcome to our presentation,""" start="00:00:00.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Collaborative Data Processing""" start="00:00:01.875" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and Documenting using org-babel.""" start="00:00:03.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""My name is Lukas Bossert, and I'm""" start="00:00:06.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""from the RWTH Aachen University""" start="00:00:07.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""in the city of Aachen, Germany.""" start="00:00:09.741" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: And my name is Jonathan Hartman.""" start="00:00:12.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm also from the IT Center here at RWTH Aachen.""" start="00:00:14.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Great.""" start="00:00:18.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And we will show you today how you""" start="00:00:19.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""can use Org Mode for data processing.""" start="00:00:21.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So you see a little workflow what we are going to do.""" start="00:00:25.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""First, we will give you a slight introduction to Org Mode.""" start="00:00:28.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Then we will dive into the part of data preparing.""" start="00:00:31.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""First, we're going to query the data using the language SPARQL.""" start="00:00:34.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Then we're going to clean it using a different language.""" start="00:00:38.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And in the main part of our presentation,""" start="00:00:41.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we're going to do the data processing, first aggregating""" start="00:00:44.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""using Python, later on counting items using awk,""" start="00:00:48.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and even visualizing it using R. At the end,""" start="00:00:52.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we're going to show you how to preserve""" start="00:00:56.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""the data and the document and its documentation,""" start="00:00:58.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""first doing in plain exporting, then adding some metadata,""" start="00:01:01.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and showing you two different ways, first a manual export,""" start="00:01:06.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and also then a batch-processed export.""" start="00:01:09.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""All right.""" start="00:01:13.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Let's dive into that.""" start="00:01:14.240" video="mainVideo-collab" id="subtitle"]]
+[[!template new="1" text="""Jonathan, can you give us an introduction about Org Mode?""" start="00:01:16.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Of course.""" start="00:01:19.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So in case anyone isn't familiar with it,""" start="00:01:20.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Org Mode, in the words of Carsten Dominik,""" start="00:01:23.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is back to the future for plain text.""" start="00:01:25.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is just a module available for Emacs,""" start="00:01:28.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""plain-text based.""" start="00:01:31.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's been around since 2003, which""" start="00:01:32.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""makes it about 20 years old.""" start="00:01:34.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And it's extensible and fully customizable.""" start="00:01:36.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And especially, it's very convenient, very good""" start="00:01:40.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""for scientific text production and organization.""" start="00:01:44.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So for example, you can do project management, agenda,""" start="00:01:46.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""diary, journaling, personal knowledge management,""" start="00:01:49.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""presentation.""" start="00:01:52.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Even this is written in Org Mode.""" start="00:01:53.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's an Org Mode presentation.""" start="00:01:55.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""You can do single source publishing,""" start="00:01:57.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which we will do later on, and also""" start="00:01:59.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""literate programming, which is the core of our talk.""" start="00:02:01.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""OK.""" start="00:02:06.480" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: So let me stop this presentation here.""" start="00:02:07.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So what you see here is the plain text underneath it.""" start="00:02:10.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is Org Mode.""" start="00:02:14.720" video="mainVideo-collab" id="subtitle"]]
+[[!template new="1" text="""And Jonathan, since we kind of already""" start="00:02:18.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""did the introduction together, should we""" start="00:02:21.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""also do the working part together?""" start="00:02:26.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Of course.""" start="00:02:28.761" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So you see on the screen there on the right,""" start="00:02:29.701" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that's my screen in Emacs.""" start="00:02:33.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And Lukas, why don't you host a session using CRDT,""" start="00:02:35.061" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and I'll connect to your buffer.""" start="00:02:39.521" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: OK. Great.""" start="00:02:41.201" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I do that.""" start="00:02:42.561" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So what I do, I'm using Doom Emacs.""" start="00:02:43.281" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And I can use the `SPC` and then the `l`""" start="00:02:46.181" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""for the live share/collab part.""" start="00:02:49.308" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I can use the `s` for share current buffer.""" start="00:02:52.141" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So when I do this, I'm getting asked for some settings.""" start="00:02:58.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm going with the default settings here.""" start="00:03:01.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So default port, no password, and my display name.""" start="00:03:04.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And now Emacs is connecting.""" start="00:03:08.341" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And once it's connected, which just takes a couple of seconds,""" start="00:03:11.941" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I can get the URL.""" start="00:03:15.180" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm going back to this menu and using `y`""" start="00:03:17.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""for copying the URL of the current session.""" start="00:03:21.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And this is the URL I'm going to send over to you, Jonathan,""" start="00:03:24.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""to pick that up.""" start="00:03:27.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Right.""" start="00:03:29.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""OK.""" start="00:03:29.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And now on my screen, I'm going to do a `SPC l c` for connect.""" start="00:03:30.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And I'm going to paste the URL""" start="00:03:37.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that Lukas just sent me in here.""" start="00:03:38.741" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Default port, no password.""" start="00:03:40.980" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And we're connecting now.""" start="00:03:43.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this takes a second just to get us synced up.""" start="00:03:45.700" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we can work on the same document at the same time.""" start="00:03:51.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We can follow each other's cursors around.""" start="00:03:54.161" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We can have multiple buffers open and work on them""" start="00:03:56.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""at the same time.""" start="00:03:58.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And so here you see that we are both in the same document.""" start="00:04:01.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""You can see my cursor popping around.""" start="00:04:04.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And you can see we're both editing the same item.""" start="00:04:09.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Great.""" start="00:04:13.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: So we also see who else is currently in our buffer""" start="00:04:14.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""with the user overview.""" start="00:04:18.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So let me just delete that window.""" start="00:04:20.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And let's go to work in our main one.""" start="00:04:23.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we said first part is about data retrieval.""" start="00:04:26.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we should give it a headline.""" start="00:04:29.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We said prepare stage.""" start="00:04:37.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So what are we going to do first, Jonathan?""" start="00:04:39.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: So what we're going to do,""" start="00:04:42.320" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""what this whole document is based upon,""" start="00:04:43.941" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is we're going to pull data from Wikidata using a SPARQL query.""" start="00:04:45.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""The data we're going to pull is related to the NFDIs,""" start="00:04:50.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which here in Germany is the Nationale Forschungsdateninfrastruktur,""" start="00:04:53.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is a sort of collection of universities""" start="00:04:55.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that work together on various research projects.""" start="00:05:00.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And this is emblematic of the kind of data""" start="00:05:03.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we would be interested in working with here.""" start="00:05:05.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm going to paste a--forgive the pre-written code--""" start="00:05:09.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm going to paste some text in here.""" start="00:05:13.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: And while you are talking, I just""" start="00:05:20.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""keep on documenting what we do""" start="00:05:21.408" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""so we can split the work.""" start="00:05:23.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: In here, after a minor technical upset,""" start="00:05:27.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is the raw dataset cell.""" start="00:05:29.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And it's going to use SPARQL,""" start="00:05:32.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is how we have the syntax highlighting""" start="00:05:34.741" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""in our code here.""" start="00:05:37.175" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's going to go to the URL endpoint""" start="00:05:37.941" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""query.wikidata.org/sparql ,""" start="00:05:40.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and it's going to return the data as `text/csv`,""" start="00:05:43.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and it's going to cache that data""" start="00:05:46.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""so that we don't constantly hammer the API every time""" start="00:05:49.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we run this notebook.""" start="00:05:51.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm going to run that there.""" start="00:05:54.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""You can see down at the bottom of my screen,""" start="00:05:57.361" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we're contacting the host query.wikidata.org .""" start="00:05:58.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: And there's the result.""" start="00:06:05.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Yeah, except I think that for our purposes here,""" start="00:06:07.320" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we're just going to limit this to 50 results.""" start="00:06:11.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Oh, yeah.""" start="00:06:15.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Just so it's a little easier for us to manage.""" start="00:06:16.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm going to run that again.""" start="00:06:18.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""There we go.""" start="00:06:20.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""That looks a little better.""" start="00:06:21.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: I think that's fine.""" start="00:06:22.320" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""50 items is fine.""" start="00:06:23.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So what do we see here, Jonathan?""" start="00:06:25.360" video="mainVideo-collab" id="subtitle"]]
+[[!template new="1" text="""[Jonathan]: Right.""" start="00:06:27.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So the first thing we see when we look at this""" start="00:06:28.320" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is a couple of Q codes at the top,""" start="00:06:31.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which are an artifact of Wikidata.""" start="00:06:33.308" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So these are pages which don't have""" start="00:06:36.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""the label for whichever institution they happen to be.""" start="00:06:39.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""For our purposes here, we're just going to exclude them.""" start="00:06:42.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We could just go on Wikidata and edit them ourselves.""" start="00:06:45.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But for now, it's a little more interesting""" start="00:06:48.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""if we go and remove them.""" start="00:06:50.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm going to create a new cell.""" start="00:06:52.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Lukas, if you don't mind starting one for data cleaning.""" start="00:06:55.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Oh, yeah.""" start="00:06:58.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Good point.""" start="00:06:58.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Yeah, data cleaning.""" start="00:06:59.480" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""OK.""" start="00:07:02.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""How do you want to do that, Jonathan?""" start="00:07:03.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: I'm going to use a shell command.""" start="00:07:05.500" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So let's see.""" start="00:07:09.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""There we go.""" start="00:07:11.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And so you can see, here is another cell,""" start="00:07:13.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that the cell is now using a shell,""" start="00:07:15.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and that we have this thing `:var input=raw-dataset`,""" start="00:07:20.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is the name of the cell above""" start="00:07:23.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""where we got our data from Wikidata.""" start="00:07:25.841" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""This is going to run just a simple shell command.""" start="00:07:28.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's going to take the input and then run `sed` on it""" start="00:07:31.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and exclude any records which have a Q""" start="00:07:33.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""followed by one or more digits afterwards.""" start="00:07:37.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""That should remove those from our data set.""" start="00:07:41.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm going to run that.""" start="00:07:44.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""That seems to have done the trick.""" start="00:07:48.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Great, yeah.""" start="00:07:51.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""That's really good.""" start="00:07:51.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We got rid of all the Q items.""" start="00:07:52.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Very good.""" start="00:07:55.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we just have a two-column table: institutions""" start="00:07:55.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and consortia.""" start="00:07:59.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Very nice.""" start="00:08:02.760" video="mainVideo-collab" id="subtitle"]]
+[[!template new="1" text="""So let's come to our main part, doing some processing.""" start="00:08:04.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Let me give you a headline here, process the data.""" start="00:08:08.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""What do you want to do first?""" start="00:08:13.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: This is not a very complicated data set,""" start="00:08:15.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""but let's just do some simple counts first.""" start="00:08:17.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm going to start with Python,""" start="00:08:19.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and we're just going to do some aggregation with Python.""" start="00:08:22.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Again, I've got some pre-written code here.""" start="00:08:25.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""You can see that we've started a cell using Python.""" start="00:08:30.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""The variable `clean_df` now is equal to `clean-dataset`.""" start="00:08:35.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we're going to take that data""" start="00:08:37.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we retrieved from the SPARQL query,""" start="00:08:39.708" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we're going to run it through the cleaning cell,""" start="00:08:41.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and then we're going to import it into this cell.""" start="00:08:42.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""This is just going to do some simple Python aggregation.""" start="00:08:45.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We're going to import `pandas`,""" start="00:08:47.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is the Python data science library,""" start="00:08:49.008" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""create a data frame out of our input,""" start="00:08:51.308" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and then aggregate it, grouping on `wLabel`,""" start="00:08:54.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and getting a count from that and returning it.""" start="00:08:57.480" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So if we execute that cell...""" start="00:08:59.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Nice, we get institutions and a count.""" start="00:09:05.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But what about not ordering it by the alphabet,""" start="00:09:08.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""but more like ordering by counts?""" start="00:09:14.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Sure.""" start="00:09:17.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So let's do this... `sort_values()`, I think, is the Python.""" start="00:09:18.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""How does that look?""" start="00:09:22.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Better, but I would like to""" start="00:09:24.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""have the highest number first""" start="00:09:27.641" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and then ascending.""" start="00:09:29.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Well, not ascending, descending.""" start="00:09:32.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Right, so we can do `ascending=False`.""" start="00:09:34.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: This is perfect, I'd say.""" start="00:09:39.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Great.""" start="00:09:42.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Very good.""" start="00:09:43.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""OK, that's nice.""" start="00:09:44.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We get a good overview here.""" start="00:09:46.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But can we also do something else,""" start="00:09:48.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""like counting how many institutions are""" start="00:09:50.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""involved in one consortium?""" start="00:09:56.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And also using this later on in the text?""" start="00:09:57.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Sure, so I'm going to put a new...""" start="00:10:00.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""If you give me another heading down here""" start="00:10:00.881" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""for institutions per consortium...""" start="00:10:05.041" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And here we're going to use awk code just to spice things up""" start="00:10:12.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and add yet another language in here.""" start="00:10:16.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So you can see this is awk.""" start="00:10:18.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We're using standard input instead of defining a variable.""" start="00:10:22.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But the really interesting thing about this cell""" start="00:10:26.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is that we have this `:var consortium=&quot;NFDI4Memory&quot;`.""" start="00:10:28.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And what this code is doing is""" start="00:10:33.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""it's counting any time it sees""" start="00:10:35.641" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that particular consortium name""" start="00:10:38.041" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and keeping track of that.""" start="00:10:40.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So if we execute this,""" start="00:10:41.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Lukas, why don't you execute this one?""" start="00:10:43.908" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: OK, I'm going to enter it.""" start="00:10:45.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And I get a result, NFDI4Memory,""" start="00:10:49.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""because this is our default value for this variable.""" start="00:10:52.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And we get the count.""" start="00:10:58.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So five institutions are involved""" start="00:10:59.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""in the NFDI4Memory consortium.""" start="00:11:01.641" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Great, but the very nice thing, I think,""" start="00:11:04.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is that we can use this code snippet within our text.""" start="00:11:07.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So, blended in seamlessly.""" start="00:11:12.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Let me give you an example.""" start="00:11:14.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm writing out the text.""" start="00:11:16.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Now we know how many institutions are in...""" start="00:11:18.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Give me an example.""" start="00:11:27.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I would like to know how many institutions are""" start="00:11:29.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""involved in NFDI4Objects, which is a consortium.""" start="00:11:31.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm writing `call_` and using""" start="00:11:35.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""the name of this snippet here, of this cell,""" start="00:11:39.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is `inst-count(`,""" start="00:11:42.608" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and writing my value, `NFDI4Objects`.""" start="00:11:46.608" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""As soon as I evaluate this using `C-c C-c`,""" start="00:11:51.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I get the result back here.""" start="00:11:58.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I can do this even for more.""" start="00:12:00.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm writing `call_inst-count`, going with `NFDI4Earth`,""" start="00:12:05.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is another consortium.""" start="00:12:14.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""`C-c C-c`, it's three institutions.""" start="00:12:16.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""This can be used throughout your text,""" start="00:12:20.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and as soon as the data set from the beginning changes,""" start="00:12:23.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""maybe with different results when querying Wikidata,""" start="00:12:26.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""this also will be updated once it's exported.""" start="00:12:30.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Very nice, Jonathan.""" start="00:12:35.080" video="mainVideo-collab" id="subtitle"]]
+[[!template new="1" text="""But I think we did a lot of analysis""" start="00:12:36.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""on text and counting things.""" start="00:12:38.975" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Can we also do something more visual?""" start="00:12:41.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Show me something.""" start="00:12:43.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Sure.""" start="00:12:45.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So what we can do with this, because we just""" start="00:12:45.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""have two columns here that are sort of related,""" start="00:12:48.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we can build a little network plot out of it.""" start="00:12:51.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So let's make a network visualization.""" start="00:12:53.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We're going to use the `igraph` library from R""" start="00:12:57.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and just plot the edges that we see here.""" start="00:12:59.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""There we go.""" start="00:13:02.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""There's my little heading and space.""" start="00:13:04.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Here is our code.""" start="00:13:11.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Again, just to be fancy and keep using""" start="00:13:13.480" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""different languages in here, we set a variable called""" start="00:13:16.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""`NFDI_edges` equal to `clean-dataset`.""" start="00:13:19.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this, again, is sort of cascading""" start="00:13:21.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""through the original data""" start="00:13:23.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we pulled from the Wikidata endpoint,""" start="00:13:25.741" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""cleaning that data, and now it's being inserted""" start="00:13:28.808" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""into this cell as well.""" start="00:13:30.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But you see the difference here.""" start="00:13:32.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Instead of exporting a table, what we're saying""" start="00:13:34.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is that there will be a graphics file,""" start="00:13:36.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and it will be called network-plot.png.""" start="00:13:39.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""All right.""" start="00:13:44.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And so Lukas, why don't you execute this one?""" start="00:13:45.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: There you go.""" start="00:13:47.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I can click `C-c C-c`""" start="00:13:48.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and I get a nice plot of the network below our cell.""" start="00:13:52.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is very nice indeed.""" start="00:13:59.160" video="mainVideo-collab" id="subtitle"]]
+[[!template new="1" text="""So I think it's about time to wrap it up and to export""" start="00:14:01.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and to preserve the data and the documentation""" start="00:14:05.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we have in our very last step, calling preserve.""" start="00:14:07.960" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I would like to do it in two steps.""" start="00:14:13.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""First, maybe manually exporting it,""" start="00:14:16.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""but then also doing it in a batch process.""" start="00:14:18.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Giving you some insight into how to do that manual export.""" start="00:14:22.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""For example, you can do a LaTeX export.""" start="00:14:27.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Let me write down the key combination to do that here.""" start="00:14:30.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So you press `SPC m e l o`.""" start="00:14:34.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Let me show you how this is done.""" start="00:14:44.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm pressing `SPC`.""" start="00:14:49.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm pressing `m`, which is my local leader.""" start="00:14:51.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I'm pressing `e`, which is now the `org-export-dispatch`.""" start="00:14:55.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And now I have different options I can choose from.""" start="00:15:01.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I want to do a LaTeX export because I want to get a PDF.""" start="00:15:03.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm pressing `l`.""" start="00:15:07.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Now I've got different options available.""" start="00:15:08.675" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So I'm pressing `o` for a PDF file and open that.""" start="00:15:11.480" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Let's see now the code.""" start="00:15:17.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Now this is exporting the document.""" start="00:15:21.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And what we have here is a PDF,""" start="00:15:25.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which contains our workflow in the beginning,""" start="00:15:29.675" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""our bullet points we have here,""" start="00:15:31.975" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and also the code snippet""" start="00:15:35.708" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we use for querying the data.""" start="00:15:37.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And we have the result below that.""" start="00:15:41.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is our table with all the data sets.""" start="00:15:43.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But as you can see, this is running out of the page.""" start="00:15:47.000" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is not very nice using the default settings.""" start="00:15:51.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""But everything is in this PDF.""" start="00:15:55.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""I guess we can now show you a way""" start="00:16:00.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""how to improve this result.""" start="00:16:02.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Right.""" start="00:16:06.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we have, of course, a version of this""" start="00:16:07.040" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we prepared ahead of time,""" start="00:16:09.400" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which is more or less identical to the one we just made,""" start="00:16:10.775" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""but it has a little more text, a little more explanation,""" start="00:16:14.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""a little more documentation along with the code.""" start="00:16:17.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""You can see we have some metadata up at the top,""" start="00:16:20.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""the title, the authors, a bibliography,""" start="00:16:23.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and most importantly, the `custom-export.setup` file,""" start="00:16:26.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which lists specifically the sort of LaTeX commands""" start="00:16:31.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we're using and the HTML styles that we're going to use.""" start="00:16:36.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And then down at the bottom of this file,""" start="00:16:43.600" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""we have our automatic batch process.""" start="00:16:45.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Here is one more language we're including in here.""" start="00:16:49.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is Lisp.""" start="00:16:51.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And you can see here we are exporting to HTML, ASCII,""" start="00:16:53.440" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and PDF.""" start="00:16:57.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""The nice thing about this is that this is a document.""" start="00:16:58.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's a sort of document that we have a couple of""" start="00:17:01.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""that we can have running automatically and building.""" start="00:17:03.308" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It will export an HTML, an ASCII file, and a PDF file""" start="00:17:08.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""every time it's run based off of""" start="00:17:12.920" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""the most recent data available on Wikidata.""" start="00:17:14.675" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So it's self-documenting.""" start="00:17:17.320" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We have, of course, our data retrieval steps,""" start="00:17:19.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""our data cleaning steps, our data preparation steps,""" start="00:17:22.441" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and our preservation steps all listed at the same time.""" start="00:17:25.160" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And then you can see over on the right,""" start="00:17:28.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""there's an example of the HTML file that we get out of this.""" start="00:17:30.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We also get a very nicely formatted PDF file,""" start="00:17:34.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""which doesn't have that little issue""" start="00:17:37.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""with the overflow of the table.""" start="00:17:39.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's very nicely put together.""" start="00:17:41.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And we even have an ASCII file.""" start="00:17:43.560" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""And I should also point out very quickly,""" start="00:17:46.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""while you have this one up, Lukas, after the awk code,""" start="00:17:47.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""you can see the text for the number of consortia,""" start="00:17:51.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""or the number of institutions per consortia""" start="00:17:56.080" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""is actually printed inline.""" start="00:17:57.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: Yeah, you're very right.""" start="00:18:00.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So this is what we had as code,""" start="00:18:01.800" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""and now this is nicely integrated into our text.""" start="00:18:06.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So we got the consortium and number of institutions.""" start="00:18:10.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""You can't tell a difference between code and text.""" start="00:18:15.280" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: And those are automatically updated.""" start="00:18:19.200" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""So if another institution joins NFDI4Earth,""" start="00:18:20.720" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""then the next time this runs, we update the text right here.""" start="00:18:23.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It's nothing we have to worry about.""" start="00:18:26.320" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""We just pull it directly out of Wikidata.""" start="00:18:28.520" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Lukas]: And for the sake of completeness,""" start="00:18:31.840" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""this is the ASCII file.""" start="00:18:34.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""That's in the export format.""" start="00:18:37.880" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""It contains also everything, code and data.""" start="00:18:42.760" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Yeah, so this is what we wanted to show you,""" start="00:18:48.360" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""how to do some data processing,""" start="00:18:53.240" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""some collaborative work,""" start="00:18:56.640" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""documenting using org-babel.""" start="00:18:58.680" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""Thanks for listening.""" start="00:19:01.120" video="mainVideo-collab" id="subtitle"]]
+[[!template text="""[Jonathan]: Thank you all, have a good day.""" start="00:19:05.720" video="mainVideo-collab" id="subtitle"]]
+
+
+
+Captioner: amine
+
Questions or comments? Please e-mail [hartman@itc.rwth-aachen.de, bossert@itc.rwth-aachen.de](mailto:hartman@itc.rwth-aachen.de, bossert@itc.rwth-aachen.de?subject=Comment%20for%20EmacsConf%202023%20collab%3A%20Collaborative%20data%20processing%20and%20documenting%20using%20org-babel)
diff --git a/2023/info/collab-before.md b/2023/info/collab-before.md
index 08e80937..75284619 100644
--- a/2023/info/collab-before.md
+++ b/2023/info/collab-before.md
@@ -8,12 +8,21 @@ The following image shows where the talk is in the schedule for Sat 2023-12-02.
Format: 20-min talk; Q&A: ask questions via Etherpad/IRC; we'll e-mail the speaker and post answers on this wiki page after the conference
Etherpad: <https://pad.emacsconf.org/2023-collab>
Discuss on IRC: [#emacsconf-gen](https://chat.emacsconf.org/?join=emacsconf,emacsconf-gen)
-Status: Ready to stream
+Status: Now playing on the conference livestream
<div>Times in different timezones:</div><div class="times" start="2023-12-02T18:50:00Z" end="2023-12-02T19:10:00Z"><div class="conf-time">Saturday, Dec 2 2023, ~1:50 PM - 2:10 PM EST (US/Eastern)</div><div class="others"><div>which is the same as:</div>Saturday, Dec 2 2023, ~12:50 PM - 1:10 PM CST (US/Central)<br />Saturday, Dec 2 2023, ~11:50 AM - 12:10 PM MST (US/Mountain)<br />Saturday, Dec 2 2023, ~10:50 AM - 11:10 AM PST (US/Pacific)<br />Saturday, Dec 2 2023, ~6:50 PM - 7:10 PM UTC <br />Saturday, Dec 2 2023, ~7:50 PM - 8:10 PM CET (Europe/Paris)<br />Saturday, Dec 2 2023, ~8:50 PM - 9:10 PM EET (Europe/Athens)<br />Sunday, Dec 3 2023, ~12:20 AM - 12:40 AM IST (Asia/Kolkata)<br />Sunday, Dec 3 2023, ~2:50 AM - 3:10 AM +08 (Asia/Singapore)<br />Sunday, Dec 3 2023, ~3:50 AM - 4:10 AM JST (Asia/Tokyo)</div></div><div><a href="/2023/watch/gen/">Find out how to watch and participate</a></div>
+<div class="vid"><video controls preload="none" id="collab-mainVideo"><source src="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.webm" /><track label="English" kind="captions" srclang="en" src="/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.vtt" default /><track kind="chapters" label="Chapters" src="/2023/captions/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main--chapters.vtt" /><p><em>Your browser does not support the video tag. Please download the video instead.</em></p></video>[[!template id="chapters" vidid="collab-mainVideo" data="""
+00:00.000 Introduction
+01:16.080 Org Mode
+02:18.960 Working together
+06:27.840 Data cleaning
+08:04.040 Processing
+12:36.040 Visualization
+14:01.760 Preserve
+"""]]<div></div>Duration: 19:16 minutes<div class="files resources"><ul><li><a href="https://pad.emacsconf.org/2023-collab">Open Etherpad</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--final.webm">Download --final.webm (62MB)</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--intro.vtt">Download --intro.vtt</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--intro.webm">Download --intro.webm</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main--chapters.vtt">Download --main--chapters.vtt</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.opus">Download --main.opus (11MB)</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.txt">Download --main.txt</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.vtt">Download --main.vtt</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--main.webm">Download --main.webm (62MB)</a></li><li><a 
href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--normalized.opus">Download --normalized.opus (17MB)</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--original.mp4">Download --original.mp4 (460MB)</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--reencoded.webm">Download --reencoded.webm (56MB)</a></li><li><a href="https://media.emacsconf.org/2023/emacsconf-2023-collab--collaborative-data-processing-and-documenting-using-orgbabel--jonathan-hartman-lukas-c-bossert--room-noise.webm">Download --room-noise.webm</a></li><li><a href="https://toobnix.org/w/7AAwoawr5MXNSrqiHJQoak">View on Toobnix</a></li></ul></div></div>
# Description
<!-- End of emacsconf-publish-before-page --> \ No newline at end of file