[[!meta title="Collaborative data processing and documenting using org-babel"]]
[[!meta copyright="Copyright © 2023 Jonathan Hartman, Lukas C. Bossert"]]
[[!inline pages="internal(2023/info/collab-nav)" raw="yes"]]
<!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
<!-- You can manually edit this file to update the abstract, add links, etc. --->
# Collaborative data processing and documenting using org-babel
Jonathan Hartman (he/him), Lukas C. Bossert (he/him) - <https://mastodon.social/@lukascbossert>, <mailto:hartman@itc.rwth-aachen.de>, <mailto:bossert@itc.rwth-aachen.de>
[[!inline pages="internal(2023/info/collab-before)" raw="yes"]]
In this presentation we show an efficient way of combining
information and enriching it by retrieving data, processing it, and
finally exporting it, all with org-mode. We demonstrate not only
org-mode itself, but also a few companion libraries that add
functionality such as knowledge graph visualization, literate
programming, and collaborative editing, so that we can quickly create
a deeply informative reference page.
The starting point of our best-practice example is the National
Research Data Infrastructure Germany (NFDI), about which we intend to
retrieve and process information gathered from Wikidata. For this, we
additionally leverage the “org-roam” Emacs package, which provides
functionality for quickly and simply linking together notes and ideas
into a custom knowledge graph. Initially, we will write a short
abstract about the NFDI and embed it into our existing knowledge graph
by linking it to other existing nodes. The visualized graph (using the
“org-roam-ui” package) then reveals links and secondary connections to
other existing nodes.
Next, we would like to enrich the text about the NFDI with data
retrieved from the Wikidata API. A convenient way of creating
self-documenting code is the approach called “literate programming”,
which presents program logic embedded within human language text. In
Emacs we achieve this by using the “org-babel” package. At this point
it is helpful to collaborate with a colleague on the document:
while one is writing the code, the other can explain its use and
interpret the results. We will do this simultaneously in the same
document using a method called “crdt” (conflict-free replicated data
type) and – of course – there is also an implementation of this in
Emacs. The results of the code blocks can be used for further analysis
and shared throughout the same document.
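
As a rough illustration of the retrieval step, the following Python
sketch shows what such a code block might look like. It is not the
code from the talk: the abstract does not say which Wikidata interface
or query is used, so the public SPARQL endpoint, the "part of"
property (P361), and the function names `build_query`,
`parse_bindings`, and `fetch_parts` are all assumptions made for this
example. The item ID is passed in as a parameter rather than
hard-coded.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata SPARQL endpoint is used for retrieval.
WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"


def build_query(parent_qid: str, limit: int = 50) -> str:
    """Return a SPARQL query listing items that are 'part of' (P361) parent_qid."""
    if not (parent_qid.startswith("Q") and parent_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {parent_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P361 wd:{parent_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def parse_bindings(payload: dict) -> list:
    """Flatten a SPARQL JSON result into [label, uri] rows."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append([binding["itemLabel"]["value"], binding["item"]["value"]])
    return rows


def fetch_parts(parent_qid: str) -> list:
    """Run the query against Wikidata (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_query(parent_qid), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL_URL}?{params}",
        # A descriptive User-Agent is good etiquette; this value is a placeholder.
        headers={"User-Agent": "org-babel-demo/0.1 (example sketch)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(json.load(response))
```

Inside an org-babel source block, returning the list of rows from
`fetch_parts` with the item ID of interest would typically be rendered
as an Org table, which later blocks in the same document can then
reference by name.
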
Finally, for the sake of proper and barrier-free documentation, we
show how to export the document to various formats such as PDF, HTML,
and plain text, using either the built-in export features of org-mode
or pandoc.
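
The two export routes can be sketched as command lines. The Python
helpers below only build the argument lists and do not run anything;
the helper names and the file name `notes.org` in the usage note are
hypothetical, and the exact invocations used in the talk are not given
in this abstract. PDF output additionally assumes a working LaTeX
installation for both routes.

```python
from pathlib import Path

# File extensions for the three formats mentioned in the abstract.
EXTENSIONS = {"html": "html", "pdf": "pdf", "plain": "txt"}

# Org's built-in export commands for the same formats.
ORG_EXPORTERS = {
    "html": "org-html-export-to-html",
    "pdf": "org-latex-export-to-pdf",
    "plain": "org-ascii-export-to-ascii",
}


def pandoc_command(org_file: str, target_format: str) -> list:
    """Build a pandoc call that converts an Org file to target_format."""
    if target_format not in EXTENSIONS:
        raise ValueError(f"unsupported format: {target_format!r}")
    output = Path(org_file).with_suffix("." + EXTENSIONS[target_format])
    command = ["pandoc", "--from", "org", "--standalone", org_file,
               "--output", str(output)]
    # pandoc selects PDF from the output extension; other formats are named.
    if target_format != "pdf":
        command += ["--to", target_format]
    return command


def emacs_command(org_file: str, target_format: str) -> list:
    """Build a batch-mode Emacs call using Org's built-in exporters."""
    if target_format not in ORG_EXPORTERS:
        raise ValueError(f"unsupported format: {target_format!r}")
    return ["emacs", "--batch", org_file, "-f", ORG_EXPORTERS[target_format]]
```

For example, `pandoc_command("notes.org", "html")` yields the
arguments for `pandoc --from org --standalone notes.org --output
notes.html --to html`, which could be passed to `subprocess.run`.
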
About the speakers:
**Jonathan Hartman** is a trained data scientist and works at the IT
Center of the RWTH Aachen University, Germany.
**Lukas C. Bossert** is a trained classical archaeologist and deputy
head of the department "research process and data management" at the
IT Center of the RWTH Aachen University.
Lukas, an intermediate Emacs user, is currently exploring how to
optimize his daily workflow by leveraging various Emacs packages.
Jonathan, on the other hand, is a relative newcomer to this
environment and is encountering the common pitfalls faced by
beginners. Together, they explore the capabilities and functionality
of org-mode, discovering how it can enhance data management and
presentation in their research processes.
[[!inline pages="internal(2023/info/collab-after)" raw="yes"]]
[[!inline pages="internal(2023/info/collab-nav)" raw="yes"]]