1 files changed, 195 insertions, 0 deletions
diff --git a/2023/talks/collab.md b/2023/talks/collab.md
new file mode 100644
index 00000000..1a88809b
--- /dev/null
+++ b/2023/talks/collab.md
@@ -0,0 +1,195 @@
+[[!meta title="Collaborative data processing and documenting using org-babel"]]
+[[!meta copyright="Copyright &copy; 2023 Jonathan Hartman, Lukas C. Bossert"]]
+[[!inline pages="internal(2023/info/collab-nav)" raw="yes"]]
+
+<!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
+<!-- You can manually edit this file to update the abstract, add links, etc. --->
+
+
+# Collaborative data processing and documenting using org-babel
+Jonathan Hartman (he/him), Lukas C. Bossert (he/him) - <https://mastodon.social/@lukascbossert>, <mailto:hartman@itc.rwth-aachen.de>, <mailto:bossert@itc.rwth-aachen.de>
+
+[[!inline pages="internal(2023/info/collab-before)" raw="yes"]]
+
+In our presentation we will show an efficient way of combining
+information and enriching it by retrieving data, processing it, and
+finally exporting it, all with org-mode. Along the way, we will
+demonstrate not only org-mode, but also a few companion libraries that
+add functionality such as knowledge graph visualizations, literate
+programming, and collaborative editing to quickly create a deeply
+informative reference page.
+
+The starting point of our best practice is the National Research Data
+Infrastructure Germany (NFDI), about which we intend to retrieve and
+process certain data gathered from Wikidata. For this, we
+are additionally leveraging the "org-roam" Emacs package, which
+provides functionality for quickly and simply linking together notes
+and ideas into a custom knowledge graph. Initially, we will write a
+short abstract about the NFDI and embed it into our existing knowledge
+graph by linking it to other existing nodes. In the visualized graph
+(using the “org-roam-ui” package), links and secondary connections to
+other existing nodes can now be revealed.
+
+Next, we would like to enrich the text about the NFDI with data
+retrieved from the Wikidata API. A convenient way of creating
+self-documenting code is the approach called “literate programming”,
+which presents program logic embedded within human language text. In
+Emacs we achieve this by using the “org-babel” package. Perhaps now we
+find it is helpful to collaborate with a colleague in the document:
+while one is writing the code, the other can explain its use and
+interpret the results. We will do this simultaneously in the same
+document using a method called “crdt” (conflict-free replicated data
+type) and – of course – there is also an implementation of this in
+Emacs. The results of the code blocks can be used for further analysis
+and shared throughout the same document.
+
+Finally, for the sake of proper and barrier-free documentation, we
+show how to export the document to various formats such as PDF, HTML,
+and plain text, using either the built-in export feature of org-mode
+or pandoc.
+
+About the speakers:
+
+**Jonathan Hartman** is a trained data scientist and works at the IT 
+Center of the RWTH Aachen University, Germany.
+
+**Lukas C. Bossert** is a trained classical archaeologist and is deputy
+head of the department "research process and data management" at the
+IT Center of the RWTH.
+
+Lukas, an intermediate Emacs user, is currently exploring how to
+optimize his daily workflow by leveraging various Emacs packages. On
+the other hand, Jonathan is a relative newcomer to this environment,
+encountering common pitfalls faced by beginners. Together, they
+explore the capabilities and functionalities of org-mode, discovering
+how it can enhance data management and presentation in their research
+processes.
+
+[[!img /i/emacsconf-2023-collab-sponsorship.png alt="Lukas and Jonathan are financed by the DKZ.2R Datenkompetenzkolleg Rhein-Ruhr (16DKZ2030E), www.dks2r.de"]]
+
+# Discussion
+
+## Questions and answers
+
+-   Q: How reliably does it resolve conflicts? I mean, for my personal
+    use case, for example, Syncthing sometimes doesn't work
+    perfectly and I have to edit manually. How robust is it compared
+    to Syncthing?
+    -   A (Lukas): We also sometimes faced issues where letters got
+        mixed up. We couldn't figure out what caused it and it was not
+        reproducible. I cannot compare it to Syncthing; I have never used that
+        with emacs/org-mode.
+-   Q: How is the security for this kind of thing? I mean, if we adopt
+    this in our PAD, can it execute
+    arbitrary (elisp) code on other people's computers? (Think like
+    an adversary!)
+    -   A: (Lukas) As far as we saw, the code is executed on the local
+        computer; see the part with the R code in our video.
+    -   (zaeph) We had plans with qhong (maintainer of crdt.el) to
+        tunnel the connection via SSL, but we were blocked by the SSL
+        library that shipped with Emacs, sadly.  However, we did create
+        a security policy that allowed restrictions on the execution of
+        Elisp code. (great!)
+-   Q: Really nice talk and demo!  You guys clearly rehearsed :).  I
+    always wonder with serial data processing sequencing like this, to
+    what degree do the intermediate outputs need to appear inline in the
+    text?  Suppose you had 50,000 or one million rows from your initial
+    wikidata (or similar) call.  How would you handle that size of data
+    using a collaborative, literate approach like this?
+    -   A: (Lukas) Good question. In your local buffer there is no
+        difference, and for the collaborative partner I cannot tell. We
+        tested it with 50 items because that was enough for
+        our demonstration.
+    -   noweb allows getting results of evaluation without having to put
+        the actual data into the Org buffer - just arrange the original
+        block generating the data to have :results silent. Basically,
+        :var foo=block-name does not require "block-name" to be
+        evaluated in advance - it will be evaluated as necessary. AFAIU,
+        in the talk, it is re-evaluated every time (to avoid that, one
+        would need :cache yes).
+        -   This has tremendous utility
+    -   So it would be stored on disk and referenced by name in a
+        subsequent block?  Sounds useful.  
+        -   Not on disk - just cached within a single session. To store
+            on disk, you need to save to an actual file.
+-   Q: How do you handle the viewing of larger or really any tabular
+    data in Emacs/Org when you want to inspect it, like the nice way
+    tabular data is displayed inline in Rmarkdown/RStudio?
+    -   A: (Lukas) I have no particular way of doing this. 
+    -   What about pandas data summary functionality? Can be a simple
+        Python block.
+    -   Lukas: Jonathan is our Python expert; he might answer this
+        question.
+    -   A: (Jonathan) If I follow, you can certainly just use
+        DataFrame.describe() or Series.describe() to get summary
+        statistics for a dataset - the return value would be a Series or
+        a DataFrame, which would be displayed similarly to how we show
+        things here. Alternatively, DataFrame.head(n) or
+        DataFrame.sample(n) would return a DataFrame of the first n / n
+        random rows of a dataset, and might be a way of providing the
+        gist of a very large dataset without printing the entire table
+        in the document.
+    -   Would be nice to have a "summarized table" functionality in
+        Org, that includes an abridged copy of a long table inline, but
+        you can open it in another buffer to browse/edit the full table
+        (à la block edit).
+        -   Feel free to post a feature request - see
+            <https://orgmode.org/manual/Feedback.html#Feedback>
+-   Q: I'm thinking about an application for a single user, but on
+    different platforms. In a simple case, for example, you have a
+    buffer on your local computer, and you also want to have some files
+    on your pad or on your phone, and you can use this CRDT concept to
+    make sure that there's not too much conflict between different
+    editing sessions. Do you think this is a good idea? I mean, compared
+    to purely relying on Syncthing, which sometimes I feel is unreliable
+    for resolving those conflicts.
+    -   A: (Lukas) This sounds very interesting and could be beneficial for
+        continuously working on things.
+
+## Notes
+
+-   I like the way you highlight the point you are talking about in real
+    time.
+-   Conflict-free Replicated Data Types (CRDT) ::
+    <https://github.com/emacs-straight/crdt>
+-   !This is the future of PAD for our conference.
+-   Just came here to say watching two users editing the same buffer
+    simultaneously is BLOWING MY MIND 
+    -   BLOWING MY MIND  +2
+    -   blowing my mind, too ...
+    -   WOW
+-   Gitlab custom-export.setup
+    -   What about it?
+        -   I am looking for that setup file and want to try it :) 
+            -->
+            <https://git.rwth-aachen.de/dl/workshops/collaborative-coding-with-emacs/-/blob/main/emacs/custom-export.setup>
+            -   Thank you!
+-   Truly one of the most impressive talks of the day. Congrats! Very
+    inspiring
+    -   Yes, indeed. 
+    -   (Lukas) Wow! Thank you. We weren't sure if this was worth showing
+        at EmacsConf because there have already been plenty of talks
+        about literate programming and org-babel...
+        -   Great collaborative conversation and step-wise example
+            creates a different (and impactful) framing.  Thank you!
+- crdt is fantastic; pity that most (all but one) of my collaborators use Word & VS Code. 🙁
+- that's really cool.  One of the parts that's a bit hidden from the user is seeing the format that the data is in inside the shell script
+- it is whatever constitutes the closest equivalent of table in sh (array)
+	  - yeah, you have to keep the representation in mind when filtering it as text through sed
+- this demo is so cool :D
+- Really, really impressive I have to admit
+- HA. you cannot evaluate in place so seamlessly in that way with Rmarkdown :). And you cannot combine named blocks in this way either. Wish more folks used emacs.
+- wow, so `#+CALL` can be embedded in text via `call_()`? TIL
+- such a slick presentation, I like the CRDT collaboration angle, looks like an end-game UX
+- Impressive workflow!
+- great presentation!
+- For those of you who remember the bad old days before "reproducible research," that talk is even more impressive. Great job!
+  - i was prolly not there in the bad old days, but imho reproducible research is a pressing, current problem.
+- I feel like that talk video should be shared on Hacker News
+
+
+[[!inline pages="internal(2023/info/collab-after)" raw="yes"]]
+
+[[!inline pages="internal(2023/info/collab-nav)" raw="yes"]]
+
+