[[!meta title="Unlocking linked data: replacing specialized apps with an Org-based semantic wiki"]]
[[!meta copyright="Copyright © 2024 Abhinav Tushar"]]
[[!inline pages="internal(2024/info/links-nav)" raw="yes"]]

# Unlocking linked data: replacing specialized apps with an Org-based semantic wiki
Abhinav Tushar (he/him) - abhinav@lepisma.xyz, https://lepisma.xyz, @lepisma@mathstodon.xyz

[[!inline pages="internal(2024/info/links-before)" raw="yes"]]

I try to maintain a lot of personal information, annotations, etc. in Org files, but I have historically switched back to purpose-built apps for different kinds of data. There are recipe managers for recipes, personal CRM tools for people-related notes, bookmark managers for web links, etc. While these apps do well with the kind of data they work on, they don't operate well together, in the sense that they don't treat *links* between entities as first-class citizens. I believe this gap is where a lot of *personal information* lives. As an example, consider the chain of links that says 'person a' gave me 'this recipe' on 'my anniversary'.

After using Zettelkasten via Org-roam for some time, I came to realize the power of the links that we (can) form between data of different kinds. For me, these links offset the loss that comes with leaving specialized apps. With this, I have gone back to Org files again, but this time deriving good value from links between notes. Of course, there are plenty of other benefits to using Org files, like better longevity, portability, versioning, and developer accessibility.

In this talk, I will cover my workflow for creating and managing different kinds of notes in an Org-mode-based semantic wiki, and the link types they tend to have. I will also show my workflow outside of Emacs, where I use small tools that sit on top of Org files to deliver the missing features of niche apps (like availability on mobile devices, smart cross-data-type queries, etc.).

About the speaker:

I am a Programmer and Machine Learning Engineer, and I love working with computers primarily because of the early experiences of infinite extensibility that Emacs gave me. In this talk I will cover my journey of using Org files for notes, then leaving for specialized applications, and finally coming back to Org to unlock the benefits of linked data.

---

Another talk by this speaker:

- [EmacsConf - 2023 - talks - MatplotLLM, iterative natural language data visualization in org-babel](https://emacsconf.org/2023/talks/matplotllm/)

# Discussion

## Questions and answers

- Q: Have you thought about doing the cosine similarity and sentence transformer calculations in Elisp so you don't need a separate Python process? In my experience, having to set up and manage additional state throws people off track.
    - A: I do want to try removing the dependency, but I haven't yet done any work in that direction. Mostly the problem is that model runtimes (for transformers) are much more readily available in other languages. But if there is an ONNX runtime (or dynamic module) for Elisp, we should be able to do this.
    - Thanks, I can try writing an ONNX runtime module. This could be useful for several Emacs tasks besides semantic linking.
- Q: So far I have not used packages such as org-roam because I do not like the idea that it might become unmaintained some day. So I keep to the basic features in org for my workflow. Did you consider this aspect?
    - A: I thought about this too. But I have found the internals of org-roam simple enough that I don't think maintaining a fork would be any hassle. Anyway, it uses features already available in org-mode. The only real addition it makes is, IMO, maintaining an SQLite index.
    - Thank you for your advice. I'll take another look at org-roam. And thank you for your talk. It was quite inspiring to me.
- Q: this is very cool and seems a bit influenced by logseq, which i am trying to transition away from and on to org roam.
  have you looked into somehow embedding the contents of a "linked" node into the parent itself? this is something that i miss quite a lot from logseq, where the contents were/could be transparently embedded and made for a nicer review experience
    - A: I haven't used logseq. When you say embedding, do you mean something like document transclusion? Or something else?
    - yes, something like transclusion. quite useful for example in daily journalling where one can just dump the notes instead of figuring out a location, and then link them afterwards in the right file/node.
    - In some way, the org-roam buffer I showed displays linked nodes with nearby content. But I haven't done any work on transclusion so far.
    - This may be relevant to your question: [https://github.com/Vidianos-Giannitsis/Dotfiles/blob/master/emacs/.emacs.d/libs/zettelkasten.org#logseq-like-tagging-functionality](https://github.com/Vidianos-Giannitsis/Dotfiles/blob/master/emacs/.emacs.d/libs/zettelkasten.org#logseq-like-tagging-functionality). I don't remember exactly what it does because I don't use it myself, but I was curious to try and hack on it after a discussion, and it was relevant to how Logseq does transclusion in linked documents.
    - ooh, thanks for the link. this looks rather interesting :)
- Q: How did you do the similarity search?
    - A: Similarity, as of now, just uses embedding vectors from a locally running transformer model, which are then matched using cosine scores. Code is here: [https://github.com/lepisma/org-roam-exts/tree/master/org-roam-sem](https://github.com/lepisma/org-roam-exts/tree/master/org-roam-sem)
- Q: Is your ml model for topics like "family members" available somewhere?
    - A: [https://github.com/lepisma/org-roam-exts/tree/master/org-roam-sem](https://github.com/lepisma/org-roam-exts/tree/master/org-roam-sem) The model I am using is a simple, lightweight embedding transformer model.
      See this line: [https://github.com/lepisma/org-roam-exts/blob/a71f2ec3bb6bd9d2b21ab5fd70ec45fa18128896/org-roam-sem/src/org_roam_sem/featurize.py#L17C7-L17C77](https://github.com/lepisma/org-roam-exts/blob/a71f2ec3bb6bd9d2b21ab5fd70ec45fa18128896/org-roam-sem/src/org_roam_sem/featurize.py#L17C7-L17C77)
- Q: is your org-roam config public? (init.el stuff) I've found vanilla org-mode not the most ergonomic. Thanks!
    - A: Do you mean [https://github.com/lepisma/org-roam-exts](https://github.com/lepisma/org-roam-exts)?
    - Also, some of my writing config is here: [https://github.com/lepisma/rogue/blob/master/lisp/r-writing.el](https://github.com/lepisma/rogue/blob/master/lisp/r-writing.el)

## Notes

- This looks very useful, thanks for your work
- Looks really handy! One of the biggest inhibitors to my usage has been figuring out how to collect things on mobile without friction. Will check it out! +1
- Thank you all!
- A few project links from the talk:
    - [https://github.com/lepisma/org-roam-exts](https://github.com/lepisma/org-roam-exts)
    - [https://github.com/lepisma/pile-android](https://github.com/lepisma/pile-android)

[[!inline pages="internal(2024/info/links-after)" raw="yes"]]

[[!inline pages="internal(2024/info/links-nav)" raw="yes"]]