[[!sidebar content=""]] [[!meta title="Using SQLite as a data source: a framework and an example"]] [[!meta copyright="Copyright © 2022 Andrew Hyatt"]] [[!inline pages="internal(2022/info/sqlite-nav)" raw="yes"]] # Using SQLite as a data source: a framework and an example Andrew Hyatt (he/him) [[!inline pages="internal(2022/info/sqlite-before)" raw="yes"]] Emacs can now be built with SQLite, giving native support for reading and writing to a database. With this, we can start seriously considering a SQLite-first approach: instead of storing data on the filesystem, and using various ad-hoc solutions for metadata, we can use SQLite to store and search our data. This is essentially a tradeoff between the power and speed of SQLite and the universality of the filesystem. If we accept that this approach is useful, then a standard way to store information in database, may be useful and promote package interoperability, just as our single filesystem does. The triples packages is a RDF-like database for supplying such a flexible system for storing and retrieving data from SQLite. A sample application, ekg, a replacement for org-roam, is shown using this, and the advantages of the triple design are explained. For more information and the packages discussed here, see the [triples](https://github.com/ahyatt/triples) and [ekg](https://github.com/ahyatt/ekg) pages. # Discussion ## Notes - - - - discussed in the next talk could really help simplify access to your triples and ekg primitives; have a look when you have time. ## Questions and answers - Q:  To what extent did Datomic influence the design of triples?  - A: I wasn\'t aware of Datomics, but I think both triples and Datomics are influenced by RDF & trends in Knowledge Graph construction in the industry.  I took a look at that page, and one interesting thing they do is (AFAICT) store edits to the database instead of actual values, allowing you to reconstruct the database or go forward or backward in time.  This is very interesting, and I\'m thinking of ways to make the database safer, but not sure if I can get to such sophisticated properties in this implementation.  - Q:built into Emacs? nice. multiple schemas (so to say differentiation of \"databases\" (if you will so) are possible with that built-in instance? - A: Yes, with emacs 29. Full-featured with multiple databases, transactions, etc. - Q:What about collaborative editing whith this? Multiple computers with multiple emacs like crdt.el with org mode? - A: Database are great for more async collaboration, multiple people / processes can add to the repository at the same time.  But I think it\'s not going to be a great solution for multiple users modifying  the same buffer. - Q:What about using this on multiple computers? How would you syncronize the data?  (This is a minor problem in org-roam where if you share files between multiple computers, their SQLite databases get out of sync and require M-x org-roam-db-sync to rebuild the SQLite database.) - A: This is an unsolved problem, one that I\'m interested in looking into.  There possibly are standardized db solutions to this, but I don\'t yet know what they are. - Q: With EKG what about views like org roam node mind map view? Or org mode virtual view for integration with other org packages? - A: This is possible, it just needs to interface with the database in a different way.  It\'s all graphs, so really any triple library might be a good fit for this. - Q:  Are you planning to further develop EKG? It is highly interesting to me, I do prefer SQLite over text. - A: Andrew is using it; not ready for general use but quite soon \-- towards the end of  the month! Still in exploratory mode though. Still thinking about (some of) the fundamental concepts.  - Q: Is it then possible to combine the triples DB with some custom tables in the same SQLite file? (e.g. to build a log table next to the triples tables for quick query of event data) - A: You could do that. AT the moment it\'s just one table (triples). It\'s designed to be one table in one DB \-- beware of consistency issues if you add further tables, which you can definitely do. - Q: What are your thoughts on adding a timestamp attribute to triples so that the DB becomes append-only and by default you return the latest fact for a subject/object pair? - Q+ -\> Use is to keep a record - you don´t delete? e.g. you get all past addresses of a person or all past versions of a given fact. Even version control for notes. - A: I haven\'t thought of that / not seen in other triple stores - A: Be ware that these DBs already take quite a bit of space - A: may make synchronization easier - Q: can ordinary lisp data types (lists, symbols, etc) be stored in the data base - A: Yes, if you don\'t specify, it defaults to a list. Not sure that was the right design choice; lists map to rows with a hidden index column. emacs-sql and this also represent most things as strings.  - Q: beyond note taking what kind of packages do you think would benefit from triples library? - A: Anything where you have lisp forms stored in a file would probably be better implemented via a database.  And the triples library makes this easy and standardized.  So, for example BBDB, which is a \"database\" should actually be in a database.  And then you might want to annotate, tag, etc, so having be in one big database makes a lot of sense to me, because I think all this kind of info wants to live together.  - Q: Are you trying to create a PIM with EKG?  What information do you primarily want to manage? - A: Yes, I think many uses of emacs is in line with PIM, and those use-cases are a good fit for triples / ekg.  Notes, people, projects, maybe even being able to integrate with org and manage TODOs there as well. - Q: What about using other databases programs Postgres mongoDB etc.. \[see last q too, but I guess you refer to other relational, e.g. Postgresql\] - A: Those could work, maybe the triples library would work via emacsql with those, but I haven\'t tested it.  I\'m not sure what the benefit would be, typically these database tend to be simple and small, so a more full-featured DB is probably overkill.  MongoDB and those kinds of row-oriented databases probably wouldn\'t be a good fit, but I haven\'t tried it. - Q: What is your preferred reference to understand triples/graph dbs? (e.g. think better about schema design) - A: I know from using them / talking about them. I will come back with some references! - Q: Will it slow down with a growth of database? - A: there is a tradeoff \-- triples gives you a standard schema, but you lose the power of getting things in one SQL expression - which makes it slow. But I have a bunch of data (2yrs of org-roam usage) and it is still very fast. These limits exist, but the usual rate of content creation is not large enough to hit the limits. - Q: What are your thoughts on allowing for a \"true\" graph-db backend? (whatever the current best free software alternative to neo4j is, I guess).  - A:  In my usage, the graph DBs tended to be slow and somewhat clumsy in usage, we returned every time to SQL \[please reword if appropriate\]. At the moment not a glaring need (for the quantities of data people manage in Emacs). - Q: How hungry did you get while writing and recording this? - A: I forgot that I used recipes as an example in my demo!  org-roam / ekg and other things are a great way to cook better, BTW.  When you make a recipe, write a org-roam daily (or in ekg, an entry tagged with the date and the recipe) with notes about how it went, what could be better, etc.  Then you can later see the recipe and notes on it at once, which helps you further refine the recipe. [[!inline pages="internal(2022/info/sqlite-after)" raw="yes"]] [[!inline pages="internal(2022/info/sqlite-nav)" raw="yes"]] [[!taglink CategoryEmacsLisp]]