diff options
Diffstat (limited to '2022/talks/sqlite.md')
-rw-r--r-- | 2022/talks/sqlite.md | 175 |
1 files changed, 175 insertions, 0 deletions
diff --git a/2022/talks/sqlite.md b/2022/talks/sqlite.md new file mode 100644 index 00000000..89146b51 --- /dev/null +++ b/2022/talks/sqlite.md @@ -0,0 +1,175 @@ +[[!sidebar content=""]] +[[!meta title="Using SQLite as a data source: a framework and an example"]] +[[!meta copyright="Copyright © 2022 Andrew Hyatt"]] +[[!inline pages="internal(2022/info/sqlite-nav)" raw="yes"]] + +<!-- Initially generated with emacsconf-generate-talk-page and then left alone for manual editing --> +<!-- You can manually edit this file to update the abstract, add links, etc. ---> + + +# Using SQLite as a data source: a framework and an example +Andrew Hyatt (he/him) + +[[!inline pages="internal(2022/info/sqlite-before)" raw="yes"]] + +Emacs can now be built with SQLite, giving native support for reading +and writing to a database. With this, we can start seriously +considering a SQLite-first approach: instead of storing data on the +filesystem, and using various ad-hoc solutions for metadata, we can +use SQLite to store and search our data. This is essentially a +tradeoff between the power and speed of SQLite and the universality of +the filesystem. If we accept that this approach is useful, then a +standard way to store information in database, may be useful and +promote package interoperability, just as our single filesystem does. +The triples packages is a RDF-like database for supplying such a +flexible system for storing and retrieving data from SQLite. A sample +application, ekg, a replacement for org-roam, is shown using this, and +the advantages of the triple design are explained. + +For more information and the packages discussed here, see the +[triples](https://github.com/ahyatt/triples) and +[ekg](https://github.com/ahyatt/ekg) pages. + +# Discussion + +- <https://www.reddit.com/r/emacs/comments/zexv6b/using_sqlite_as_a_data_source_a_framework_and_an/> +- <https://news.ycombinator.com/item?id=33853509> + +## Notes + +- <https://github.com/ahyatt/triples> +- <https://github.com/ahyatt/ekg> +- <https://www.gnu.org/software/hyperbole/man/hyperbole.html#Implicit-Buttons> - + discussed in the next talk could really help simplify access to your + triples and ekg primitives; have a look when you have time. +- You had some delicious recipes in there. Made ME hungry! +- Thanks, that's insightful + +## Questions and answers + +- Q: To what extent did Datomic influence the design of triples? + <https://www.datomic.com/> + - A: I wasn't aware of Datomics, but I think both triples and + Datomics are influenced by RDF & trends in Knowledge Graph + construction in the industry. I took a look at that page, and + one interesting thing they do is (AFAICT) store edits to the + database instead of actual values, allowing you to reconstruct + the database or go forward or backward in time. This is very + interesting, and I'm thinking of ways to make the database + safer, but not sure if I can get to such sophisticated + properties in this implementation. +- Q:built into Emacs? nice. multiple schemas (so to say + differentiation of "databases" (if you will so) are possible with + that built-in instance? + - A: Yes, with emacs 29. Full-featured with multiple databases, + transactions, etc. +- Q:What about collaborative editing whith this? Multiple computers + with multiple emacs like crdt.el with org mode? + - A: Database are great for more async collaboration, multiple + people / processes can add to the repository at the same time. + But I think it's not going to be a great solution for multiple + users modifying the same buffer. +- Q:What about using this on multiple computers? How would you + syncronize the data? (This is a minor problem in org-roam where if + you share files between multiple computers, their SQLite databases + get out of sync and require M-x org-roam-db-sync to rebuild the + SQLite database.) + - A: This is an unsolved problem, one that I'm interested in + looking into. There possibly are standardized db solutions to + this, but I don't yet know what they are. +- Q: With EKG what about views like org roam node mind map view? Or + org mode virtual view for integration with other org packages? + - A: This is possible, it just needs to interface with the + database in a different way. It's all graphs, so really any + triple library might be a good fit for this. +- Q: Are you planning to further develop EKG? It is highly + interesting to me, I do prefer SQLite over text. + - A: Andrew is using it; not ready for general use but quite soon + -- towards the end of the month! Still in exploratory mode + though. Still thinking about (some of) the fundamental + concepts. +- Q: Is it then possible to combine the triples DB with some custom + tables in the same SQLite file? (e.g. to build a log table next to + the triples tables for quick query of event data) + - A: You could do that. AT the moment it's just one table + (triples). It's designed to be one table in one DB -- beware + of consistency issues if you add further tables, which you can + definitely do. +- Q: What are your thoughts on adding a timestamp attribute to triples + so that the DB becomes append-only and by default you return the + latest fact for a subject/object pair? + - Q+ -> Use is to keep a record - you don´t delete? e.g. you get + all past addresses of a person or all past versions of a given + fact. Even version control for notes. + - A: I haven't thought of that / not seen in other triple stores + - A: Be ware that these DBs already take quite a bit of space + - A: may make synchronization easier +- Q: can ordinary lisp data types (lists, symbols, etc) be stored in + the data base + - A: Yes, if you don't specify, it defaults to a list. Not sure + that was the right design choice; lists map to rows with a + hidden index column. emacs-sql and this also represent most + things as strings. +- Q: beyond note taking what kind of packages do you think would + benefit from triples library? + - A: Anything where you have lisp forms stored in a file would + probably be better implemented via a database. And the triples + library makes this easy and standardized. So, for example BBDB, + which is a "database" should actually be in a database. And + then you might want to annotate, tag, etc, so having be in one + big database makes a lot of sense to me, because I think all + this kind of info wants to live together. +- Q: Are you trying to create a PIM with EKG? What information do you + primarily want to manage? + - A: Yes, I think many uses of emacs is in line with PIM, and + those use-cases are a good fit for triples / ekg. Notes, + people, projects, maybe even being able to integrate with org + and manage TODOs there as well. +- Q: What about using other databases programs Postgres mongoDB etc.. + [see last q too, but I guess you refer to other relational, e.g. + Postgresql] + - A: Those could work, maybe the triples library would work via + emacsql with those, but I haven't tested it. I'm not sure + what the benefit would be, typically these database tend to be + simple and small, so a more full-featured DB is probably + overkill. MongoDB and those kinds of row-oriented databases + probably wouldn't be a good fit, but I haven't tried it. +- Q: What is your preferred reference to understand triples/graph dbs? + (e.g. think better about schema design) + - A: I know from using them / talking about them. I will come back + with some references! +- Q: Will it slow down with a growth of database? + - A: there is a tradeoff -- triples gives you a standard schema, + but you lose the power of getting things in one SQL expression - + which makes it slow. But I have a bunch of data (2yrs of + org-roam usage) and it is still very fast. These limits exist, + but the usual rate of content creation is not large enough to + hit the limits. +- Q: What are your thoughts on allowing for a "true" graph-db + backend? (whatever the current best free software alternative to + neo4j is, I guess). + - A: In my usage, the graph DBs tended to be slow and somewhat + clumsy in usage, we returned every time to SQL [please reword + if appropriate]. At the moment not a glaring need (for the + quantities of data people manage in Emacs). +- Q: How hungry did you get while writing and recording this? + - A: I forgot that I used recipes as an example in my demo! + org-roam / ekg and other things are a great way to cook better, + BTW. When you make a recipe, write a org-roam daily (or in ekg, + an entry tagged with the date and the recipe) with notes about + how it went, what could be better, etc. Then you can later see + the recipe and notes on it at once, which helps you further + refine the recipe. +- Q: beyond note taking what kind of packages do you think would benefit from triples library + +Other discussions from IRC: + +- I like the font he's using in his org doc. +- In some way, triples (turtle, RDF and similar things) describe a directed graph, where echa edge is like: Subject--Predicate-->Object. A datalog can describe this too (for example: with predicates like triple(S,P,O)). + + +[[!inline pages="internal(2022/info/sqlite-after)" raw="yes"]] + +[[!inline pages="internal(2022/info/sqlite-nav)" raw="yes"]] + +[[!taglink CategoryEmacsLisp]] |