2022/talks/sqlite.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176

[[!sidebar content=""]]
[[!meta title="Using SQLite as a data source: a framework and an example"]]
[[!meta copyright="Copyright &copy; 2022 Andrew Hyatt"]]
[[!inline pages="internal(2022/info/sqlite-nav)" raw="yes"]]

<!-- Initially generated with emacsconf-generate-talk-page and then left alone for manual editing -->
<!-- You can manually edit this file to update the abstract, add links, etc. --->


# Using SQLite as a data source: a framework and an example
Andrew Hyatt (he/him)

[[!inline pages="internal(2022/info/sqlite-before)" raw="yes"]]
[[!template id="help"
volunteer=""
summary="Q&A could be indexed with chapter markers"
tags="help_with_chapter_markers"
message="""The Q&A session for this talk does not have chapter markers yet.
Would you like to help? See [[help_with_chapter_markers]] for more details. You can use the vidid="sqlite-qanda" if adding the markers to this wiki page, or e-mail your chapter notes to <emacsconf-submit@gnu.org>."""]]


Emacs can now be built with SQLite, giving native support for reading
and writing to a database. With this, we can start seriously
considering a SQLite-first approach: instead of storing data on the
filesystem, and using various ad-hoc solutions for metadata, we can
use SQLite to store and search our data. This is essentially a
tradeoff between the power and speed of SQLite and the universality of
the filesystem. If we accept that this approach is useful, then a
standard way to store information in database, may be useful and
promote package interoperability, just as our single filesystem does.
The triples packages is a RDF-like database for supplying such a
flexible system for storing and retrieving data from SQLite. A sample
application, ekg, a replacement for org-roam, is shown using this, and
the advantages of the triple design are explained.

For more information and the packages discussed here, see the
[triples](https://github.com/ahyatt/triples) and
[ekg](https://github.com/ahyatt/ekg) pages.

# Discussion

- <https://www.reddit.com/r/emacs/comments/zexv6b/using_sqlite_as_a_data_source_a_framework_and_an/>
- <https://news.ycombinator.com/item?id=33853509>

## Notes

-   <https://github.com/ahyatt/triples>
-   <https://github.com/ahyatt/ekg>
-   <https://www.gnu.org/software/hyperbole/man/hyperbole.html#Implicit-Buttons> -
    discussed in the next talk could really help simplify access to your
    triples and ekg primitives; have a look when you have time.
- You had some delicious recipes in there. Made ME hungry! 
- Thanks, that's insightful  

## Questions and answers

-   Q:  To what extent did Datomic influence the design of triples? 
    <https://www.datomic.com/>
    -   A: I wasn't aware of Datomics, but I think both triples and
        Datomics are influenced by RDF & trends in Knowledge Graph
        construction in the industry.  I took a look at that page, and
        one interesting thing they do is (AFAICT) store edits to the
        database instead of actual values, allowing you to reconstruct
        the database or go forward or backward in time.  This is very
        interesting, and I'm thinking of ways to make the database
        safer, but not sure if I can get to such sophisticated
        properties in this implementation. 
-   Q:built into Emacs? nice. multiple schemas (so to say
    differentiation of "databases" (if you will so) are possible with
    that built-in instance?
    -   A: Yes, with emacs 29. Full-featured with multiple databases,
        transactions, etc.
-   Q:What about collaborative editing whith this? Multiple computers
    with multiple emacs like crdt.el with org mode?
    -   A: Database are great for more async collaboration, multiple
        people / processes can add to the repository at the same time. 
        But I think it's not going to be a great solution for multiple
        users modifying  the same buffer.
-   Q:What about using this on multiple computers? How would you
    syncronize the data?  (This is a minor problem in org-roam where if
    you share files between multiple computers, their SQLite databases
    get out of sync and require M-x org-roam-db-sync to rebuild the
    SQLite database.)
    -   A: This is an unsolved problem, one that I'm interested in
        looking into.  There possibly are standardized db solutions to
        this, but I don't yet know what they are.
-   Q: With EKG what about views like org roam node mind map view? Or
    org mode virtual view for integration with other org packages?
    -   A: This is possible, it just needs to interface with the
        database in a different way.  It's all graphs, so really any
        triple library might be a good fit for this.
-   Q:  Are you planning to further develop EKG? It is highly
    interesting to me, I do prefer SQLite over text.
    -   A: Andrew is using it; not ready for general use but quite soon
        -- towards the end of  the month! Still in exploratory mode
        though. Still thinking about (some of) the fundamental
        concepts. 
-   Q: Is it then possible to combine the triples DB with some custom
    tables in the same SQLite file? (e.g. to build a log table next to
    the triples tables for quick query of event data)
    -   A: You could do that. AT the moment it's just one table
        (triples). It's designed to be one table in one DB -- beware
        of consistency issues if you add further tables, which you can
        definitely do.
-   Q: What are your thoughts on adding a timestamp attribute to triples
    so that the DB becomes append-only and by default you return the
    latest fact for a subject/object pair?
    -   Q+ -> Use is to keep a record - you don´t delete? e.g. you get
        all past addresses of a person or all past versions of a given
        fact. Even version control for notes.
    -   A: I haven't thought of that / not seen in other triple stores
    -   A: Be ware that these DBs already take quite a bit of space
    -   A: may make synchronization easier
-   Q: can ordinary lisp data types (lists, symbols, etc) be stored in
    the data base
    -   A: Yes, if you don't specify, it defaults to a list. Not sure
        that was the right design choice; lists map to rows with a
        hidden index column. emacs-sql and this also represent most
        things as strings. 
-   Q: beyond note taking what kind of packages do you think would
    benefit from triples library?
    -   A: Anything where you have lisp forms stored in a file would
        probably be better implemented via a database.  And the triples
        library makes this easy and standardized.  So, for example BBDB,
        which is a "database" should actually be in a database.  And
        then you might want to annotate, tag, etc, so having be in one
        big database makes a lot of sense to me, because I think all
        this kind of info wants to live together. 
-   Q: Are you trying to create a PIM with EKG?  What information do you
    primarily want to manage?
    -   A: Yes, I think many uses of emacs is in line with PIM, and
        those use-cases are a good fit for triples / ekg.  Notes,
        people, projects, maybe even being able to integrate with org
        and manage TODOs there as well.
-   Q: What about using other databases programs Postgres mongoDB etc..
    [see last q too, but I guess you refer to other relational, e.g.
    Postgresql]
    -   A: Those could work, maybe the triples library would work via
        emacsql with those, but I haven't tested it.  I'm not sure
        what the benefit would be, typically these database tend to be
        simple and small, so a more full-featured DB is probably
        overkill.  MongoDB and those kinds of row-oriented databases
        probably wouldn't be a good fit, but I haven't tried it.
-   Q: What is your preferred reference to understand triples/graph dbs?
    (e.g. think better about schema design)
    -   A: I know from using them / talking about them. I will come back
        with some references!
-   Q: Will it slow down with a growth of database?
    -   A: there is a tradeoff -- triples gives you a standard schema,
        but you lose the power of getting things in one SQL expression -
        which makes it slow. But I have a bunch of data (2yrs of
        org-roam usage) and it is still very fast. These limits exist,
        but the usual rate of content creation is not large enough to
        hit the limits.
-   Q: What are your thoughts on allowing for a "true" graph-db
    backend? (whatever the current best free software alternative to
    neo4j is, I guess). 
    -   A:  In my usage, the graph DBs tended to be slow and somewhat
        clumsy in usage, we returned every time to SQL [please reword
        if appropriate]. At the moment not a glaring need (for the
        quantities of data people manage in Emacs).
-   Q: How hungry did you get while writing and recording this?
    -   A: I forgot that I used recipes as an example in my demo! 
        org-roam / ekg and other things are a great way to cook better,
        BTW.  When you make a recipe, write a org-roam daily (or in ekg,
        an entry tagged with the date and the recipe) with notes about
        how it went, what could be better, etc.  Then you can later see
        the recipe and notes on it at once, which helps you further
        refine the recipe.


[[!inline pages="internal(2022/info/sqlite-after)" raw="yes"]]

[[!inline pages="internal(2022/info/sqlite-nav)" raw="yes"]]

[[!taglink CategoryEmacsLisp]]