Diffstat (limited to '2023/talks/llm.md')
-rw-r--r-- 2023/talks/llm.md | 118
1 file changed, 118 insertions(+), 0 deletions(-)
diff --git a/2023/talks/llm.md b/2023/talks/llm.md
new file mode 100644
index 00000000..64966f28
--- /dev/null
+++ b/2023/talks/llm.md
@@ -0,0 +1,118 @@
+[[!meta title="LLM clients in Emacs, functionality and standardization"]]
+[[!meta copyright="Copyright © 2023 Andrew Hyatt"]]
+[[!inline pages="internal(2023/info/llm-nav)" raw="yes"]]
+
+<!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
+<!-- You can manually edit this file to update the abstract, add links, etc. -->
+
+
+# LLM clients in Emacs, functionality and standardization
+Andrew Hyatt (he/him) - <ahyatt@gmail.com> - <https://urbanists.social/@ahyatt> - <http://github.com/ahyatt>
+
+[[!inline pages="internal(2023/info/llm-before)" raw="yes"]]
+
+As an already powerful way to handle a variety of textual tasks, Emacs
+seems uniquely well poised to take advantage of Large Language Models
+(LLMs). We'll go over what LLMs are and what they are used for, then
+survey the significant LLM client packages already available for Emacs.
+The functionality these packages provide can be broken down into a small
+set of basic features, but each package currently manages them in an
+uncoordinated way: each might support different LLM providers, or
+perhaps local LLMs, and those LLMs in turn support different
+functionality. Some packages connect directly to the LLM APIs; others
+use popular non-Emacs packages to do so. The LLMs themselves are
+evolving rapidly. There is a need for some standardization, so users
+don't have to configure their API keys or other setup independently for
+each package, but also a risk that any standardization will be
+premature. We'll show what has been done in the area of standardization
+so far, and what should happen in the future.
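+
+As a taste of what such a standardized client interface can look like,
+here is a minimal sketch in the spirit of the llm package (the
+constructor and helper names below are my reading of its late-2023
+API, so treat them as assumptions rather than a definitive spec):
+
+```elisp
+;; Minimal sketch of a provider-agnostic chat call, assuming the llm
+;; package's late-2023 API: `make-llm-openai', `llm-chat', and
+;; `llm-make-simple-chat-prompt'.
+(require 'llm)
+(require 'llm-openai)
+
+;; Swapping in another provider (Vertex, a local model, ...) should
+;; require no changes beyond this one definition.
+(defvar my/provider (make-llm-openai :key (getenv "OPENAI_API_KEY")))
+
+(llm-chat my/provider
+          (llm-make-simple-chat-prompt "What is a closure in Elisp?"))
+```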
+
+About the speaker:
+
+Andrew Hyatt has contributed the Emacs websocket package, the triples
+package (a triple-based DB library), and the ekg package (a tag-based
+note-taking application). He has been using various other LLM
+integrations, and as part of extending ekg, he's been working on his own.
+
+# Discussion
+
+## Questions and answers
+
+- Q: What is your use case for embeddings? Mainly for searching?
+ - A:
+ - I got you. It's a way of kind of expanding our memory
+ capacity (see the embedding sketch after this list).
+- Q: What do you think about "embed the Emacs manual" vs. "a GPT with
+ the Emacs manual"?
+ - A:
+ - Yes, GPTs actually work by kind of embedding your document
+ into their memory and then using the logic provided by
+ GPT-4 or other versions. I never tried that one, but I'm
+ just wondering if you have ever tried it and compared the
+ difference.
+- Q: When deferring commit messages to an LLM, what (if anything) do
+ you find you have lost?
+ - A:
+- Q: Can you share your font settings in your emacs config? :) (Yeah,
+ those are some nice fonts for reading)
+ - A: I think it was Menlo, but I've since changed it (I'm
+ experimenting with Monaspace).
+- Q: In terms of standardisation, do you see a need for a
+ medium-to-large scale effort?
+ - A:
+ - I mean, as a use case, the interface is quite simple,
+ because we're just providing an API to a server. I'm not
+ sure what standardization we are really looking at. I mean,
+ it's more about how we use those callbacks from the LLM.
+- Q: What are your thoughts on the carbon footprint of LLM usage?
+ - A:
+- Q: LLMs are slow in responding. Do you think Emacs should provide
+ more async primitives to keep it responsive? E.g., url-retrieve is
+ quite bad for building API clients.
+ - A:
+ - gptel is async, and very good at tracking the point (see
+ the async sketch after this list).
+- Q: Speaking of which, has anyone trained/fine-tuned/prompted a model
+ with their Org data yet and applied it to interesting use cases
+ (planning/scheduling, etc) and care to comment?
+ - A:
+ - I use GPTs for my weekly review. I don't rely on it purely;
+ it helps me find things I never thought about, and I just
+ use it as an alternative way of doing the review. I find
+ it kind of interesting to do so.
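+
+Regarding the embedding question above, here is a rough sketch of what
+embedding-based search over notes could look like, assuming the llm
+package's `llm-embedding` returns a vector of floats and reusing
+`my/provider` from the earlier sketch (an illustration, not code from
+the talk):
+
+```elisp
+;; Sketch: rank notes by semantic similarity to a query. Assumes
+;; `llm-embedding' (llm package) returns a float vector, and that
+;; `my/provider' is defined as in the sketch near the abstract.
+(require 'cl-lib)
+
+(defun my/cosine-similarity (a b)
+  "Cosine similarity between equal-length float vectors A and B."
+  (let ((dot 0.0) (na 0.0) (nb 0.0))
+    (dotimes (i (length a))
+      (cl-incf dot (* (aref a i) (aref b i)))
+      (cl-incf na (* (aref a i) (aref a i)))
+      (cl-incf nb (* (aref b i) (aref b i))))
+    (/ dot (* (sqrt na) (sqrt nb)))))
+
+(defun my/semantic-search (query notes)
+  "Return NOTES (a list of strings) sorted by similarity to QUERY."
+  (let* ((qvec (llm-embedding my/provider query))
+         (scored (mapcar (lambda (note)
+                           (cons (my/cosine-similarity
+                                  qvec (llm-embedding my/provider note))
+                                 note))
+                         notes)))
+    (mapcar #'cdr (sort scored (lambda (a b) (> (car a) (car b)))))))
+```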
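+
+And on the async question: the llm package's entry point for this is
+callback-based, along these lines (the `llm-chat-async` signature here
+is assumed from its docs, so verify it against the installed version):
+
+```elisp
+;; Sketch: non-blocking chat. `llm-chat-async' takes a success
+;; callback and an error callback, so Emacs stays responsive while
+;; the request is in flight.
+(llm-chat-async
+ my/provider
+ (llm-make-simple-chat-prompt "Summarize this paragraph in one line.")
+ (lambda (response) (message "LLM: %s" response))
+ (lambda (type msg) (message "LLM error (%s): %s" type msg)))
+```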
+
+### Notes and discussion
+
+- gptel is another package doing a good job with flexible configuration and choice of LLM/API
+- I came across this adapter to run multiple LLMs, Apache 2.0 license too! https://github.com/predibase/lorax
+- It will turn out the escape-hatch for AGI will be someone's integration of LLMs into their Emacs and enabling M-x control.
+- I don't know what question to ask, but I found the presentation extremely useful. Thank you!
+- I think we are close to getting semantic search down for our own files
+ - yeah, khoj uses embeddings to search Org, I think
+ - I tried it a couple of times, most recently about a month ago. The search was quite bad, unfortunately.
+ - did you try the GPT version or just the PyTorch version?
+ - Just the local ones. For GPT I used a couple of other packages to do embeddings via the OpenAI APIs. But I am too shy to send all my notes :D
+ - Same for me. But I really suspect that GPT will be way better. They now also support Llama, which is promising.
+ - I keep meaning to revisit the idea of the Remembrance Agent and see if it can be updated for these times (and maybe local HuggingFace embeddings)
+- I think Andrew is right that Emacs is uniquely positioned, being a unified integrated interface with good universal abstractions (buffers, text manipulation, etc.), across all use cases and notably one's Org data. Should be interesting...!
+- The ubiquitous integration of LLMs (multi-modal) for anything and everything in/across Emacs and Org is both 1) exciting, 2) scary.
+- I could definitely use semantic search across all of my stored notes. Can't remember what words I used to capture things.
+- Indeed. A "working group" / "birds of a feather" type of thing around the potential usages and integration of LLMs and other models into Emacs and Org-mode would be interesting, especially as this is what pulls people into other platforms these days.
+- To that end, Andrew is right that we'll want to distill this into the right abstractions and interfaces. And not just LLMs by vendor/model, but what comes after LLMs/GPTs in terms of approach.
+- I lean toward thinking that LLMs may have some value but to me a potentially wrong result is worse than no result
+ - I think it would depend on the use case. A quasi-instant first approximation that can readily be fixed/tweaked can be quite useful in some contexts.
+- Not to mention the "summarization" use cases (for papers, and even across papers I've found, like a summarization across the abstracts/contents of a multiplicity of papers and publications around a topic or in a field - weeks of grunt work saved, not to mention procrastination avoided)
+ - IMHO summarization is exactly where LLMs can't be useful because they can't be trusted to be accurate
+- <https://dindi.garjola.net/ai-assistants.html>; a friend wrote this: <https://www.jordiinglada.net/sblog/llm.html>; <https://blogs.microsoft.com/on-the-issues/2023/09/07/copilot-copyright-commitment-ai-legal-concerns/>
+- I have a feeling this is one of them "if you can't beat them, join them" scenarios. I don't see this ending with a big global rollback due to such issues anytime soon...
+- (discussion about LLMs, copyright, privacy)
+- I spent more time than I was hoping to setting up some custom Marginalia(s?) the other day, notably for cases where the "category" is dynamic, the annotation/affixation function varies, the candidates are an alist of key-value pairs and not just directly the value, and many little specificities like that. Idem for org-ql many moons back, org-agenda, etc. That sort of workflow always involves the same things: learning/reading, examples, trials, etc. I wonder if LLMs could be integrated at various points in that recurring exercise, to take just a sample case.
+- That's yet another great use case for LLMs: externalizing one's thinking for its own sake, if only to hear back the echo of one's "voice", and do so with an infinitely patient quasi-omniscient second party.
+ - oooh, might be a good one for blog post writing: generate some follow-up questions people might have
+ - Yeah, a "rubber duck" LLM could be very handy
+ - I'm sure there would be great demand for such a thing, to dry-run one's presentations (video or text) and generate anticipated questions and so on. Great take.
+ - I've seen some journaling prompts along those lines. I think it'll get even more interesting as the text-to-speech and speech-to-text parts get better. Considering how much people bonded with Eliza, might be interesting to see what people can do with a Socratic assistant...
+
+
+[[!inline pages="internal(2023/info/llm-after)" raw="yes"]]
+
+[[!inline pages="internal(2023/info/llm-nav)" raw="yes"]]
+
+