[[!meta title="MatplotLLM, iterative natural language data visualization in org-babel"]]
[[!meta copyright="Copyright © 2023 Abhinav Tushar"]]
[[!inline pages="internal(2023/info/matplotllm-nav)" raw="yes"]]

<!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
<!-- You can manually edit this file to update the abstract, add links, etc. --->


# MatplotLLM, iterative natural language data visualization in org-babel
Abhinav Tushar (he/him) - abhinav@lepisma.xyz, https://lepisma.xyz, @lepisma@mathstodon.xyz, <mailto:abhinav@lepisma.xyz>

[[!inline pages="internal(2023/info/matplotllm-before)" raw="yes"]]

Large Language Models (LLMs) have become capable enough that many
manual workflows can be automated just by providing natural language
instructions.

One such manual task is creating custom visualizations. I have found the
process really tedious when I want to make something non-standard
with common tools like matplotlib or d3. These frameworks provide
low-level abstractions that you then use to build your own
visualizations.

Earlier, to make a new custom visualization, I would open two windows in
Emacs, one for the code and the other for the generated image. In this
talk, I will show how a powerful LLM could lead to a much more natural
interface where I only need to work with text instructions and feedback
on the currently generated plot. The system isn't perfect, but it shows
us what the future of such work could look like.
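
The instruction-and-feedback loop described above can be sketched roughly as
follows. This is only an illustration and not MatplotLLM's actual
implementation: `build_prompt`, `next_version`, and `fake_llm` are
hypothetical names, and `fake_llm` stands in for a real LLM call so that the
snippet runs without network access.

```python
from typing import Callable, List


def build_prompt(description: str, feedback: List[str]) -> str:
    """Combine the plot description with every round of feedback so far."""
    lines = [
        "Write matplotlib code for the following plot.",
        f"Description: {description}",
    ]
    for number, note in enumerate(feedback, start=1):
        lines.append(f"Feedback {number}: {note}")
    return "\n".join(lines)


def next_version(
    description: str, feedback: List[str], generate_code: Callable[[str], str]
) -> str:
    """Ask the model for fresh plotting code based on the full history."""
    return generate_code(build_prompt(description, feedback))


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call; returns placeholder code.
    return f"# code generated from a {len(prompt.splitlines())}-line prompt"


# First attempt from the description alone, then a second one after feedback.
feedback: List[str] = []
first = next_version("scatter plot of height vs weight", feedback, fake_llm)
feedback.append("use a log scale on the x axis")
second = next_version("scatter plot of height vs weight", feedback, fake_llm)
```

In this sketch the whole history is resent on every round, so the model always
sees the original description together with all later feedback; a real setup
would replace `fake_llm` with an actual API call and then run the returned
code to render the image.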

The package is called MatplotLLM and lives here
<https://github.com/lepisma/matplotllm>

About the speaker:

I am a Programmer and Machine Learning Engineer who has been in love
with Emacs' extendability from the moment I pressed M-x. Since then, I
have been doing as many things inside Emacs as I can. In this talk, I
will cover a recent attempt at automating one of my workflows inside
Emacs.

# Discussion

## Questions and answers

-   Q: What is the license of the <https://github.com/lepisma/matplotllm>
    project? Sjo
    -   A: GPLv3 or later. Sorry, I didn't put this in the repository.
        You can refer to
        <https://github.com/lepisma/matplotllm/blob/main/matplotllm.el#L18C12-L29>
        though.
-   Q: Sometimes LLMs hallucinate. Can we trust the graph that it
    produces?
    -   A: Not always, but the chances of a hallucination in the
        generated code causing a harmful but unnoticeable error are a
        little lower. Hallucinations in code usually show up as very
        visible bugs, so you can always retry. But I haven't done a
        thorough analysis here yet.
- Q: What are your thoughts on the carbon footprint of LLM usage?
  - (not the speaker): To add a bit more on the power usage of LLMs: it is not inherent that the models must take many megawatts to train and run. Work on decreasing power usage is happening and seems promising.

## Notes

-   Repository link: <https://github.com/lepisma/matplotllm>. A
    related blog post is here:
    <https://lepisma.xyz/2023/08/20/matplotllm:-an-llm-assisted-data-visualization-framework/index.html>
- gptel is another package that does a good job of offering flexible configuration and a choice of LLM/API
- I came across this adapter to run multiple LLMs, Apache 2.0 licensed too! <https://github.com/predibase/lorax>
- It will turn out the escape-hatch for AGI will be someone's integration of LLMs into their Emacs and enabling M-x control.
- I don't know what question to ask, but I found the presentation extremely useful. Thank you!
- I think we are close to getting semantic search down for our own files
  - yeah, khoj uses embeddings to search Org, I think
    - I tried it a couple of times, latest about a month ago. The search was quite bad unfortunately
    - did you try the GPT version or just the PyTorch version?
      - just the local ones. For GPT I used a couple of other packages to embed in OpenAI APIs. But I am too shy to send all my notes :D
      - Same for me. But I really suspect that GPT will be way better. They now also support LLama, which is hopeful
    - I keep meaning to revisit the idea of the Remembrance Agent and see if it can be updated for these times (and maybe local HuggingFace embeddings)
- I think Andrew is right that Emacs is uniquely positioned, being a unified integrated interface with good universal abstractions (buffers, text manipulation, etc), and across all use cases and notably one's Org data. Should be interesting...!
- Speaking of which, has anyone trained/fine-tuned/prompted a model with their Org data yet and applied it to interesting use cases (planning/scheduling, etc), and would you care to comment?
- The ubiquitous integration of LLMs (multi-modal) for anything and everything in/across Emacs and Org is both 1) exciting, 2) scary.
- I could definitely use semantic search across all of my stored notes. Can't remember what words I used to capture things.
- Indeed. A "working group" / "birds of a feather" type of thing around the potential usages and integration of LLMs and other models into Emacs and Org-mode would be interesting, especially as this is what pulls people into other platforms these days.
- To that end, Andrew is right that we'll want to abstract it into the right abstractions and interfaces. And not just LLMs by vendor/models, but what comes after LLMs/GPTs in terms of approach.
- I lean toward thinking that LLMs may have some value but to me a potentially wrong result is worse than no result
  - I think it would depend on the use case. A quasi-instant first approximation that can readily be fixed/tweaked can be quite useful in some contexts.
- not to mention the "summarization" use cases (for papers, and even across papers, I've found, like a summarization across the abstracts/contents of many papers and publications around a topic or in a field: weeks of grunt work saved, not to mention the procrastination avoided)
  - IMHO summarization is exactly where LLMs can't be useful because they can't be trusted to be accurate
- <https://dindi.garjola.net/ai-assistants.html>; A friend wrote this <https://www.jordiinglada.net/sblog/llm.html>; <https://blogs.microsoft.com/on-the-issues/2023/09/07/copilot-copyright-commitment-ai-legal-concerns/>
- I have a feeling this is one of 'em "if you can't beat them, join them" scenarios. I don't see that ending with a big global rollback due to such issues anytime soon...
- (discussion about LLMs, copyright, privacy)



[[!inline pages="internal(2023/info/matplotllm-after)" raw="yes"]]

[[!inline pages="internal(2023/info/matplotllm-nav)" raw="yes"]]