[[!meta title="Imaginary Programming"]]
[[!meta copyright="Copyright © 2021 Shane Mulligan"]]
[[!inline pages="internal(2021/info/imaginary-nav)" raw="yes"]]

<!-- You can manually edit this file to update the abstract, add links, etc. --->


# Imaginary Programming
Shane Mulligan



[[!inline pages="internal(2021/info/imaginary-schedule)" raw="yes"]]

Imaginary Programming (IP) is both methodology and paradigm. It is an
extension of literate programming and a way of creating software without
the use of imperative, functional or even declarative code. Yet IP employs
all disciplines to achieve the miraculous. The only contingency is on one
or more language models, known as foundation models. The real value of IP
is not found by abandoning sound logic altogether, but in weaving the real
with the imaginary. The future of imaginary programming is one in which
almost all of computing is inferred. I have built a suite of tools based on
Emacs for interfacing real programming languages with imaginary ones, all
to demonstrate what I mean: a ‘complex’ terminal that lets you imagine what
happens no matter how nested you are within interpreters, an
example-oriented language, a file format that encodes the provenance of
text, and a library of imaginary functional programming primitives called
iLambda. It is important to recognise IP because, for lack of a better
term, it has far-reaching implications for intellectual property and the
GPL. Please keep an open mind.

# Discussion

Pad:

-   Q1: Do you have a site we can follow more of your writing on?
    -   A: Pen.el Tutorial: https://semiosis.github.io/posts/pen-el-tutorial/
    -   https://semiosis.github.io/posts/ilambda-tutorial/
    -   https://emacsconf.org/2021/talks/imaginary/
-   Q2:  re slide 27, would it mean that 2 such "idefined" functions would be the "same", meaning do the same thing the same way, given that they are defined without a "body"? (i'm trying to get a better grasp on the objects that get so "imagined" under the hood)
    -   A: The first time a function is run with given parameters, the results are remembered. I use the memoize library. You can update the function every time by surrounding the call to the function with the (upd ...) macro. The body evaluation is completely short-circuited with idefun. The imacro works a bit differently. It will generate real code. You can use the normal macro-expand on an imacro.
-   Q3: Opalvaults: What are some underlying concepts/papers that we could read to become more familiar with your overarching ideas? (e.g. things that inspired your ideas)
    -   A: Paper: "Pre-train, Prompt, and Predict" (a survey of prompting methods in natural language processing)
-   Q4: Sorry, I just don't get it: How is a function that does something different each time it's called useful?
    -   A: Each time you run one of these functions, you are getting the computer to imagine for you. It's a bicycle for the imagination. You can automate the filtration of the results you want, say by doing many generations and applying grep, or other prompts such as the semantic search prompt, to the results. The functions are memoised, so they technically do the same thing every time if you want them to. Also, if you use a temperature of 0 for the prompt functions (I demonstrate how to override that, somewhere in the slides), it will be deterministic too, even when bypassing the cache.
-   Q5: How on earth do you ensure that what ilambda gets back from GPT-3 is Lisp and not, say, Harry Potter fanfic? :)
    -   A: A combination of good prompt design, filtering the results, and validating the results. Also, you can fine-tune models to the task you want to eliminate the possibility of unwanted generations.
-   Q6: Your views on the pluses and minuses of GPT-3?
    -   A: It's something we have to live with because of its transformative nature on computing. These language models unfortunately are license-blind.
-   Q7: Any interesting ideas about potential applications of GPT-3 to Emacs itself (or Emacs-adjacent things)?
    -   A: Emacs is the ultimate text-centric operating system. It will become a kernel for AGI, I think. That's what I plan on making. The power-user's terminal of human-ai interaction. I'm trying to extend as many modes in emacs as possible. Org-brain, eww browser, org-mode, comint, emacs lisp primitives, etc.
-   Q8: Follow-up on Q2: how does inferring functions in this manner differ from, say, how in the Haskell ecosystem functions are inferred by specifying inputs and return type (such as when searching for a suitable function for a given purpose)?
    -   A: Whereas in Haskell a type-declarative function search looks through a discrete set of functions by type, the domain of possible functions that are searched for using language models is qualitatively and quantitatively infinite.
-   Q9: Are you deriving functions from their names? What do you do when this is ambiguous - for example, when the name of the function is "get-element-from-pair"?
    -   A: idefun will infer computation and short-circuit the code. Given the function name alone, function name + args, function name + args + docstring, or function name + args + docstring + function body, it will make use of the context you have provided and imagine evaluation. It will create functions which infer rather than properly evaluate, based on merely the name of the function, for example.
    -   A (re: ambiguity): If you had an imaginary defun for this, you'd need to send the final list 
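
The caching behaviour described in the answers to Q2 and Q4 can be sketched in Python (rather than Emacs Lisp) as follows. This is only a minimal illustration, not Pen.el's implementation: `fake_model`, `imagine` and the `update` flag are hypothetical names, `fake_model` stands in for a real language model call, and `update=True` plays the role the answer ascribes to the `(upd ...)` macro.

```python
import itertools

# Hypothetical stand-in for a language model call. A real model sampled at a
# non-zero temperature may return a different completion on every call; the
# counter imitates that by numbering each generation.
_generation = itertools.count(1)


def fake_model(prompt: str) -> str:
    return f"{prompt} -> generation {next(_generation)}"


# Results are remembered per (function name, arguments).
_cache: dict = {}


def imagine(name: str, *args: str, update: bool = False) -> str:
    """Return the remembered result for this call, asking the model only on a
    cache miss or when update=True forces a fresh generation."""
    key = (name, args)
    if update or key not in _cache:
        _cache[key] = fake_model(f"({name} {' '.join(args)})")
    return _cache[key]


first = imagine("color-of-watermelon", "inside")
again = imagine("color-of-watermelon", "inside")               # served from the cache
fresh = imagine("color-of-watermelon", "inside", update=True)  # regenerated
```

Here `first` and `again` are identical because the second call never reaches the model, while `fresh` replaces the cached value with a new generation.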

IRC nick: libertyprime

BBB:

- libertyprime: What kinds of software is IP (imaginary programming) not suitable for?
  - libertyprime: Good question. IP is great for things like mocking API calls, because you can imagine the API call output. It's great for code generation where you can then do a macro-expand to generate code as you are programming. It's great for coming up with functions that might be difficult to write (idefun color-of-watermelon), for example
- Hey libertyprime, where do we follow up to find out more?
  - libertyprime: it's not really good for scaffolding code. I consider emacs to be 45 years of scaffolding to build imaginary functions around
  - libertyprime: Because IP needs rigid complementary code.
- So how does an IP user verify that the imagined code does what is intended?
- I like the word 'imaginary' to describe the paradigm
- libertyprime: How does an IP user verify that the imagined code does what is intended? Through a combination of 'validator functions', imaginary validation functions and language model fine-tuning.  So you may also choose an underlying language model to use when running code. That model may have been trained to do the task you are giving it. If you're trying out the docker container you can run `pen config` or do `M-x pen-customize` to force the language model, or change it in the imagine-evaluating-emacs-lisp .prompt file
- libertyprime: Haha. The brilliance of emacs, and the reason this stuff is so easy to do with emacs, is that emacs provides intelligible modes and abstractions with which to build prompts. Otherwise you have an amorphous blob of a language model.
- libertyprime: So the value is absolutely not in replacing emacs entirely, as I've come to understand it, but in combining real and imaginary.
- (wish i could give you back just a fraction of the time you saved just this one person here!)
- I would love to see the result of imaginary major modes and keymaps
- libertyprime, is the idea for the first draft of the gpt output to be final, or do you expect to edit some?
- There seems to be a lot of jargon in this context, like validators, prompts, language models, etc.  It's really hard for someone who doesn't already use these things to understand what these pieces are and how they fit together.
  - well prompts seem to be the input you give to the language model, which it then generates a follow up to
  - validators sounds like tests?  language models are neural language models like GPT-3/j etc.
  - libertyprime: <http://github.com/semiosis/glossaries-gh/blob/master/pe-prompt-engineering.txt>
    <http://github.com/semiosis/glossaries-gh/blob/master/prompt-engineering.txt>
    <http://github.com/semiosis/glossaries-gh/blob/master/pen.el.txt>
  - libertyprime: Here are some glossaries for the subjects
  - So like, a prompt would be "Marco!" and GPT-3 would of course say... "Polo"
  - libertyprime: @alphapapa, I also have a much matured prompt format readme, here: <https://github.com/semiosis/prompts>
  - libertyprime: which can explain 'validator', etc.
- aindilis: So uh... does GPT-3 know... everything?  in every human and computer language?  I don't understand its role exactly, or its limitations.
  - GPT-3 knows a lot, but not all, from my experience.  It's pretty scary, in a good way.  I think libertyprime wants to keep it libre.
  - libertyprime: the latest language models such as Codex are world language + codex, and they know everything at an abstract level, like a human does, in a way. So their depth may be superficial. They're pretty good knowledge aggregators.
- so libertyprime can you just tab complete and it completes on like the previous sentence, region, buffer, etc?
  - libertyprime: Yes, it has basic autocompletion functions, (word, line, lines). I'm also making more interesting autocompletion functions, which do beam-search on downstream generations, -- calling it desirable-search.  <http://github.com/semiosis/pen.el/blob/master/src/pen-example-config.el>
  - libertyprime: There are some key binding definitions here which will work for the docker container
- Does GPT-3 "know" how to transliterate from say public code written in JS / Other-Lang to elisp if you were trying to imaginary code similar function names?
  - libertyprime: yes, it absolutely can. transpilation is one thing it is very good at. But more bizarrely, you can also transpile intermediary languages, that are composed of multiple different language chimerically. For example, you can smash out your algorithm with a combination of elisp and bash and it will understand when it transpiles into a real language.
- How well does it actually work to write a function in a mishmash of Bash and Elisp?  I can't imagine that working well in practice. There are too many semantic differences in the languages and implementations
  - libertyprime: it's a very new sort of thing, but feels natural as you are doing it, to generate code. the results of generating code should most probably be looked at before running. that being said, you can also run 'ieval' around it to run it in inference. I think the takeaway should be that these models are getting better and better and show no signs yet of reducing quality of results or ability -- no sign yet
- how does lexical binding affect things, if at all?
- How about going from a CLOS/EIEIO style of OO to Java / C++ style? Or Erlang style of parameter pattern matching?
- so IIUC GPT-3 is a service run on a remote system, right?  And it's proprietary?  How big is it in terms of bytes?
  - libertyprime: yes, aggregated language models are not good in my opinion. GPT-3 is around 170 GB, approximately 1GB per billion parameters, IIUC
  - libertyprime: There are libre models, and you can connect one to Pen.el to run the inference etc. My goal is to decentralise them though
  - libertyprime: Because I don't think that 170GB is accessible enough. The issue is actually running the models though. You need a very large computer indeed for that
  - libertyprime: I can do a customized demo if anyone wants
- can someone here provide some sample input, and you run it and paste the result, just to give an idea of the quality?  or do you already have samples online?
- here's an idea for a demo... something like (idefun (string target-language) "Translate STRING from its source language into TARGET-LANGUAGE and output it to the echo area.")
  - oops I forgot to name the function, was thinking of ilambda
  - I have a feeling that such a large scope for the function will exceed the max output size of the model.  maybe we work on a more realistic example?
  - I was hoping the model would solve all the messy problems for me :)
  - libertyprime: Oh crud. I hope I haven't broken iLambda. Lol I added support for 0 arguments, it makes it variadic. This will work
  - doesn't seem like it quite understood the purpose but I can see the connection
- what happens if you change target lang to "Elisp" >:)
  - look at the echo area if you didn't notice it
  - oh wait, I missed the echo area
  - libertyprime: Yup, exactly, that will work too. One sec
- can you run the function again or show "C-h e"?  And can we see the resulting source code?
  - libertyprime: translate python to elisp
  - libertyprime: just with (idefun translate)
  - libertyprime: No docstring, etc. or arguments.
  - libertyprime: crud. It didn't work haha
  - libertyprime: Sigh.
- libertyprime: I need to fix the 2-ary argument thing. :S Really sorry I think I broke it
- I'd like to see the generated (or "imagined") Elisp source code, assuming it does some HTTP API queries to do the translation and such
- libertyprime: Yup, I can show that. It works much better when I use OpenAI Codex. Here are some generated functions
- libertyprime: That's how it works under the hood. Then it cuts out the bit that you want
- This reminds me of the classical AI paradigm of "generate and check."
- libertyprime: Sigh. I really cry when demos break. Sorry. I demo'd the underlying prompt though. I broke ilambda, I think
- I think I saw it generate a huge fibonacci function, is that still in your kill-ring?
- okay, well thanks for demoing, the code is pretty stable though at this point right?  this is just the norm with any demo.
- I bet people would be glad to watch/read something later on if you want time to work on it.
- libertyprime: <https://semiosis.github.io/cterm/> This is what I call the complex terminal. Essentially you prefix any terminal program with ct and you get autocompletion etc. for anything. It uses Emacs's term-mode
- libertyprime: <https://semiosis.github.io/ii/> And this, ii, it's fully imaginary terminals, so you can import imaginary libraries, etc. and work with them.
- libertyprime: <https://semiosis.github.io/apostrophe/> This one here, which imagines conversations with anyone that the model knows about. So I'm demoing having a 3way conversation with amber heard, tom cruise and chris nolan.
  - so you used GPT to generate a compliment, and now GPT generates the convo from that prompt?
  - libertyprime: Yeah, so the best way to interact with these types of chatbots is to imagine the situation you are in beforehand. The initial phrases can be anything you can think of really. "Why are you in the bath tub?", for example. But I tend to open with something like, "May I interrupt? What were you just working on?" So by choosing the prompt very carefully, you can tease out the information you require.
- libertyprime: <https://semiosis.github.io/nlsh/> and this, which is a natural language shell
- libertyprime: I also have a way to filter results semantically, with my semantic search prompt <http://github.com/semiosis/prompts/blob/master/prompts/textual-semantic-search-filter-2.prompt>
- libertyprime: You can also run all these prompts from bash like so: `pl "[\"It's cool. I used to dance zouk.\",\"I don't know.\",\"I'm not sure.\",\"I can't stop dancing to it.\",\"I think it's ok.\",\"It's cool but I prefer rock and roll.\",\"I don't know. It sounds good.\",\"Nice but a bit too fast\",\"Oh, I know zouk, you can teach it to me.\",\"Zouk is nice.\"]" | "penf" "-u" "pf-textual-semantic-search-filter/2" "positive response"`. That will pipe JSON results into Pen.el and have them filtered. All prompting functions are also available as shell commands.
- well I think this is the coolest thing I've seen in a long time.  how do we follow up with you and get involved?  run it etc?
- libertyprime: hehe thanks aindilis: i'm on #emacs as libertyprime. Feel free to hit me up any time. Otherwise, the setup for pen.el is fairly straightforward. If you have any issues demoing, I'd be very interested, so I can make Pen.el more reliable. I have a discord server. I'll copy the link. One sec
- Do you think you could run an IRC channel too?
  - libertyprime: <https://discord.gg/sKR8h9NT>
- Thanks a lot, very interesting and I am excited to learn more later!
- yeah this talk was crazy good, ty!
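
The 'validator functions' and the "generate and check" remark in the chat above can be illustrated with a small Python sketch. It is not Pen.el code: `CANDIDATES` is hand-written text standing in for several model generations, and `looks_like_sexp` and `first_valid` are hypothetical helpers. The validator only checks that a candidate is a single balanced parenthesised form, which is far weaker than a real validator would need to be.

```python
from typing import Callable, Iterable, Optional

# Hand-written stand-ins for several generations from a language model.
CANDIDATES = [
    "Harry looked up from his copy of the Emacs manual...",
    "(defun double (x) (* 2 x)",   # unbalanced: missing a closing paren
    "(defun double (x) (* 2 x))",
]


def looks_like_sexp(text: str) -> bool:
    """Crude validator: True if text is one balanced parenthesised form.
    It ignores strings and comments, so it is only an illustration."""
    text = text.strip()
    if not text.startswith("("):
        return False
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The outermost form must close on the very last character.
            if depth == 0 and i != len(text) - 1:
                return False
    return depth == 0


def first_valid(candidates: Iterable[str],
                validator: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate the validator accepts, or None."""
    for candidate in candidates:
        if validator(candidate):
            return candidate
    return None


chosen = first_valid(CANDIDATES, looks_like_sexp)
```

In this sketch the prose and the unbalanced form are rejected and only the third candidate is kept; checking that the accepted code also behaves as intended would need further validation.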

IRC:

- What Shane is saying right now reminds me a lot of the SICP opening words, about how programming, and computing ideas in general are all about dreams and magic. Creating an idealized solution from abstractions and building blocks.
- This also reminds me of the concept of Humane Tech. Technology, and frameworks that are inherently conducive to human curiosity, intelligence, and all the best traits. <https://github.com/humanetech-community/awesome-humane-tech>
- I think this is like semantic auto-complete on steroids, like tab completion of whatever you're typing, or translation of something you've written into code for instance.
- If you're worried about these kind of advances in AI, just remind yourself of how easily technology breaks
- oh my god, executing code derived directly from GPT-3?! that's *lunatic*  curl | bash, eat your heart out
- idefun definitely helped by a docstring
  - yeah that's a use-case, gen from docstring
- Man, I really think it would be awesome to have shane be able to explain some of these ideas more in depth as they are obviously very deep topics. I'd love to help contribute next year to possibly creating a way to have multiple talks going on at once so people have more time to speak. I believe it was sachac who mentioned it yesterday.
- This vaguely reminds me of that one Python package that generates a CLI parser from the help string except that that python package actually made sense
-  re slide 27, would it mean that 2 such "idefined" functions would be the "same", meaning do the same thing the same way, given that they are defined without a "body"?
- the full abstraction would look something like an interactive proof program where you could repeatedly refine the results until it matched what the user wanted
- it started incomprehensible and then moved straight to impossible magic.
- wow...mind blown even though that went by a bit too quick.
- Hmm, I do think we could do test-driven imaginary programming tho i.e. you only define the ERT testcases and then do the rest with idefun
- So `(imacro with-clifford-algebra (p q r))` would just "work"... this does feel too magical
- I am really happy that someone is trying Deep Learning stuff *with* emacs and not just for writing Python code :D
- well I've had pretty good success with GPT-3, I think this also supports GPT-J which is I think free/libre.
- most users of GPT-3 do it via calls to a web api
- is it still invite only?
  - no, it's been opened recently



# Outline

-   5-10 minutes:
-   a 5 minute introduction to imaginary programming, followed by
    -   a demonstration of iLambda.
        -   iλ, a family of imaginary programming libraries
        <https://mullikine.github.io/posts/designing-an-imaginary-programming-ip-library-for-emacs/>

<!--20 minutes:

-   a 5 minute introduction to imaginary programming, followed by
    
    -   a 5 minute introduction and demonstration of Pen.el.
        -   <https://semiosis.github.io/pen/>
    -   a 5 minute org-babel and emacs lisp demonstration of iLambda.
        
        -   iλ, a family of imaginary programming libraries
        
        <https://mullikine.github.io/posts/designing-an-imaginary-programming-ip-library-for-emacs/>
    -   a 5 minute demonstration of ‘cterm’ (complex term) and ‘ii’
    
    (imaginary interpreter).
    
    -   <https://mullikine.github.io/posts/imaginary-real-codex-complex/>
    -   <https://semiosis.github.io/ii/>
    
    -

40 minutes:

-   a 10 minute introduction to language models, their capabilities and
    imaginary programming.
    
    -   a 10 minute introduction to creating prompts with Pen.el.
    -   a 5 minute org-babel and emacs lisp demonstration of iLambda.
        
        -   iλ, a family of imaginary programming libraries
        
        <https://mullikine.github.io/posts/designing-an-imaginary-programming-ip-library-for-emacs/>
    -   a 5 minute demonstration of ‘cterm’ (complex term) and ‘ii’
    
    (imaginary interpreter).
    
    -   <https://mullikine.github.io/posts/imaginary-real-codex-complex/>
    -   <https://semiosis.github.io/ii/>
    
    -   A 5 minute brief on examplary and advanced prompt programming.
        -   <https://semiosis.github.io/examplary/>
    -   5 minutes for Prompting Requests and Q&A

Availability

(during the conference days (Nov 27 and 28))

All hours.

How you’d like to handle questions

Live web conference

--->

IRC libertyprime at #emacs on libera

Shane Mulligan

## Links

- Pen.el Tutorial: <https://semiosis.github.io/posts/pen-el-tutorial/>

[[!inline pages="internal(2021/captions/imaginary)" raw="yes"]]

[[!inline pages="internal(2021/info/imaginary-nav)" raw="yes"]]