summaryrefslogtreecommitdiffstats
path: root/2023/captions/emacsconf-2023-matplotllm--matplotllm-iterative-natural-language-data-visualization-in-orgbabel--abhinav-tushar--main.vtt
blob: a01ffd802745c412a7649c682f6a7abb4dcc9e12 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
WEBVTT captioned by sachac, checked by sachac

NOTE Introduction

00:00:00.000 --> 00:00:03.039
Hi, my name is Abhinav and I'm going to talk about

00:00:03.040 --> 00:00:06.199
this tool that I've been working on called MatplotLLM.

00:00:06.200 --> 00:00:09.519
MatplotLLM is a natural language interface

00:00:09.520 --> 00:00:12.479
over matplotlib, which is a library I use a lot

00:00:12.480 --> 00:00:14.439
for making visualizations.

00:00:14.440 --> 00:00:18.679
It's a pretty common Python library used a lot everywhere

00:00:18.680 --> 00:00:22.479
where there's need of plotting and graphing.

00:00:22.480 --> 00:00:25.359
I usually use it in reports.

00:00:25.360 --> 00:00:27.359
Whenever I'm writing a report in org mode,

00:00:27.360 --> 00:00:31.559
I tend to write a code block which is in Python.

00:00:31.560 --> 00:00:34.079
And then that code block has usage of matplotlib

00:00:34.080 --> 00:00:35.999
to produce some reports.

00:00:36.000 --> 00:00:38.319
That works really well.

00:00:38.320 --> 00:00:39.999
But at times what happens is

00:00:40.000 --> 00:00:43.959
I have to make a very custom graph, let's say.

00:00:43.960 --> 00:00:46.919
And then while I'm writing a report,

00:00:46.920 --> 00:00:50.679
it's kind of a huge leap of abstraction

00:00:50.680 --> 00:00:51.519
when I'm working on text

00:00:51.520 --> 00:00:54.879
versus going into actual low-level matplotlib code

00:00:54.880 --> 00:00:56.239
to do that graphing.

00:00:56.240 --> 00:00:59.679
So that's something I don't want to do.

00:00:59.680 --> 00:01:00.479
Here's an example.

00:01:00.480 --> 00:01:03.999
This is a graph which is... I think it was made

00:01:04.000 --> 00:01:05.839
like five or six years back.

00:01:05.840 --> 00:01:08.399
And then there are some common things

00:01:08.400 --> 00:01:09.959
like scatter plot here,

00:01:09.960 --> 00:01:12.239
the dots that you can see here scattered.

00:01:12.240 --> 00:01:16.279
Then... But there are a few things which, to do them,

00:01:16.280 --> 00:01:19.159
to make them, you will actually have to go--at least me,

00:01:19.160 --> 00:01:20.839
I have to go to the documentation

00:01:20.840 --> 00:01:24.119
and figure out how to do it. Which is fine,

00:01:24.120 --> 00:01:26.519
but I don't want to do this, you know,

00:01:26.520 --> 00:01:29.199
spend so much time here, when I'm working on

00:01:29.200 --> 00:01:32.319
a tight deadline for a report.

00:01:32.320 --> 00:01:33.919
That's the motivation for this tool.

00:01:33.920 --> 00:01:35.199
This tool basically allows me

00:01:35.200 --> 00:01:38.479
to get rid of the complexity of the library

00:01:38.480 --> 00:01:40.719
by working via an LLM.

NOTE What is an LLM?

00:01:40.720 --> 00:01:43.399
So an LLM is a large language model.

00:01:43.400 --> 00:01:45.079
These are models which are

00:01:45.080 --> 00:01:49.399
trained to produce text, generate text.

00:01:49.400 --> 00:01:51.519
And just by doing that,

00:01:51.520 --> 00:01:55.079
they actually end up learning a lot of common patterns.

00:01:55.080 --> 00:01:56.799
For example, if you ask a question,

00:01:56.800 --> 00:01:58.919
you can actually get a reasonable response.

00:01:58.920 --> 00:02:00.759
If you ask to write a code for something,

00:02:00.760 --> 00:02:01.879
you'll actually get code

00:02:01.880 --> 00:02:04.759
which can also be very reasonable.

00:02:04.760 --> 00:02:06.599
So this tool is basically a wrapper

00:02:06.600 --> 00:02:10.999
that uses an LLM. For the current version,

00:02:11.000 --> 00:02:13.919
we use GPT-4, which is OpenAI's model.

00:02:13.920 --> 00:02:17.919
It's not open in the sense of open source.

00:02:17.920 --> 00:02:21.119
So that's a problem that it has.

00:02:21.120 --> 00:02:23.599
But for this version, we are going to use that.

NOTE Using this library

00:02:23.600 --> 00:02:25.479
Using this library is pretty simple.

00:02:25.480 --> 00:02:27.399
You basically require the library

00:02:27.400 --> 00:02:30.719
and then you set up your OpenAI API key here.

00:02:30.720 --> 00:02:33.359
Then you get a code block

00:02:33.360 --> 00:02:35.759
where you can specify the language as `matplotllm`.

00:02:35.760 --> 00:02:38.279
And then what you can do is,

00:02:38.280 --> 00:02:40.799
you can basically describe what you want

00:02:40.800 --> 00:02:41.799
in natural language.

00:02:41.800 --> 00:02:45.279
I'll take this example of this data set.

00:02:45.280 --> 00:02:48.599
It's called the Health and Wealth of Nations.

00:02:48.600 --> 00:02:49.639
I think that was

00:02:49.640 --> 00:02:51.399
the name of a visualization where it was used.

00:02:51.400 --> 00:02:53.399
This is basically life expectancy,

00:02:53.400 --> 00:02:59.279
GDP of various countries starting from 1800.

00:02:59.280 --> 00:03:02.719
I think it goes up to 2000 somewhere.

00:03:02.720 --> 00:03:07.479
So earlier, I would try to write code which reads this CSV

00:03:07.480 --> 00:03:09.839
and then does a lot of matplotlib stuff

00:03:09.840 --> 00:03:11.679
and then finally produces a graph.

00:03:11.680 --> 00:03:13.879
But with this tool, what I'll do is

00:03:13.880 --> 00:03:17.679
I'll just provide instructions in two forms.

00:03:17.680 --> 00:03:18.879
So the first thing I'll do is

00:03:18.880 --> 00:03:21.359
I'll just describe how the data looks like.

00:03:21.360 --> 00:03:29.039
So I'll say data is in a file called `data.csv`,

00:03:29.040 --> 00:03:33.159
which is this file, by the way, on the right.

00:03:33.160 --> 00:03:39.799
It looks like the following.

00:03:39.800 --> 00:03:44.359
I just pasted a few lines from the top, which is enough.

00:03:44.360 --> 00:03:47.119
Since it's a CSV, there's already a structure to it.

00:03:47.120 --> 00:03:50.079
But let's say if you have a log file

00:03:50.080 --> 00:03:53.759
where there's more complexities to be parsed and all,

00:03:53.760 --> 00:03:55.039
that also works out really well.

00:03:55.040 --> 00:03:58.079
You just have to describe how the data looks like

00:03:58.080 --> 00:04:01.159
and the system will figure out how to work with this.

00:04:01.160 --> 00:04:06.404
Now, let's do the plotting. So what I can do is...

00:04:06.405 --> 00:04:09.559
Let's start from a very basic plot

00:04:09.560 --> 00:04:11.620
between life expectancy and GDP per capita.

00:04:11.621 --> 00:04:13.800
I'll just do this.

00:04:13.801 --> 00:04:17.280
"Can you make a scatter plot

00:04:17.281 --> 00:04:26.399
for life expectancy and GDP per capita?"

00:04:26.400 --> 00:04:29.639
Now, you can see there are some typos,

00:04:29.640 --> 00:04:31.719
and probably there will be some grammatical mistakes

00:04:31.720 --> 00:04:32.919
also coming through.

00:04:32.920 --> 00:04:37.119
But that's all OK, because the models are supposed to

00:04:37.120 --> 00:04:40.559
handle those kinds of situations really well.

00:04:40.560 --> 00:04:43.239
So I send the request to the model.

00:04:43.240 --> 00:04:47.119
Since it's a large model--GPT-4 is really large--

00:04:47.120 --> 00:04:50.519
it actually takes a lot of time to get the response back.

00:04:50.520 --> 00:04:53.359
So this specific response took 17 seconds,

00:04:53.360 --> 00:04:54.239
which is huge.

00:04:54.240 --> 00:04:57.439
It's not something you would expect

00:04:57.440 --> 00:04:59.599
in a local file running on a computer.

00:04:59.600 --> 00:05:01.879
But I've got what I wanted. Right.

00:05:01.880 --> 00:05:04.119
So there's a scatter plot here, as you can see below,

00:05:04.120 --> 00:05:08.879
which is plotting what I specified it to do,

00:05:08.880 --> 00:05:11.700
though it looks a little dense.

NOTE Further instructions

00:05:11.701 --> 00:05:12.640
What I can do is

00:05:12.641 --> 00:05:16.000
I can provide further instructions as feedback.

00:05:16.001 --> 00:05:18.400
I try to feed back on this. So I can say,

00:05:18.401 --> 00:05:30.599
"Can you only show points where year is the multiple of 50?"

00:05:30.600 --> 00:05:33.519
So since it's starting from 1800, the data points,

00:05:33.520 --> 00:05:34.719
there are too many years,

00:05:34.720 --> 00:05:37.239
so I'll just try to thin them down a little.

00:05:37.240 --> 00:05:40.199
Now what's happening in the background

00:05:40.200 --> 00:05:42.719
is that everything below this last instruction

00:05:42.720 --> 00:05:45.719
is going out as the context to the model

00:05:45.720 --> 00:05:47.399
along with the code that it wrote till now.

00:05:47.400 --> 00:05:50.079
And then this instruction is added on top of it

00:05:50.080 --> 00:05:53.079
so that it basically modifies the code to make it work

00:05:53.080 --> 00:05:55.079
according to this instruction.

00:05:55.080 --> 00:05:58.439
As you can see now, the data points are much fewer.

00:05:58.440 --> 00:06:01.519
This is what I wanted also.

00:06:01.520 --> 00:06:02.799
Let's also do a few more things.

00:06:02.800 --> 00:06:05.439
I want to see the progression through time.

00:06:05.440 --> 00:06:13.079
So maybe I'll do something like, color more recent years

00:06:13.080 --> 00:06:15.439
with a darker shade of...

00:06:15.440 --> 00:06:21.719
Let's change the color map also.

00:06:21.720 --> 00:06:24.159
Now, this again goes back to the model.

00:06:24.160 --> 00:06:26.799
Again, everything below before this line

00:06:26.800 --> 00:06:29.119
is the context along with the current code,

00:06:29.120 --> 00:06:31.799
and then this instruction is going to the model

00:06:31.800 --> 00:06:37.039
to make the changes. So now this should happen, I guess.

00:06:37.040 --> 00:06:41.319
Once this happens. Yeah. So. OK.

00:06:41.320 --> 00:06:44.599
So we have this new color map,

00:06:44.600 --> 00:06:46.599
and there's also this change of color.

00:06:46.600 --> 00:06:51.719
And also there's this range of color from 1800 to 2000,

00:06:51.720 --> 00:06:53.399
which is a nice addition.

00:06:53.400 --> 00:06:55.839
Kind of smart. I didn't expect...

00:06:55.840 --> 00:06:58.959
I didn't exactly ask for it, but it's nice.

00:06:58.960 --> 00:07:00.959
So there's a couple more things.

00:07:00.960 --> 00:07:07.759
Let's make it more minimal. "Let's make it more minimal.

00:07:07.760 --> 00:07:17.319
Can you remove the bounding box?"

00:07:17.320 --> 00:07:21.399
Also, let's annotate a few points.

00:07:21.400 --> 00:07:23.719
So I want to annotate the point

00:07:23.720 --> 00:07:25.839
which has the highest GDP per capita.

00:07:25.840 --> 00:07:33.599
"Also annotate the point with highest GDP per capita

00:07:33.600 --> 00:07:36.999
with the country and year."

00:07:37.000 --> 00:07:41.599
So again, forget about the grammar.

00:07:41.600 --> 00:07:43.599
The language model works out well.

00:07:43.600 --> 00:07:46.159
Usually it takes care of

00:07:46.160 --> 00:07:47.439
all those complexities for you.

00:07:47.440 --> 00:07:53.119
This is what we have got after that.

00:07:53.120 --> 00:07:55.719
As you can see, there's the annotation, which is here.

00:07:55.720 --> 00:07:56.679
I think it's still overlapping,

00:07:56.680 --> 00:07:58.559
so probably it could be done better,

00:07:58.560 --> 00:08:00.159
but the box is removed.

NOTE Room for improvement

00:08:00.160 --> 00:08:03.359
Now, as you can see, the system is...

00:08:03.360 --> 00:08:04.879
You will be able to see this

00:08:04.880 --> 00:08:07.479
that the system is not really robust.

00:08:07.480 --> 00:08:10.079
So the GitHub repository has some examples

00:08:10.080 --> 00:08:12.119
where it fails miserably,

00:08:12.120 --> 00:08:13.679
and you'll actually have to go into the code

00:08:13.680 --> 00:08:14.999
to figure out what's happening.

00:08:15.000 --> 00:08:17.879
But we do expect that to improve slowly,

00:08:17.880 --> 00:08:21.039
because the models are improving greatly in performance.

00:08:21.040 --> 00:08:22.479
This is a very general model.

00:08:22.480 --> 00:08:24.479
This is not even tuned for this use case.

00:08:24.480 --> 00:08:26.639
The other thing is that

00:08:26.640 --> 00:08:29.639
while I was trying to provide feedback,

00:08:29.640 --> 00:08:32.199
I was still using text here all the time,

00:08:32.200 --> 00:08:34.559
but it can be made more natural.

00:08:34.560 --> 00:08:36.159
So, for example, if I have to annotate

00:08:36.160 --> 00:08:37.439
this particular point,

00:08:37.440 --> 00:08:42.239
I actually can just point my cursor to it.

00:08:42.240 --> 00:08:44.519
Emacs has a way to figure out

00:08:44.520 --> 00:08:45.799
where your mouse pointer is.

00:08:45.800 --> 00:08:49.620
And with that, you can actually go back into the code

00:08:49.621 --> 00:08:51.960
and then see which primitive

00:08:51.961 --> 00:08:54.480
is being drawn here in Matplotlib.

00:08:54.481 --> 00:08:55.719
So that there is a way to do that.

00:08:55.720 --> 00:08:58.439
And then, if you do that, then it's really nice to

00:08:58.440 --> 00:09:01.319
just be able to say

00:09:01.320 --> 00:09:04.279
put your cursor here and then say something like,

00:09:04.280 --> 00:09:04.999
"Can you make this?

00:09:05.000 --> 00:09:06.599
Can you annotate this point?"

00:09:06.600 --> 00:09:10.719
Because text is, you know... There are limitations to text.

00:09:10.720 --> 00:09:12.479
And if you're producing an image,

00:09:12.480 --> 00:09:13.959
you should be able to do that, too.

00:09:13.960 --> 00:09:16.399
So I do expect that to happen soonish.

00:09:16.400 --> 00:09:19.839
If not, from the model side, the hack that I mentioned

00:09:19.840 --> 00:09:21.359
could be made to work.

00:09:21.360 --> 00:09:24.439
So that will come in in a later version, probably.

00:09:24.440 --> 00:09:27.599
Anyway, so that's the end of my talk.

00:09:27.600 --> 00:09:29.759
You can find more details in the repository link.

00:09:29.760 --> 00:09:33.480
Thank you for listening. Goodbye.