WEBVTT captioned by sachac, checked by sachac
NOTE Introduction
00:00:00.000 --> 00:00:03.039
Hi, my name is Abhinav and I'm going to talk about
00:00:03.040 --> 00:00:06.199
this tool that I've been working on called MatplotLLM.
00:00:06.200 --> 00:00:09.519
MatplotLLM is a natural language interface
00:00:09.520 --> 00:00:12.479
over matplotlib, which is a library I use a lot
00:00:12.480 --> 00:00:14.439
for making visualizations.
00:00:14.440 --> 00:00:18.679
It's a pretty common Python library used a lot everywhere
00:00:18.680 --> 00:00:22.479
where there's need of plotting and graphing.
00:00:22.480 --> 00:00:25.359
I usually use it in reports.
00:00:25.360 --> 00:00:27.359
Whenever I'm writing a report in org mode,
00:00:27.360 --> 00:00:31.559
I tend to write a code block which is in Python.
00:00:31.560 --> 00:00:34.079
And then that code block has usage of matplotlib
00:00:34.080 --> 00:00:35.999
to produce some reports.
00:00:36.000 --> 00:00:38.319
That works really well.
00:00:38.320 --> 00:00:39.999
But at times what happens is
00:00:40.000 --> 00:00:43.959
I have to make a very custom graph, let's say.
00:00:43.960 --> 00:00:46.919
And then while I'm writing a report,
00:00:46.920 --> 00:00:50.679
it's kind of a huge leap of abstraction
00:00:50.680 --> 00:00:51.519
when I'm working on text
00:00:51.520 --> 00:00:54.879
versus going into actual low-level matplotlib code
00:00:54.880 --> 00:00:56.239
to do that graphing.
00:00:56.240 --> 00:00:59.679
So that's something I don't want to do.
00:00:59.680 --> 00:01:00.479
Here's an example.
00:01:00.480 --> 00:01:03.999
This is a graph which is... I think it was made
00:01:04.000 --> 00:01:05.839
like five or six years back.
00:01:05.840 --> 00:01:08.399
And then there are some common things
00:01:08.400 --> 00:01:09.959
like scatter plot here,
00:01:09.960 --> 00:01:12.239
the dots that you can see here scattered.
00:01:12.240 --> 00:01:16.279
Then... But there are a few things which, to do them,
00:01:16.280 --> 00:01:19.159
to make them, you will actually have to go--at least me,
00:01:19.160 --> 00:01:20.839
I have to go to the documentation
00:01:20.840 --> 00:01:24.119
and figure out how to do it. Which is fine,
00:01:24.120 --> 00:01:26.519
but I don't want to do this, you know,
00:01:26.520 --> 00:01:29.199
spend so much time here, when I'm working on
00:01:29.200 --> 00:01:32.319
a tight deadline for a report.
00:01:32.320 --> 00:01:33.919
That's the motivation for this tool.
00:01:33.920 --> 00:01:35.199
This tool basically allows me
00:01:35.200 --> 00:01:38.479
to get rid of the complexity of the library
00:01:38.480 --> 00:01:40.719
by working via an LLM.
NOTE What is an LLM?
00:01:40.720 --> 00:01:43.399
So an LLM is a large language model.
00:01:43.400 --> 00:01:45.079
These are models which are
00:01:45.080 --> 00:01:49.399
trained to produce text, generate text.
00:01:49.400 --> 00:01:51.519
And just by doing that,
00:01:51.520 --> 00:01:55.079
they actually end up learning a lot of common patterns.
00:01:55.080 --> 00:01:56.799
For example, if you ask a question,
00:01:56.800 --> 00:01:58.919
you can actually get a reasonable response.
00:01:58.920 --> 00:02:00.759
If you ask to write a code for something,
00:02:00.760 --> 00:02:01.879
you'll actually get code
00:02:01.880 --> 00:02:04.759
which can also be very reasonable.
00:02:04.760 --> 00:02:06.599
So this tool is basically a wrapper
00:02:06.600 --> 00:02:10.999
that uses an LLM. For the current version,
00:02:11.000 --> 00:02:13.919
we use GPT-4, which is OpenAI's model.
00:02:13.920 --> 00:02:17.919
It's not open in the sense of open source.
00:02:17.920 --> 00:02:21.119
So that's a problem that it has.
00:02:21.120 --> 00:02:23.599
But for this version, we are going to use that.
NOTE Using this library
00:02:23.600 --> 00:02:25.479
Using this library is pretty simple.
00:02:25.480 --> 00:02:27.399
You basically require the library
00:02:27.400 --> 00:02:30.719
and then you set up your OpenAI API key here.
00:02:30.720 --> 00:02:33.359
Then you get a code block
00:02:33.360 --> 00:02:35.759
where you can specify the language as `matplotllm`.
00:02:35.760 --> 00:02:38.279
And then what you can do is,
00:02:38.280 --> 00:02:40.799
you can basically describe what you want
00:02:40.800 --> 00:02:41.799
in natural language.
00:02:41.800 --> 00:02:45.279
I'll take this example of this data set.
00:02:45.280 --> 00:02:48.599
It's called the Health and Wealth of Nations.
00:02:48.600 --> 00:02:49.639
I think that was
00:02:49.640 --> 00:02:51.399
the name of a visualization where it was used.
00:02:51.400 --> 00:02:53.399
This is basically life expectancy,
00:02:53.400 --> 00:02:59.279
GDP of various countries starting from 1800.
00:02:59.280 --> 00:03:02.719
I think it goes up to 2000 somewhere.
00:03:02.720 --> 00:03:07.479
So earlier, I would try to write code which reads this CSV
00:03:07.480 --> 00:03:09.839
and then does a lot of matplotlib stuff
00:03:09.840 --> 00:03:11.679
and then finally produces a graph.
00:03:11.680 --> 00:03:13.879
But with this tool, what I'll do is
00:03:13.880 --> 00:03:17.679
I'll just provide instructions in two forms.
00:03:17.680 --> 00:03:18.879
So the first thing I'll do is
00:03:18.880 --> 00:03:21.359
I'll just describe how the data looks like.
00:03:21.360 --> 00:03:29.039
So I'll say data is in a file called `data.csv`,
00:03:29.040 --> 00:03:33.159
which is this file, by the way, on the right.
00:03:33.160 --> 00:03:39.799
It looks like the following.
00:03:39.800 --> 00:03:44.359
I just pasted a few lines from the top, which is enough.
00:03:44.360 --> 00:03:47.119
Since it's a CSV, there's already a structure to it.
00:03:47.120 --> 00:03:50.079
But let's say if you have a log file
00:03:50.080 --> 00:03:53.759
where there's more complexities to be parsed and all,
00:03:53.760 --> 00:03:55.039
that also works out really well.
00:03:55.040 --> 00:03:58.079
You just have to describe how the data looks like
00:03:58.080 --> 00:04:01.159
and the system will figure out how to work with this.
00:04:01.160 --> 00:04:06.404
Now, let's do the plotting. So what I can do is...
00:04:06.405 --> 00:04:09.559
Let's start from a very basic plot
00:04:09.560 --> 00:04:11.620
between life expectancy and GDP per capita.
00:04:11.621 --> 00:04:13.800
I'll just do this.
00:04:13.801 --> 00:04:17.280
"Can you make a scatter plot
00:04:17.281 --> 00:04:26.399
for life expectancy and GDP per capita?"
00:04:26.400 --> 00:04:29.639
Now, you can see there are some typos,
00:04:29.640 --> 00:04:31.719
and probably there will be some grammatical mistakes
00:04:31.720 --> 00:04:32.919
also coming through.
00:04:32.920 --> 00:04:37.119
But that's all OK, because the models are supposed to
00:04:37.120 --> 00:04:40.559
handle those kinds of situations really well.
00:04:40.560 --> 00:04:43.239
So I send the request to the model.
00:04:43.240 --> 00:04:47.119
Since it's a large model--GPT-4 is really large--
00:04:47.120 --> 00:04:50.519
it actually takes a lot of time to get the response back.
00:04:50.520 --> 00:04:53.359
So this specific response took 17 seconds,
00:04:53.360 --> 00:04:54.239
which is huge.
00:04:54.240 --> 00:04:57.439
It's not something you would expect
00:04:57.440 --> 00:04:59.599
in a local file running on a computer.
00:04:59.600 --> 00:05:01.879
But I've got what I wanted. Right.
00:05:01.880 --> 00:05:04.119
So there's a scatter plot here, as you can see below,
00:05:04.120 --> 00:05:08.879
which is plotting what I specified it to do,
00:05:08.880 --> 00:05:11.700
though it looks a little dense.
NOTE Further instructions
00:05:11.701 --> 00:05:12.640
What I can do is
00:05:12.641 --> 00:05:16.000
I can provide further instructions as feedback.
00:05:16.001 --> 00:05:18.400
I try to feed back on this. So I can say,
00:05:18.401 --> 00:05:30.599
"Can you only show points where year is the multiple of 50?"
00:05:30.600 --> 00:05:33.519
So since it's starting from 1800, the data points,
00:05:33.520 --> 00:05:34.719
there are too many years,
00:05:34.720 --> 00:05:37.239
so I'll just try to thin them down a little.
00:05:37.240 --> 00:05:40.199
Now what's happening in the background
00:05:40.200 --> 00:05:42.719
is that everything below this last instruction
00:05:42.720 --> 00:05:45.719
is going out as the context to the model
00:05:45.720 --> 00:05:47.399
along with the code that it wrote till now.
00:05:47.400 --> 00:05:50.079
And then this instruction is added on top of it
00:05:50.080 --> 00:05:53.079
so that it basically modifies the code to make it work
00:05:53.080 --> 00:05:55.079
according to this instruction.
00:05:55.080 --> 00:05:58.439
As you can see now, the data points are much fewer.
00:05:58.440 --> 00:06:01.519
This is what I wanted also.
00:06:01.520 --> 00:06:02.799
Let's also do a few more things.
00:06:02.800 --> 00:06:05.439
I want to see the progression through time.
00:06:05.440 --> 00:06:13.079
So maybe I'll do something like, color more recent years
00:06:13.080 --> 00:06:15.439
with a darker shade of...
00:06:15.440 --> 00:06:21.719
Let's change the color map also.
00:06:21.720 --> 00:06:24.159
Now, this again goes back to the model.
00:06:24.160 --> 00:06:26.799
Again, everything below before this line
00:06:26.800 --> 00:06:29.119
is the context along with the current code,
00:06:29.120 --> 00:06:31.799
and then this instruction is going to the model
00:06:31.800 --> 00:06:37.039
to make the changes. So now this should happen, I guess.
00:06:37.040 --> 00:06:41.319
Once this happens. Yeah. So. OK.
00:06:41.320 --> 00:06:44.599
So we have this new color map,
00:06:44.600 --> 00:06:46.599
and there's also this change of color.
00:06:46.600 --> 00:06:51.719
And also there's this range of color from 1800 to 2000,
00:06:51.720 --> 00:06:53.399
which is a nice addition.
00:06:53.400 --> 00:06:55.839
Kind of smart. I didn't expect...
00:06:55.840 --> 00:06:58.959
I didn't exactly ask for it, but it's nice.
00:06:58.960 --> 00:07:00.959
So there's a couple more things.
00:07:00.960 --> 00:07:07.759
Let's make it more minimal. "Let's make it more minimal.
00:07:07.760 --> 00:07:17.319
Can you remove the bounding box?"
00:07:17.320 --> 00:07:21.399
Also, let's annotate a few points.
00:07:21.400 --> 00:07:23.719
So I want to annotate the point
00:07:23.720 --> 00:07:25.839
which has the highest GDP per capita.
00:07:25.840 --> 00:07:33.599
"Also annotate the point with highest GDP per capita
00:07:33.600 --> 00:07:36.999
with the country and year."
00:07:37.000 --> 00:07:41.599
So again, forget about the grammar.
00:07:41.600 --> 00:07:43.599
The language model works out well.
00:07:43.600 --> 00:07:46.159
Usually it takes care of
00:07:46.160 --> 00:07:47.439
all those complexities for you.
00:07:47.440 --> 00:07:53.119
This is what we have got after that.
00:07:53.120 --> 00:07:55.719
As you can see, there's the annotation, which is here.
00:07:55.720 --> 00:07:56.679
I think it's still overlapping,
00:07:56.680 --> 00:07:58.559
so probably it could be done better,
00:07:58.560 --> 00:08:00.159
but the box is removed.
NOTE Room for improvement
00:08:00.160 --> 00:08:03.359
Now, as you can see, the system is...
00:08:03.360 --> 00:08:04.879
You will be able to see this
00:08:04.880 --> 00:08:07.479
that the system is not really robust.
00:08:07.480 --> 00:08:10.079
So the GitHub repository has some examples
00:08:10.080 --> 00:08:12.119
where it fails miserably,
00:08:12.120 --> 00:08:13.679
and you'll actually have to go into the code
00:08:13.680 --> 00:08:14.999
to figure out what's happening.
00:08:15.000 --> 00:08:17.879
But we do expect that to improve slowly,
00:08:17.880 --> 00:08:21.039
because the models are improving greatly in performance.
00:08:21.040 --> 00:08:22.479
This is a very general model.
00:08:22.480 --> 00:08:24.479
This is not even tuned for this use case.
00:08:24.480 --> 00:08:26.639
The other thing is that
00:08:26.640 --> 00:08:29.639
while I was trying to provide feedback,
00:08:29.640 --> 00:08:32.199
I was still using text here all the time,
00:08:32.200 --> 00:08:34.559
but it can be made more natural.
00:08:34.560 --> 00:08:36.159
So, for example, if I have to annotate
00:08:36.160 --> 00:08:37.439
this particular point,
00:08:37.440 --> 00:08:42.239
I actually can just point my cursor to it.
00:08:42.240 --> 00:08:44.519
Emacs has a way to figure out
00:08:44.520 --> 00:08:45.799
where your mouse pointer is.
00:08:45.800 --> 00:08:49.620
And with that, you can actually go back into the code
00:08:49.621 --> 00:08:51.960
and then see which primitive
00:08:51.961 --> 00:08:54.480
is being drawn here in Matplotlib.
00:08:54.481 --> 00:08:55.719
So that there is a way to do that.
00:08:55.720 --> 00:08:58.439
And then, if you do that, then it's really nice to
00:08:58.440 --> 00:09:01.319
just be able to say
00:09:01.320 --> 00:09:04.279
put your cursor here and then say something like,
00:09:04.280 --> 00:09:04.999
"Can you make this?
00:09:05.000 --> 00:09:06.599
Can you annotate this point?"
00:09:06.600 --> 00:09:10.719
Because text is, you know... There are limitations to text.
00:09:10.720 --> 00:09:12.479
And if you're producing an image,
00:09:12.480 --> 00:09:13.959
you should be able to do that, too.
00:09:13.960 --> 00:09:16.399
So I do expect that to happen soonish.
00:09:16.400 --> 00:09:19.839
If not, from the model side, the hack that I mentioned
00:09:19.840 --> 00:09:21.359
could be made to work.
00:09:21.360 --> 00:09:24.439
So that will come in in a later version, probably.
00:09:24.440 --> 00:09:27.599
Anyway, so that's the end of my talk.
00:09:27.600 --> 00:09:29.759
You can find more details in the repository link.
00:09:29.760 --> 00:09:33.480
Thank you for listening. Goodbye.