WEBVTT captioned by sachac, checked by sachac NOTE Introduction 00:00:00.000 --> 00:00:03.039 Hi, my name is Abhinav and I'm going to talk about 00:00:03.040 --> 00:00:06.199 this tool that I've been working on called MatplotLLM. 00:00:06.200 --> 00:00:09.519 MatplotLLM is a natural language interface 00:00:09.520 --> 00:00:12.479 over matplotlib, which is a library I use a lot 00:00:12.480 --> 00:00:14.439 for making visualizations. 00:00:14.440 --> 00:00:18.679 It's a pretty common Python library used a lot everywhere 00:00:18.680 --> 00:00:22.479 where there's need of plotting and graphing. 00:00:22.480 --> 00:00:25.359 I usually use it in reports. 00:00:25.360 --> 00:00:27.359 Whenever I'm writing a report in org mode, 00:00:27.360 --> 00:00:31.559 I tend to write a code block which is in Python. 00:00:31.560 --> 00:00:34.079 And then that code block has usage of matplotlib 00:00:34.080 --> 00:00:35.999 to produce some reports. 00:00:36.000 --> 00:00:38.319 That works really well. 00:00:38.320 --> 00:00:39.999 But at times what happens is 00:00:40.000 --> 00:00:43.959 I have to make a very custom graph, let's say. 00:00:43.960 --> 00:00:46.919 And then while I'm writing a report, 00:00:46.920 --> 00:00:50.679 it's kind of a huge leap of abstraction 00:00:50.680 --> 00:00:51.519 when I'm working on text 00:00:51.520 --> 00:00:54.879 versus going into actual low-level matplotlib code 00:00:54.880 --> 00:00:56.239 to do that graphing. 00:00:56.240 --> 00:00:59.679 So that's something I don't want to do. 00:00:59.680 --> 00:01:00.479 Here's an example. 00:01:00.480 --> 00:01:03.999 This is a graph which is... I think it was made 00:01:04.000 --> 00:01:05.839 like five or six years back. 00:01:05.840 --> 00:01:08.399 And then there are some common things 00:01:08.400 --> 00:01:09.959 like scatter plot here, 00:01:09.960 --> 00:01:12.239 the dots that you can see here scattered. 00:01:12.240 --> 00:01:16.279 Then... But there are a few things which, to do them, 00:01:16.280 --> 00:01:19.159 to make them, you will actually have to go--at least me, 00:01:19.160 --> 00:01:20.839 I have to go to the documentation 00:01:20.840 --> 00:01:24.119 and figure out how to do it. Which is fine, 00:01:24.120 --> 00:01:26.519 but I don't want to do this, you know, 00:01:26.520 --> 00:01:29.199 spend so much time here, when I'm working on 00:01:29.200 --> 00:01:32.319 a tight deadline for a report. 00:01:32.320 --> 00:01:33.919 That's the motivation for this tool. 00:01:33.920 --> 00:01:35.199 This tool basically allows me 00:01:35.200 --> 00:01:38.479 to get rid of the complexity of the library 00:01:38.480 --> 00:01:40.719 by working via an LLM. NOTE What is an LLM? 00:01:40.720 --> 00:01:43.399 So an LLM is a large language model. 00:01:43.400 --> 00:01:45.079 These are models which are 00:01:45.080 --> 00:01:49.399 trained to produce text, generate text. 00:01:49.400 --> 00:01:51.519 And just by doing that, 00:01:51.520 --> 00:01:55.079 they actually end up learning a lot of common patterns. 00:01:55.080 --> 00:01:56.799 For example, if you ask a question, 00:01:56.800 --> 00:01:58.919 you can actually get a reasonable response. 00:01:58.920 --> 00:02:00.759 If you ask to write a code for something, 00:02:00.760 --> 00:02:01.879 you'll actually get code 00:02:01.880 --> 00:02:04.759 which can also be very reasonable. 00:02:04.760 --> 00:02:06.599 So this tool is basically a wrapper 00:02:06.600 --> 00:02:10.999 that uses an LLM. For the current version, 00:02:11.000 --> 00:02:13.919 we use GPT-4, which is OpenAI's model. 00:02:13.920 --> 00:02:17.919 It's not open in the sense of open source. 00:02:17.920 --> 00:02:21.119 So that's a problem that it has. 00:02:21.120 --> 00:02:23.599 But for this version, we are going to use that. NOTE Using this library 00:02:23.600 --> 00:02:25.479 Using this library is pretty simple. 00:02:25.480 --> 00:02:27.399 You basically require the library 00:02:27.400 --> 00:02:30.719 and then you set up your OpenAI API key here. 00:02:30.720 --> 00:02:33.359 Then you get a code block 00:02:33.360 --> 00:02:35.759 where you can specify the language as `matplotllm`. 00:02:35.760 --> 00:02:38.279 And then what you can do is, 00:02:38.280 --> 00:02:40.799 you can basically describe what you want 00:02:40.800 --> 00:02:41.799 in natural language. 00:02:41.800 --> 00:02:45.279 I'll take this example of this data set. 00:02:45.280 --> 00:02:48.599 It's called the Health and Wealth of Nations. 00:02:48.600 --> 00:02:49.639 I think that was 00:02:49.640 --> 00:02:51.399 the name of a visualization where it was used. 00:02:51.400 --> 00:02:53.399 This is basically life expectancy, 00:02:53.400 --> 00:02:59.279 GDP of various countries starting from 1800. 00:02:59.280 --> 00:03:02.719 I think it goes up to 2000 somewhere. 00:03:02.720 --> 00:03:07.479 So earlier, I would try to write code which reads this CSV 00:03:07.480 --> 00:03:09.839 and then does a lot of matplotlib stuff 00:03:09.840 --> 00:03:11.679 and then finally produces a graph. 00:03:11.680 --> 00:03:13.879 But with this tool, what I'll do is 00:03:13.880 --> 00:03:17.679 I'll just provide instructions in two forms. 00:03:17.680 --> 00:03:18.879 So the first thing I'll do is 00:03:18.880 --> 00:03:21.359 I'll just describe how the data looks like. 00:03:21.360 --> 00:03:29.039 So I'll say data is in a file called `data.csv`, 00:03:29.040 --> 00:03:33.159 which is this file, by the way, on the right. 00:03:33.160 --> 00:03:39.799 It looks like the following. 00:03:39.800 --> 00:03:44.359 I just pasted a few lines from the top, which is enough. 00:03:44.360 --> 00:03:47.119 Since it's a CSV, there's already a structure to it. 00:03:47.120 --> 00:03:50.079 But let's say if you have a log file 00:03:50.080 --> 00:03:53.759 where there's more complexities to be parsed and all, 00:03:53.760 --> 00:03:55.039 that also works out really well. 00:03:55.040 --> 00:03:58.079 You just have to describe how the data looks like 00:03:58.080 --> 00:04:01.159 and the system will figure out how to work with this. 00:04:01.160 --> 00:04:06.404 Now, let's do the plotting. So what I can do is... 00:04:06.405 --> 00:04:09.559 Let's start from a very basic plot 00:04:09.560 --> 00:04:11.620 between life expectancy and GDP per capita. 00:04:11.621 --> 00:04:13.800 I'll just do this. 00:04:13.801 --> 00:04:17.280 "Can you make a scatter plot 00:04:17.281 --> 00:04:26.399 for life expectancy and GDP per capita?" 00:04:26.400 --> 00:04:29.639 Now, you can see there are some typos, 00:04:29.640 --> 00:04:31.719 and probably there will be some grammatical mistakes 00:04:31.720 --> 00:04:32.919 also coming through. 00:04:32.920 --> 00:04:37.119 But that's all OK, because the models are supposed to 00:04:37.120 --> 00:04:40.559 handle those kinds of situations really well. 00:04:40.560 --> 00:04:43.239 So I send the request to the model. 00:04:43.240 --> 00:04:47.119 Since it's a large model--GPT-4 is really large-- 00:04:47.120 --> 00:04:50.519 it actually takes a lot of time to get the response back. 00:04:50.520 --> 00:04:53.359 So this specific response took 17 seconds, 00:04:53.360 --> 00:04:54.239 which is huge. 00:04:54.240 --> 00:04:57.439 It's not something you would expect 00:04:57.440 --> 00:04:59.599 in a local file running on a computer. 00:04:59.600 --> 00:05:01.879 But I've got what I wanted. Right. 00:05:01.880 --> 00:05:04.119 So there's a scatter plot here, as you can see below, 00:05:04.120 --> 00:05:08.879 which is plotting what I specified it to do, 00:05:08.880 --> 00:05:11.700 though it looks a little dense. NOTE Further instructions 00:05:11.701 --> 00:05:12.640 What I can do is 00:05:12.641 --> 00:05:16.000 I can provide further instructions as feedback. 00:05:16.001 --> 00:05:18.400 I try to feed back on this. So I can say, 00:05:18.401 --> 00:05:30.599 "Can you only show points where year is the multiple of 50?" 00:05:30.600 --> 00:05:33.519 So since it's starting from 1800, the data points, 00:05:33.520 --> 00:05:34.719 there are too many years, 00:05:34.720 --> 00:05:37.239 so I'll just try to thin them down a little. 00:05:37.240 --> 00:05:40.199 Now what's happening in the background 00:05:40.200 --> 00:05:42.719 is that everything below this last instruction 00:05:42.720 --> 00:05:45.719 is going out as the context to the model 00:05:45.720 --> 00:05:47.399 along with the code that it wrote till now. 00:05:47.400 --> 00:05:50.079 And then this instruction is added on top of it 00:05:50.080 --> 00:05:53.079 so that it basically modifies the code to make it work 00:05:53.080 --> 00:05:55.079 according to this instruction. 00:05:55.080 --> 00:05:58.439 As you can see now, the data points are much fewer. 00:05:58.440 --> 00:06:01.519 This is what I wanted also. 00:06:01.520 --> 00:06:02.799 Let's also do a few more things. 00:06:02.800 --> 00:06:05.439 I want to see the progression through time. 00:06:05.440 --> 00:06:13.079 So maybe I'll do something like, color more recent years 00:06:13.080 --> 00:06:15.439 with a darker shade of... 00:06:15.440 --> 00:06:21.719 Let's change the color map also. 00:06:21.720 --> 00:06:24.159 Now, this again goes back to the model. 00:06:24.160 --> 00:06:26.799 Again, everything below before this line 00:06:26.800 --> 00:06:29.119 is the context along with the current code, 00:06:29.120 --> 00:06:31.799 and then this instruction is going to the model 00:06:31.800 --> 00:06:37.039 to make the changes. So now this should happen, I guess. 00:06:37.040 --> 00:06:41.319 Once this happens. Yeah. So. OK. 00:06:41.320 --> 00:06:44.599 So we have this new color map, 00:06:44.600 --> 00:06:46.599 and there's also this change of color. 00:06:46.600 --> 00:06:51.719 And also there's this range of color from 1800 to 2000, 00:06:51.720 --> 00:06:53.399 which is a nice addition. 00:06:53.400 --> 00:06:55.839 Kind of smart. I didn't expect... 00:06:55.840 --> 00:06:58.959 I didn't exactly ask for it, but it's nice. 00:06:58.960 --> 00:07:00.959 So there's a couple more things. 00:07:00.960 --> 00:07:07.759 Let's make it more minimal. "Let's make it more minimal. 00:07:07.760 --> 00:07:17.319 Can you remove the bounding box?" 00:07:17.320 --> 00:07:21.399 Also, let's annotate a few points. 00:07:21.400 --> 00:07:23.719 So I want to annotate the point 00:07:23.720 --> 00:07:25.839 which has the highest GDP per capita. 00:07:25.840 --> 00:07:33.599 "Also annotate the point with highest GDP per capita 00:07:33.600 --> 00:07:36.999 with the country and year." 00:07:37.000 --> 00:07:41.599 So again, forget about the grammar. 00:07:41.600 --> 00:07:43.599 The language model works out well. 00:07:43.600 --> 00:07:46.159 Usually it takes care of 00:07:46.160 --> 00:07:47.439 all those complexities for you. 00:07:47.440 --> 00:07:53.119 This is what we have got after that. 00:07:53.120 --> 00:07:55.719 As you can see, there's the annotation, which is here. 00:07:55.720 --> 00:07:56.679 I think it's still overlapping, 00:07:56.680 --> 00:07:58.559 so probably it could be done better, 00:07:58.560 --> 00:08:00.159 but the box is removed. NOTE Room for improvement 00:08:00.160 --> 00:08:03.359 Now, as you can see, the system is... 00:08:03.360 --> 00:08:04.879 You will be able to see this 00:08:04.880 --> 00:08:07.479 that the system is not really robust. 00:08:07.480 --> 00:08:10.079 So the GitHub repository has some examples 00:08:10.080 --> 00:08:12.119 where it fails miserably, 00:08:12.120 --> 00:08:13.679 and you'll actually have to go into the code 00:08:13.680 --> 00:08:14.999 to figure out what's happening. 00:08:15.000 --> 00:08:17.879 But we do expect that to improve slowly, 00:08:17.880 --> 00:08:21.039 because the models are improving greatly in performance. 00:08:21.040 --> 00:08:22.479 This is a very general model. 00:08:22.480 --> 00:08:24.479 This is not even tuned for this use case. 00:08:24.480 --> 00:08:26.639 The other thing is that 00:08:26.640 --> 00:08:29.639 while I was trying to provide feedback, 00:08:29.640 --> 00:08:32.199 I was still using text here all the time, 00:08:32.200 --> 00:08:34.559 but it can be made more natural. 00:08:34.560 --> 00:08:36.159 So, for example, if I have to annotate 00:08:36.160 --> 00:08:37.439 this particular point, 00:08:37.440 --> 00:08:42.239 I actually can just point my cursor to it. 00:08:42.240 --> 00:08:44.519 Emacs has a way to figure out 00:08:44.520 --> 00:08:45.799 where your mouse pointer is. 00:08:45.800 --> 00:08:49.620 And with that, you can actually go back into the code 00:08:49.621 --> 00:08:51.960 and then see which primitive 00:08:51.961 --> 00:08:54.480 is being drawn here in Matplotlib. 00:08:54.481 --> 00:08:55.719 So that there is a way to do that. 00:08:55.720 --> 00:08:58.439 And then, if you do that, then it's really nice to 00:08:58.440 --> 00:09:01.319 just be able to say 00:09:01.320 --> 00:09:04.279 put your cursor here and then say something like, 00:09:04.280 --> 00:09:04.999 "Can you make this? 00:09:05.000 --> 00:09:06.599 Can you annotate this point?" 00:09:06.600 --> 00:09:10.719 Because text is, you know... There are limitations to text. 00:09:10.720 --> 00:09:12.479 And if you're producing an image, 00:09:12.480 --> 00:09:13.959 you should be able to do that, too. 00:09:13.960 --> 00:09:16.399 So I do expect that to happen soonish. 00:09:16.400 --> 00:09:19.839 If not, from the model side, the hack that I mentioned 00:09:19.840 --> 00:09:21.359 could be made to work. 00:09:21.360 --> 00:09:24.439 So that will come in in a later version, probably. 00:09:24.440 --> 00:09:27.599 Anyway, so that's the end of my talk. 00:09:27.600 --> 00:09:29.759 You can find more details in the repository link. 00:09:29.760 --> 00:09:33.480 Thank you for listening. Goodbye.