Diffstat (limited to '2025/captions/emacsconf-2025-python--interactive-python-programming-in-emacs--david-vujic--main.vtt')
| -rw-r--r-- | 2025/captions/emacsconf-2025-python--interactive-python-programming-in-emacs--david-vujic--main.vtt | 731 |
1 files changed, 731 insertions, 0 deletions
diff --git a/2025/captions/emacsconf-2025-python--interactive-python-programming-in-emacs--david-vujic--main.vtt b/2025/captions/emacsconf-2025-python--interactive-python-programming-in-emacs--david-vujic--main.vtt new file mode 100644 index 00000000..d63a36c8 --- /dev/null +++ b/2025/captions/emacsconf-2025-python--interactive-python-programming-in-emacs--david-vujic--main.vtt @@ -0,0 +1,731 @@ +WEBVTT captioned by sachac + +00:00:00.000 --> 00:00:04.439 +Okay, so welcome to this session about interactive Python + +00:00:04.440 --> 00:00:09.679 +programming. My name is David Vujic and I live and work in + +00:00:09.680 --> 00:00:15.319 +Stockholm, Sweden. I'm a developer and today I focus + +00:00:15.320 --> 00:00:20.439 +mainly on Python software development. So I do this at work + +00:00:20.440 --> 00:00:25.999 +and I also do this in my spare time in my open source projects. + +00:00:26.000 --> 00:00:30.479 +Before that, I've been part of the Lisp community. I've + +00:00:30.480 --> 00:00:33.700 +been a Clojure developer, and also, like, way back, + +00:00:33.701 --> 00:00:40.279 +I was in the Microsoft world and developed C# and .NET stuff. + +00:00:40.280 --> 00:00:45.999 +What I've been doing lately is to try to improve the + +00:00:46.000 --> 00:00:52.399 +developer experience when you write Python code. So what I + +00:00:52.400 --> 00:00:56.159 +want to talk about is this, but also I want to begin with + +00:00:56.160 --> 00:01:00.839 +feedback loops because I think it's very related to this + +00:01:00.840 --> 00:01:05.359 +interactive programming style, like having this nice + +00:01:05.360 --> 00:01:07.067 +feedback when you write code. + +00:01:07.068 --> 00:01:10.533 +So I'm going to begin with that. + +NOTE Feedback loops + +00:01:10.534 --> 00:01:14.199 +So this image, you know, this circle is supposed to be a + +00:01:14.200 --> 00:01:19.879 +visualization of a feedback loop.
Let's say we write our + +00:01:19.880 --> 00:01:25.239 +code and then we deploy it to production. Then when it's + +00:01:25.240 --> 00:01:29.639 +running there, we can check if things work, or if maybe someone + +00:01:29.640 --> 00:01:35.319 +else will let us know. Maybe our customers will let us know. + +00:01:35.320 --> 00:01:39.639 +That's a pretty slow feedback loop with potential risks of + +00:01:39.640 --> 00:01:41.867 +damaging your business or whatever. + +00:01:41.868 --> 00:01:44.167 +This is obvious, of course. + +00:01:44.168 --> 00:01:50.000 +So a faster feedback loop probably is to have + +00:01:50.001 --> 00:01:54.066 +some kind of automation when you do commits + +00:01:54.067 --> 00:01:59.733 +or maybe you have this pull request things and even reviews. + +00:01:59.734 --> 00:02:02.933 +So maybe not always as fast as deploy, + +00:02:02.934 --> 00:02:05.839 +don't deploy directly to production, but + +00:02:05.840 --> 00:02:10.539 +it's probably safer and often you get this automated + +00:02:10.540 --> 00:02:16.199 +feedback faster anyway. But it's still kind of slow. You + +00:02:16.200 --> 00:02:20.239 +have to wait. You have to push things to GitHub maybe and + +00:02:20.240 --> 00:02:24.279 +wait. So there's faster ways for sure to get feedback. + +00:02:24.280 --> 00:02:27.967 +So a much faster way is to write code, + +00:02:27.968 --> 00:02:31.367 +and write some unit tests, and run those unit tests. + +00:02:31.368 --> 00:02:33.467 +So then you do everything on your local machine + +00:02:33.468 --> 00:02:39.039 +and you will fairly quickly learn if your code does + +00:02:39.040 --> 00:02:47.159 +what you think it does or if it doesn't. I want to zoom in to + +00:02:47.160 --> 00:02:55.999 +this test write code and test flow a bit. Let's do that. + +NOTE Test-driven development + +00:02:56.000 --> 00:02:59.759 +As a developer, I have used a thing called test-driven + +00:02:59.760 --> 00:03:05.999 +development for quite some time. 
I find that this way of + +00:03:06.000 --> 00:03:11.259 +working is very fast when it comes to getting feedback on + +00:03:11.260 --> 00:03:14.519 +what your code does and how you should continue the + +00:03:14.520 --> 00:03:19.980 +development. So, test-driven development is + +00:03:19.981 --> 00:03:24.220 +basically that you start writing a test for + +00:03:24.221 --> 00:03:27.020 +something that you want to develop, and then you continue + +00:03:27.021 --> 00:03:31.019 +developing that, and then you go back to the test, and modify + +00:03:31.020 --> 00:03:35.079 +and modify the code, and you go back and forth between the + +00:03:35.080 --> 00:03:36.959 +tests and the code. + +00:03:36.960 --> 00:03:44.419 +It's sort of like a ping-pong game. I find this very + +00:03:44.420 --> 00:03:50.519 +effective when you want to get feedback and to know how to + +00:03:50.520 --> 00:03:57.233 +continue the development. The most important thing + +00:03:57.234 --> 00:04:01.700 +that I feel is that you know what the code does. + +00:04:01.701 --> 00:04:05.559 +You learn very quickly. + +NOTE REPL-driven development + +00:04:05.560 --> 00:04:12.199 +Let's zoom into this TDD flow a little bit. The last couple of + +00:04:12.200 --> 00:04:17.379 +years, I've been doing a slightly different thing which is + +00:04:17.380 --> 00:04:21.979 +called REPL-driven development. REPL-driven + +00:04:21.980 --> 00:04:25.719 +development is very similar to test-driven development, + +00:04:25.720 --> 00:04:31.159 +but I find it even quicker. You get feedback even quicker + +00:04:31.160 --> 00:04:34.979 +than with a regular TDD setup. So REPL-driven development + +00:04:34.980 --> 00:04:41.199 +is about writing and evaluating code in a REPL basically.
+ +00:04:41.200 --> 00:04:46.839 +And you can do experiments and you can refactor and + +00:04:46.840 --> 00:04:51.699 +re-evaluate and you get instant feedback on what the code + +00:04:51.700 --> 00:04:54.799 +does and what you need to change. So I think that's even + +00:04:54.800 --> 00:04:59.519 +faster than test-driven development. + +00:04:59.520 --> 00:05:02.899 +Okay, REPL driven development. Let's go back. What's the + +00:05:02.900 --> 00:05:10.759 +REPL? Most developers know what a REPL is. The most common + +00:05:10.760 --> 00:05:16.399 +setup is you open this shell and you use the REPL for your + +00:05:16.400 --> 00:05:19.359 +programming language. In this case I'm using the Python + +00:05:19.360 --> 00:05:25.619 +REPL or the IPython REPL which is an enhanced REPL for Python + +00:05:25.620 --> 00:05:30.679 +development. So what happens here is that we start a REPL + +00:05:30.680 --> 00:05:34.919 +session in isolation. So this session knows about the + +00:05:34.920 --> 00:05:38.119 +Python environment. So it knows about the Python language + +00:05:38.120 --> 00:05:42.359 +basically. So as soon as we start writing things, adding + +00:05:42.360 --> 00:05:47.359 +variables or writing functions or even doing + +00:05:47.360 --> 00:05:51.679 +imports, then the session will be more and more aware of the + +00:05:51.680 --> 00:05:55.819 +code so we will add things to the session and then that + +00:05:55.820 --> 00:06:00.519 +means that we can run functions we can print out these + +00:06:00.520 --> 00:06:05.859 +variables and things like that. But with REPL driven + +00:06:05.860 --> 00:06:09.839 +development it's not really that well at least not what I + +00:06:09.840 --> 00:06:14.039 +mean with REPL driven development.
So what I'm thinking of + +00:06:14.040 --> 00:06:19.639 +is that you are in your code editor where you have your + +00:06:19.640 --> 00:06:22.799 +autocomplete, and you have your syntax highlighting and + +00:06:22.800 --> 00:06:30.459 +your favorite theme, color theme, and all of those things. But + +00:06:30.460 --> 00:06:34.979 +instead, you have this running REPL in the background or in a + +00:06:34.980 --> 00:06:41.139 +smaller window or buffer. So that means that you write code + +00:06:41.140 --> 00:06:45.319 +and you can send that code to the running REPL, to the REPL + +00:06:45.320 --> 00:06:50.399 +session. You write and do everything as you would do when + +00:06:50.400 --> 00:06:55.219 +writing your code basically. In this case, in this + +00:06:55.220 --> 00:07:00.599 +example, I have evaluated these two functions. I've sent + +00:07:00.600 --> 00:07:05.819 +them to the REPL session so it's aware of these functions. + +00:07:05.820 --> 00:07:10.399 +Then I switched to a separate different module and + +00:07:10.400 --> 00:07:14.039 +evaluated that one. So the REPL session now knows about + +00:07:14.040 --> 00:07:19.039 +these two functions and also these two variables. That + +00:07:19.040 --> 00:07:23.999 +means that I can evaluate the state of those variables and + +00:07:24.000 --> 00:07:28.999 +change code and re-evaluate and things like that. So in this + +00:07:29.000 --> 00:07:33.639 +example if you look in the smaller area there you see that I + +00:07:33.640 --> 00:07:39.639 +have evaluated this res variable on line 6 and the output was + +00:07:39.640 --> 00:07:42.399 +that it's a dictionary with two keys and two values + +00:07:42.400 --> 00:07:51.219 +basically. So this setup works in basically any of your + +00:07:51.220 --> 00:07:54.079 +favorite code editors. So you can do this in Visual Studio + +00:07:54.080 --> 00:08:01.239 +Code, you can do this in PyCharm or Vim. But what I have done is + +00:08:01.240 --> 00:08:07.119 +that... 
More like what I have missed is that when I write code + +00:08:07.120 --> 00:08:10.239 +and do this evaluation, this is really cool, but then I need + +00:08:10.240 --> 00:08:15.459 +to switch context if I want to see the result. I have to switch + +00:08:15.460 --> 00:08:21.979 +context to this other window. I + +00:08:21.980 --> 00:08:25.759 +have my focus on the code and then I have to look in a different + +00:08:25.760 --> 00:08:31.799 +place to know the results. And if it's a larger output, then + +00:08:31.800 --> 00:08:37.479 +maybe I need to scroll. So I wanted to find out if it was + +00:08:37.480 --> 00:08:43.479 +possible to make this even smoother and faster, this + +00:08:43.480 --> 00:08:45.479 +feedback loop even faster, so I don't have to switch + +00:08:45.480 --> 00:08:52.119 +context. What I've done here is that... I can select a row or a + +00:08:52.120 --> 00:08:58.079 +region and I can evaluate and then an overlay, a small pop-up + +00:08:58.080 --> 00:09:03.119 +shows up with the evaluated result right next to it. So I can + +00:09:03.120 --> 00:09:07.519 +change code and re-evaluate and quickly see the result of it + +00:09:07.520 --> 00:09:12.640 +without doing this context switching. So the way I've done + +00:09:12.641 --> 00:09:20.679 +it is that I wanted to reuse the existing tooling that I + +00:09:20.680 --> 00:09:27.739 +already had. I know that my in-editor REPL, the IPython + +00:09:27.740 --> 00:09:31.559 +REPL, already does this evaluation. So I figured maybe I can + +00:09:31.560 --> 00:09:35.359 +extract the data and do this visualization as a separate + +00:09:35.360 --> 00:09:40.839 +thing. That's how I've done it. What I've done is that + +00:09:40.840 --> 00:09:47.199 +I've created this overlay and placed it where my cursor + +00:09:47.200 --> 00:09:50.859 +currently is, right next to the code. Then I've + +00:09:50.860 --> 00:09:55.719 +extracted the evaluated result and put it in this overlay. 
+ +00:09:55.720 --> 00:10:01.039 +I also want this overlay to have this nice looking syntax, + +00:10:01.040 --> 00:10:04.759 +so I've set it to this Python mode, so we get this syntax + +00:10:04.760 --> 00:10:10.559 +highlighting. Make it look very readable. And as a nice + +00:10:10.560 --> 00:10:16.879 +developer experience thing, + +00:10:16.880 --> 00:10:20.379 +when you move the cursor, of course you don't want the + +00:10:20.380 --> 00:10:25.679 +overlay to be there. You want it to disappear. So those kinds + +00:10:25.680 --> 00:10:28.999 +of things I've added. So putting the overlay at the right + +00:10:29.000 --> 00:10:33.279 +place and feed it with the evaluated data and then make it + +00:10:33.280 --> 00:10:39.839 +disappear when it's not interesting to look at anymore. + +00:10:39.840 --> 00:10:44.639 +What I've described so far is something that I use on a + +00:10:44.640 --> 00:10:50.639 +daily basis, and it covers most of my needs while doing Python + +00:10:50.640 --> 00:10:56.119 +development. But one thing I still miss, and I miss it from my + +00:10:56.120 --> 00:11:03.479 +days as a Clojure developer, because over there we could + +00:11:03.480 --> 00:11:07.919 +have a running app on our local machine and we can have our + +00:11:07.920 --> 00:11:12.719 +editor, and the app and the editor were connected. So when I + +00:11:12.720 --> 00:11:17.199 +did some changes in the code, the app would change without + +00:11:17.200 --> 00:11:20.559 +any restarts or anything like that. And the same if I would + +00:11:20.560 --> 00:11:24.679 +change the state of the app, I can inspect the state from the + +00:11:24.680 --> 00:11:28.919 +code. So they were connected. They are connected. So I was + +00:11:28.920 --> 00:11:32.839 +thinking, hey, this would be really cool if we could have + +00:11:32.840 --> 00:11:39.199 +something like this in Python. 
And that reminded me of + +00:11:39.200 --> 00:11:43.839 +Jupyter and Jupyter notebooks because I think notebooks, + +00:11:43.840 --> 00:11:49.659 +the way you do things there, is very similar to what I was + +00:11:49.660 --> 00:11:56.879 +trying to achieve. So I was reading up a little bit on how this + +00:11:56.880 --> 00:12:00.919 +notebook thing works. It turns out that a notebook is a + +00:12:00.920 --> 00:12:05.279 +client that talks to a server, that communicates with a + +00:12:05.280 --> 00:12:08.799 +server. It's on the server that all this Python + +00:12:08.800 --> 00:12:14.159 +evaluation and all this thing happens. Then what I've + +00:12:14.160 --> 00:12:19.659 +done is that instead of starting up IPython in my editor, I + +00:12:19.660 --> 00:12:23.519 +start the Jupyter console instead. And then I can give it + +00:12:23.520 --> 00:12:27.159 +that unique ID and it will be connected to that running + +00:12:27.160 --> 00:12:30.919 +kernel. + +NOTE FastAPI CRUD + +00:12:30.920 --> 00:12:37.199 +In this example, I've created this FastAPI CRUD app that + +00:12:37.200 --> 00:12:41.919 +has this create, read, update, and delete endpoints. It + +00:12:41.920 --> 00:12:46.399 +has this, it's locally running, it has this database where + +00:12:46.400 --> 00:12:51.639 +you can do all these things. I'm running this FastAPI app + +00:12:51.640 --> 00:12:58.059 +in the kernel and then I've connected to, I've connected to + +00:12:58.060 --> 00:13:03.239 +the kernel in my editor too. Both of them are connected to + +00:13:03.240 --> 00:13:09.719 +the kernel. What I do now is that I want to initially create + +00:13:09.720 --> 00:13:15.239 +some data. I'm going to add this, creating this message. + +00:13:15.240 --> 00:13:19.899 +What I get back is a message ID. I want to experiment in + +00:13:19.900 --> 00:13:24.359 +my browser. What do I get with that message ID? I'm + +00:13:24.360 --> 00:13:30.239 +evaluating the read function. 
I instantly get this + +00:13:30.240 --> 00:13:34.779 +evaluated result, which was this hello world text. So what + +00:13:34.780 --> 00:13:39.919 +happens if I do some changes in this app? I'm going to grab + +00:13:39.920 --> 00:13:49.659 +this message ID and write something else. + +00:13:49.660 --> 00:13:53.759 +Now I can evaluate the same thing again, and you can see that + +00:13:53.760 --> 00:14:02.399 +the content has changed to this new value. My editor isn't + +00:14:02.400 --> 00:14:07.719 +in any debug mode or something like that. It doesn't know + +00:14:07.720 --> 00:14:11.239 +what database it is. It doesn't have any environment + +00:14:11.240 --> 00:14:14.479 +variables set up or something like that. It is only + +00:14:14.480 --> 00:14:17.599 +connected to the kernel, and the kernel is aware of that. It's + +00:14:17.600 --> 00:14:20.479 +running the app. It has the connection strings and + +00:14:20.480 --> 00:14:28.799 +everything that is needed. So that's how this thing works. + +00:14:28.800 --> 00:14:34.199 +Now I want to do some inline hacking because I want to store + +00:14:34.200 --> 00:14:37.799 +this input that is sent from this app because I want to work + +00:14:37.800 --> 00:14:42.039 +with it afterwards. I can add this dictionary that stores + +00:14:42.040 --> 00:14:48.759 +this message. I'm updating the source code of this app, and + +00:14:48.760 --> 00:15:03.079 +when I run any of these endpoints again, you will see that + +00:15:03.080 --> 00:15:08.759 +the state changes, and the new inputs, I can grab and I can use + +00:15:08.760 --> 00:15:14.399 +them for quick evaluation or testing. This example is + +00:15:14.400 --> 00:15:18.519 +really simple. It was just an integer. 
For example, if you + +00:15:18.520 --> 00:15:23.519 +are sending a more complex object, maybe a pydantic schema + +00:15:23.520 --> 00:15:28.199 +or something, and you want to inspect what's coming in, and if + +00:15:28.200 --> 00:15:34.199 +you have some sort of validation that you want to test out. + +00:15:34.200 --> 00:15:38.399 +The configuration or the code that I wrote to make this work + +00:15:38.400 --> 00:15:44.159 +is a little bit different than just adding an overlay. I'm + +00:15:44.160 --> 00:15:50.999 +using this overlay just like with the IPython example, but in + +00:15:51.000 --> 00:15:57.839 +this case, when I change code, I have to think about where that + +00:15:57.840 --> 00:16:02.159 +code lives, because it's the app that runs the code. So it's + +00:16:02.160 --> 00:16:07.039 +in the app context I need to manipulate the data. If you + +00:16:07.040 --> 00:16:11.919 +have started the app from maybe a main function and that + +00:16:11.920 --> 00:16:17.879 +module imports namespaces, then you need to, if you want to + +00:16:17.880 --> 00:16:22.359 +update a function or something like that, you need to update + +00:16:22.360 --> 00:16:26.679 +it in the correct namespace. What I did before in IPython + +00:16:26.680 --> 00:16:29.919 +by adding and changing things, everything ends up in the + +00:16:29.920 --> 00:16:34.439 +global namespace. But here, if you want the app to actually + +00:16:34.440 --> 00:16:38.479 +react to the changes, you need to put it in the right + +00:16:38.480 --> 00:16:43.479 +namespace. So that's what I do here. I do some lookups, where + +00:16:43.480 --> 00:16:49.139 +is this function, and then I do this reload of this function or + +00:16:49.140 --> 00:16:54.799 +module. And when I was developing this, I was thinking, hey, + +00:16:54.800 --> 00:16:59.319 +this is really ugly. I'm in this REPL and do some + +00:16:59.320 --> 00:17:03.559 +manipulation of the imports and things like that.
That + +00:17:03.560 --> 00:17:09.759 +didn't feel good. Then I was reminded of IPython. And + +00:17:09.760 --> 00:17:15.519 +IPython has this feature to reload any updated + +00:17:15.520 --> 00:17:19.119 +submodules. I was curious how do they do it. I looked in the + +00:17:19.120 --> 00:17:24.079 +IPython source code and saw that they also use importlib and + +00:17:24.080 --> 00:17:28.359 +reloading of this module. Once I've learned that, then I + +00:17:28.360 --> 00:17:32.599 +stopped thinking that my code was hacky. I thought it was + +00:17:32.600 --> 00:17:37.159 +good enough at least. + +NOTE Testing with an LLM + +00:17:37.160 --> 00:17:45.059 +But one thing that has bothered me for a long time is I quite + +00:17:45.060 --> 00:17:50.199 +often want to test out and evaluate individual rows that + +00:17:50.200 --> 00:17:58.559 +live in a function. Quite often, this code uses the input + +00:17:58.560 --> 00:18:02.639 +to that function like the input parameters. To be able to + +00:18:02.640 --> 00:18:07.719 +do that, I need to manually type some fake data and set it to + +00:18:07.720 --> 00:18:12.279 +this variable, and then I can evaluate the code. But I think + +00:18:12.280 --> 00:18:17.779 +that takes... That slows me down. I was thinking, maybe I can + +00:18:17.780 --> 00:18:23.439 +do this in a quicker way, so I have this quicker feedback, so I + +00:18:23.440 --> 00:18:27.933 +can run this or evaluate this code much quicker. + +00:18:27.934 --> 00:18:29.439 +So my idea was maybe I + +00:18:29.440 --> 00:18:35.239 +can use an LLM for this. If I give it the parameters, maybe it + +00:18:35.240 --> 00:18:41.119 +can return some random data so I don't have to write it + +00:18:41.120 --> 00:18:44.119 +myself. I ended up doing that. I have this source code. + +00:18:44.120 --> 00:18:50.399 +I'm loading the REPL with the code. Then I select this + +00:18:50.400 --> 00:18:56.719 +function name and the parameters with its data type.
I + +00:18:56.720 --> 00:19:02.839 +have this prompt that instructs the LLM to come up with fake + +00:19:02.840 --> 00:19:06.239 +data based on the tag name and on the data type. And then I can + +00:19:06.240 --> 00:19:10.099 +send that to the REPL. I do that with a key command. Then + +00:19:10.100 --> 00:19:16.019 +I can proceed by running the code within the function that + +00:19:16.020 --> 00:19:21.719 +uses these inputs. This works for all the data types. If + +00:19:21.720 --> 00:19:26.279 +there's a custom data type, you need to give the LLM extra + +00:19:26.280 --> 00:19:30.399 +context. So that's something to think about. Once it knows + +00:19:30.400 --> 00:19:35.679 +the context, it can generate this fake data that very often is + +00:19:35.680 --> 00:19:39.839 +good enough just to test out, you know, like I've done here, like + +00:19:39.840 --> 00:19:45.399 +string... sorry, list destructuring and parsing and things + +00:19:45.400 --> 00:19:51.879 +like that. I think that was all I had, and thank you for + +00:19:51.880 --> 00:19:52.920 +listening! |
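
The reload step mentioned around 16:43 in the captions (look up where a function lives, then reload that module with importlib) can be pictured with a small Python sketch. This is not the speaker's code; it is a minimal illustration under stated assumptions. The module name demo_handlers, the endpoint function, and the captured dictionary are hypothetical stand-ins for an app's handler module, a route, and the "dictionary that stores this message" from the talk.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc files so a quick rewrite of the source is never masked by stale bytecode.
sys.dont_write_bytecode = True

# Hypothetical handler module on disk, standing in for the app's source code.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "demo_handlers.py"
module_path.write_text(
    "def read(message_id):\n"
    "    return {'id': message_id}\n"
)

sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import demo_handlers  # the "app" imports its handlers by module name


def endpoint(message_id):
    """Stand-in for a route; it resolves read() on the module at call time."""
    return demo_handlers.read(message_id)


before = endpoint(1)

# Edit the source: also store each incoming input for later inspection.
module_path.write_text(
    "captured = {}\n"
    "def read(message_id):\n"
    "    captured['message_id'] = message_id\n"
    "    return {'id': message_id, 'patched': True}\n"
)

# Find the module that owns the function and reload it in place, so the
# running "app" sees the change in the namespace it actually uses.
owner = sys.modules[demo_handlers.read.__module__]
importlib.reload(owner)

after = endpoint(2)
print(before, after, demo_handlers.captured)
```

The reload re-executes the source inside the existing module object, which is why endpoint picks up the new read without a restart. A caller that did `from demo_handlers import read` would keep its old reference, which is one reason the namespace matters.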
