WEBVTT captioned by sachac, checked by sachac NOTE Introduction 00:00:03.120 --> 00:00:07.439 Hi everyone! I'm Mats Lidell. 00:00:07.440 --> 00:00:09.879 I'm going to talk about my journey 00:00:09.880 --> 00:00:12.480 writing test cases for GNU Hyperbole 00:00:12.481 --> 00:00:19.399 and what I learned on the way. 00:00:19.400 --> 00:00:24.079 So, why write tests for GNU Hyperbole? 00:00:24.080 --> 00:00:25.679 There is some background. 00:00:25.680 --> 00:00:27.959 I'm the co-maintainer of GNU Hyperbole 00:00:27.960 --> 00:00:33.479 together with Bob Weiner. Bob is the author of the package. 00:00:33.480 --> 00:00:34.680 The package is available through 00:00:34.681 --> 00:00:38.799 the Emacs package manager and GNU Elpa 00:00:38.800 --> 00:00:42.599 if you would want to try it out. 00:00:42.600 --> 00:00:46.359 The package has some age. I think it dates back to 00:00:46.360 --> 00:00:50.119 a first release around 1993, which is also 00:00:50.120 --> 00:00:54.799 when I got in contact with the package the first time. 00:00:54.800 --> 00:00:58.239 I was a user of the package for many years. 00:00:58.240 --> 00:01:03.119 Later, I became the maintainer of the package for the FSF. 00:01:03.120 --> 00:01:04.679 That was although I did not have 00:01:04.680 --> 00:01:09.039 much knowledge of Emacs Lisp, 00:01:09.040 --> 00:01:12.679 and I still have a lot to learn. 00:01:12.680 --> 00:01:15.959 A few years ago, we started to work actively on the package, 00:01:15.960 --> 00:01:20.839 with setting up goals and having meetings. 00:01:20.840 --> 00:01:24.959 So my starting point is that I had experience 00:01:24.960 --> 00:01:27.439 with test automation from development 00:01:27.440 --> 00:01:30.599 in C++, Java and Python 00:01:30.600 --> 00:01:37.239 using different x-unit frameworks like cppunit, junit. 
00:01:37.240 --> 00:01:40.039 That was in my daytime work where 00:01:40.040 --> 00:01:41.959 the technique of using pull requests 00:01:41.960 --> 00:01:46.719 with changes backed up by tests were the daily routine. 00:01:46.720 --> 00:01:49.199 It was really a requirement for a change to go in 00:01:49.200 --> 00:01:52.159 to have supporting test cases. 00:01:52.160 --> 00:01:58.559 I believe, a quite common setup and requirement these days. 00:01:58.560 --> 00:02:02.039 I also had been an Emacs user for many years, 00:02:02.040 --> 00:02:04.279 but with focus on being a user. 00:02:04.280 --> 00:02:09.839 So as I mentioned, I have limited Emacs Lisp knowledge. 00:02:09.840 --> 00:02:11.359 When we decided to start 00:02:11.360 --> 00:02:13.959 to work actively on Hyperbole again, 00:02:13.960 --> 00:02:15.519 it was natural for me to look into 00:02:15.520 --> 00:02:18.679 raising the quality by adding unit tests. 00:02:18.680 --> 00:02:20.679 This also goes hand in hand 00:02:20.680 --> 00:02:25.239 with running these regularly as part of a build process. 00:02:25.240 --> 00:02:28.439 All in all, following the current best practice 00:02:28.440 --> 00:02:31.359 of software development. 00:02:31.360 --> 00:02:36.479 But since Hyperbole had no tests at all, 00:02:36.480 --> 00:02:38.719 it would not be enough just to add tests 00:02:38.720 --> 00:02:41.799 for new or changed functionality. 00:02:41.800 --> 00:02:44.639 We wanted to add it even broader; ideally, everywhere. 00:02:44.640 --> 00:02:48.399 So work started with adding tests here and there 00:02:48.400 --> 00:02:52.039 based on our gut feeling where it would be most useful. 00:02:52.040 --> 00:02:55.799 This work is still ongoing. 
00:02:55.800 --> 00:02:58.119 So this is where my journey starts 00:02:58.120 --> 00:03:00.759 with much functionality to test, 00:03:00.760 --> 00:03:03.359 no knowledge of what testing frameworks existed, 00:03:03.360 --> 00:03:11.159 and not really knowing a lot about Emacs Lisp at all. NOTE ERT: Emacs Lisp Regression Testing 00:03:11.160 --> 00:03:13.799 Luckily there is a package for writing tests in Emacs. 00:03:13.800 --> 00:03:17.919 It is called ERT: Emacs Lisp Regression Testing. 00:03:17.920 --> 00:03:20.959 It contains both support for defining tests and running them. 00:03:20.960 --> 00:03:24.639 Defining a test is done with the macro `ert-deftest`. 00:03:24.640 --> 00:03:28.919 In its simplest form, a test has a name, a doc string, and a body. 00:03:28.920 --> 00:03:31.439 The doc string is where you typically can give 00:03:31.440 --> 00:03:33.799 a detailed description of the test 00:03:33.800 --> 00:03:35.559 and has space for more info 00:03:35.560 --> 00:03:42.279 than what can be given in the test name. 00:03:42.280 --> 00:03:45.239 The body is where all the interesting things happen. 00:03:45.240 --> 00:03:51.959 It is here you prepare the test, run it and verify the outcome. 00:03:51.960 --> 00:03:54.239 Schematically, it looks like this. 00:03:54.240 --> 00:04:00.239 You have the ert-deftest, you have the test name, 00:04:00.240 --> 00:04:02.799 and the doc string, and then the body. 00:04:02.800 --> 00:04:06.559 It is in the body where everything interesting happens. 00:04:06.560 --> 00:04:09.759 The test is prepared, the function of the test is executed, 00:04:09.760 --> 00:04:13.119 and the outcome of the test is evaluated. 00:04:13.120 --> 00:04:14.359 Did the test succeed or not? NOTE Assertions with `should` 00:04:14.360 --> 00:04:18.479 The verification of a test is performed with 00:04:18.480 --> 00:04:21.479 one or more so-called assertions. 
00:04:21.480 --> 00:04:24.999 In ERT, they are implemented 00:04:25.000 --> 00:04:26.599 with the macro `should` 00:04:26.600 --> 00:04:33.559 together with a set of related macros. 00:04:33.560 --> 00:04:35.519 `should` takes a form as argument, 00:04:35.520 --> 00:04:37.839 and if the form evaluates to nil, 00:04:37.840 --> 00:04:48.580 the test has failed. So let's look at an example. 00:04:48.581 --> 00:04:51.919 This simple test verifies that the function `+` 00:04:51.920 --> 00:04:56.919 can add the numbers 2 and 3 and get the result 5. NOTE Running a test case 00:04:56.920 --> 00:05:01.959 So now we have defined a test case. How do we run it? 00:05:01.960 --> 00:05:03.919 The ERT package has the function (or 00:05:03.920 --> 00:05:09.519 rather convenience alias) `ert`. It takes a test selector. 00:05:09.520 --> 00:05:19.759 The test name works as a selector for running just one test. 00:05:19.760 --> 00:05:27.900 So here we have the example. Let's evaluate it. 00:05:27.901 --> 00:05:34.519 We define it and then we run it using ERT. 00:05:34.520 --> 00:05:42.399 As you see, we get prompted for a test selector 00:05:42.400 --> 00:05:46.319 but we only have one test case defined at the moment. 00:05:46.320 --> 00:05:55.919 It's the example 0. So let's hit RET. 00:05:55.920 --> 00:05:58.959 As you see here, we get some output 00:05:58.960 --> 00:06:01.359 describing what we have just done. 00:06:01.360 --> 00:06:04.839 There is one test case it has passed, zero failed, 00:06:04.840 --> 00:06:07.839 zero skipped, total 1 of 1 test case 00:06:07.840 --> 00:06:14.439 and some time stamps for the execution. 00:06:14.440 --> 00:06:18.519 We also see this green mark here indicating one test case 00:06:18.520 --> 00:06:23.039 and that it was successful. 00:06:23.040 --> 00:06:29.659 For inspecting the test, we can hit the letter `l` 00:06:29.660 --> 00:06:32.839 which shows all the `should` forms 00:06:32.840 --> 00:06:37.779 that was executed during this test case. 
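NOTE Example sketch: a minimal ERT test. The slide code is not part of these captions, so this is only a guess at what such a test could look like, based on the description above. The test name `example-add-sketch` is made up for illustration.

```elisp
;;; Minimal ERT sketch. The test name is illustrative, not taken from Hyperbole.
(require 'ert)

(ert-deftest example-add-sketch ()
  "Verify that `+' adds 2 and 3 to give 5."
  ;; `should' fails the test if its form evaluates to nil.
  (should (= (+ 2 3) 5)))

;; Run the test from Lisp without opening the results buffer.
;; The value is non-nil when the test passed.
(ert-test-passed-p (ert-run-test (ert-get-test 'example-add-sketch)))
```

Interactively, `M-x ert RET example-add-sketch RET` runs the same test and opens the results buffer described above.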
00:06:37.780 --> 00:06:39.919 So here we see that we have the `should`, 00:06:39.920 --> 00:06:47.999 one `should` executed, and we see the form equals to 2, 00:06:48.000 --> 00:06:49.799 and it was 5 equals to 5. 00:06:49.800 --> 00:06:54.559 So a good example of a successful test case. NOTE Debug a test 00:06:54.560 --> 00:06:57.919 So now we've seen how we can run a test case. 00:06:57.920 --> 00:07:03.799 Can we debug it? Yes. For debugging a test case, 00:07:03.800 --> 00:07:07.939 the `ert-deftest` can be set up using `edebug-defun`, 00:07:07.940 --> 00:07:10.319 just as a function or macro is set up 00:07:10.320 --> 00:07:18.819 or instrumented for debugging. So let's try that. 00:07:18.820 --> 00:07:24.119 So we try `edebug-defun` here. 00:07:24.120 --> 00:07:28.279 Now it's instrumented for debugging. 00:07:28.280 --> 00:07:35.659 And we run it, `ert`, and we're inside the debugger, 00:07:35.660 --> 00:07:40.679 and we can inspect here what's happening. 00:07:40.680 --> 00:07:46.960 Step through it and yes it succeeded just as before. NOTE Commercial break: Hyperbole 00:07:50.380 --> 00:07:56.879 It's time for a commercial break! 00:07:56.880 --> 00:08:00.079 Hyperbole itself can help with running tests 00:08:00.080 --> 00:08:03.639 and also help with running them in debug mode. 00:08:03.640 --> 00:08:08.519 That is because hyperbole identifies the `ert-deftest` 00:08:08.520 --> 00:08:12.679 as an implicit button. An implicit button is basically 00:08:12.680 --> 00:08:13.759 a string or pattern 00:08:13.760 --> 00:08:16.799 that Hyperbole has assigned some meaning to. 00:08:16.800 --> 00:08:19.959 For the string `ert-deftest`, it is to run the test case. 00:08:19.960 --> 00:08:24.559 You activate the button with the action-key. 00:08:24.560 --> 00:08:27.079 The standard binding is the middle mouse button, 00:08:27.080 --> 00:08:33.040 or from the keyboard, M-RET. 00:08:33.041 --> 00:08:34.799 So let's try that. 
00:08:34.800 --> 00:08:42.219 We move the cursor here and then we type M-RET. 00:08:42.220 --> 00:08:47.959 And boom, the test case was executed. 00:08:47.960 --> 00:08:54.479 And to run it in debug mode we type C-u M-RET 00:08:54.480 --> 00:08:57.719 to get the assist key, and then we're in the debugger. 00:08:57.720 --> 00:09:10.479 So that's pretty useful and convenient. NOTE Instrument function on the fly 00:09:10.480 --> 00:09:13.719 A related useful feature here is the step-in functionality 00:09:13.720 --> 00:09:16.399 bound to the letter `i` in Edebug. 00:09:16.400 --> 00:09:18.119 It allows you to step into a function 00:09:18.120 --> 00:09:20.479 and continue debugging from there. 00:09:20.480 --> 00:09:22.839 For the cases where your test does not do what you want, 00:09:22.840 --> 00:09:25.119 looking at what happens in the function under test 00:09:25.120 --> 00:09:37.259 can be really useful. Let's try that with another example. 00:09:37.260 --> 00:09:43.359 So here we have two helper functions, one `f1-add`, 00:09:43.360 --> 00:09:47.439 that uses the built-in `+` function 00:09:47.440 --> 00:09:52.239 and then we have `my-add` that uses that function. 00:09:52.240 --> 00:09:59.399 So we're going to test `my-add`. 00:09:59.400 --> 00:10:02.919 And then let's run this. 00:10:02.920 --> 00:10:05.959 Let's run this using Hyperbole in debug mode 00:10:05.960 --> 00:10:10.079 C-u M-RET. We're in the debugger again, 00:10:10.080 --> 00:10:15.639 and let's step forward to my function under test 00:10:15.640 --> 00:10:19.359 and then press `i` for getting it instrumented 00:10:19.360 --> 00:10:23.019 and going into it for debugging. 00:10:23.020 --> 00:10:25.139 And here we can expect that it's getting 00:10:25.140 --> 00:10:26.559 the arguments 1 and 3, 00:10:26.560 --> 00:10:30.999 and it returns the result 4 as expected. 00:10:31.000 --> 00:10:39.119 And yes, of course, our test case will then succeed.
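NOTE Example sketch: the two helpers and their test. The captions name `f1-add` and `my-add` but do not show their definitions, so the bodies below are assumptions reconstructed from the description. The test name `my-add-sketch` is hypothetical.

```elisp
;;; Sketch of the step-in example. Function bodies are assumed.
(require 'ert)

(defun f1-add (a b)
  "Return the sum of A and B using the built-in `+'."
  (+ a b))

(defun my-add (a b)
  "Return the sum of A and B by delegating to `f1-add'."
  (f1-add a b))

(ert-deftest my-add-sketch ()
  "Verify that `my-add' adds 1 and 3 to give 4."
  (should (= (my-add 1 3) 4)))
```

Without Hyperbole, placing point inside the test and typing `C-u C-M-x` instruments it for Edebug. Running the test then stops in the debugger, where `i` steps into `my-add`.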
NOTE Mocking 00:10:39.120 --> 00:10:41.839 The next tool in our toolbox is mocking. 00:10:41.840 --> 00:10:46.239 Mocking is needed when we want to simulate the response 00:10:46.240 --> 00:10:49.279 from a function used by the function under test. 00:10:49.280 --> 00:10:53.139 That is, in the implementation of the function. 00:10:53.140 --> 00:10:56.119 This could be for various reasons. 00:10:56.120 --> 00:11:00.879 One example could be because it would be hard or impossible 00:11:00.880 --> 00:11:04.199 in the test setup to get the behavior you want to test for, 00:11:04.200 --> 00:11:06.279 like an external error case. 00:11:06.280 --> 00:11:08.679 But the mock can also be used to verify 00:11:08.680 --> 00:11:11.619 that the function is called with a specific argument. 00:11:11.620 --> 00:11:14.559 We can view it as a way to isolate the function under test 00:11:14.560 --> 00:11:16.719 from its dependencies. 00:11:16.720 --> 00:11:18.959 So in order to test the function in isolation, 00:11:18.960 --> 00:11:22.079 we need to cut out any dependencies on external behavior. 00:11:22.080 --> 00:11:25.839 Most obvious would be dependencies on external resources, 00:11:25.840 --> 00:11:27.639 such as web pages. As an example: 00:11:27.640 --> 00:11:30.639 Hyperbole contains functionality to link you to 00:11:30.640 --> 00:11:34.239 social media resources and other resources on the net. 00:11:34.240 --> 00:11:37.899 Testing that would require the test system to call out 00:11:37.900 --> 00:11:39.639 to the social media resources 00:11:39.640 --> 00:11:43.539 and would depend on it being available, etc. 00:11:43.540 --> 00:11:45.479 Nothing technically stops a test case 00:11:45.480 --> 00:11:47.239 from depending on the external resources, 00:11:47.240 --> 00:11:51.319 but it would, if nothing else, be flaky or slow. 00:11:51.320 --> 00:11:53.759 It could be part of an end-to-end suite 00:11:53.760 --> 00:11:57.179 where we want to test that it works all the way.
00:11:57.180 --> 00:11:59.719 In this case, we want to look at the isolated case 00:11:59.720 --> 00:12:04.099 that can be run with no dependency on external resources. 00:12:04.100 --> 00:12:06.679 What you want to do is to replace the function with a mock 00:12:06.680 --> 00:12:10.339 that behaves as the real function would do. 00:12:10.340 --> 00:12:11.639 The package I have found 00:12:11.640 --> 00:12:14.319 and have used for mocking is `el-mock`. 00:12:14.320 --> 00:12:21.839 The workhorse in this package is the `with-mock` macro. 00:12:21.840 --> 00:12:26.519 It looks like this: `with-mock` followed by a body. 00:12:26.520 --> 00:12:30.439 In the execution of the body, stubs and mocks 00:12:30.440 --> 00:12:32.899 defined in the body are respected. 00:12:32.900 --> 00:12:39.199 Let's look at some examples to make that clearer. 00:12:39.200 --> 00:12:42.079 In this case, we have the macro `with-mock`. 00:12:42.080 --> 00:12:43.959 It works so that the expression 00:12:43.960 --> 00:12:48.639 `stub + => 10` is interpreted 00:12:48.640 --> 00:12:51.919 so that the function `+` will be replaced with the stub. 00:12:51.920 --> 00:12:56.779 The stub will return 10 regardless of how it is called. 00:12:56.780 --> 00:12:58.119 Note that the stub function 00:12:58.120 --> 00:13:00.199 does not have to be called at this level 00:13:00.200 --> 00:13:02.799 but could be called at any level in the call chain. 00:13:02.800 --> 00:13:07.479 By knowing how the function under test is implemented 00:13:07.480 --> 00:13:09.319 and how the implementation works, 00:13:09.320 --> 00:13:11.959 you can find function calls you want to mock 00:13:11.960 --> 00:13:14.999 to force certain behavior that you want to test, 00:13:15.000 --> 00:13:18.999 or to avoid calls to external resources, slow calls, etc. 00:13:19.000 --> 00:13:21.959 Simply isolate the function under test 00:13:21.960 --> 00:13:26.119 and simulate its environment.
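NOTE Example sketch: a stub with `el-mock`. This assumes the third-party `el-mock` package is installed. To keep the sketch predictable it stubs a helper, `f1-add`, instead of the built-in `+` used on the slide. The helper bodies and the test name are assumptions made for illustration.

```elisp
;;; Stub sketch. Requires the third-party el-mock package.
(require 'ert)
(require 'el-mock)

;; Assumed helper definitions, for illustration only.
(defun f1-add (a b) (+ a b))
(defun my-add (a b) (f1-add a b))

(ert-deftest my-add-stub-sketch ()
  "With `f1-add' stubbed, `my-add' returns the stub value."
  (with-mock
    ;; The stub returns 10 regardless of the arguments it receives.
    (stub f1-add => 10)
    ;; The stub is reached one level down the call chain.
    (should (= (my-add 1 3) 10))
    (should (= (my-add 100 200) 10))))
```

Outside the `with-mock` body the original `f1-add` is back in place, so the stub does not leak into other tests.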
00:13:26.120 --> 00:13:28.639 Mock is a little bit more sophisticated 00:13:28.640 --> 00:13:30.079 and depends on the arguments 00:13:30.080 --> 00:13:31.479 that the mock function is called with. 00:13:31.480 --> 00:13:33.847 Or more precisely, it is checked 00:13:33.848 --> 00:13:35.519 after the `with-mock` clause 00:13:35.520 --> 00:13:38.079 that the arguments match the arguments it was called with 00:13:38.080 --> 00:13:39.759 or even if it was called at all. 00:13:39.760 --> 00:13:41.839 If it is called with other arguments 00:13:41.840 --> 00:13:43.719 there will be an error, 00:13:43.720 --> 00:13:46.479 and if it's not called, it is also an error. 00:13:46.480 --> 00:13:48.359 So this way, we are sure that the function 00:13:48.360 --> 00:13:51.319 we expected to be called actually was called. 00:13:51.320 --> 00:13:53.399 An important piece of the testing. 00:13:53.400 --> 00:13:56.239 So we are sure that the mock we have provided 00:13:56.240 --> 00:14:03.999 actually is triggered by the test case. 00:14:04.000 --> 00:14:08.159 So here we have an example of `with-mock` 00:14:08.160 --> 00:14:18.879 where the `f1-add` function is mocked, 00:14:18.880 --> 00:14:21.999 so that if it's called with 2 and 3 as arguments, 00:14:22.000 --> 00:14:24.919 it will return 10. Then we have a test case 00:14:24.920 --> 00:14:27.999 where we try the `my-add` function, 00:14:28.000 --> 00:14:30.319 as you might remember, and call that with 2 and 3 00:14:30.320 --> 00:14:32.799 and see that it should also then return 10 00:14:32.800 --> 00:14:41.239 because it's using `f1-add`. NOTE cl-letf 00:14:41.240 --> 00:14:44.559 Moving over to `cl-letf`. 00:14:44.560 --> 00:14:47.679 On rare occasions, the limitations of `el-mock` mean 00:14:47.680 --> 00:14:50.239 you would want to implement a full-fledged function 00:14:50.240 --> 00:14:52.979 to be used under test. 00:14:52.980 --> 00:14:55.439 Then the macro `cl-letf` can be useful.
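NOTE Example sketch: a mock next to a `cl-letf` replacement. The first test is a guess at the mock example described above and needs the third-party `el-mock` package. The second shows roughly the same check with a hand-written function bound by `cl-letf`. The helper bodies, the test names and the `calls` counter are all assumptions made for illustration.

```elisp
;;; Mock versus cl-letf sketch. The first test requires el-mock.
(require 'ert)
(require 'cl-lib)
(require 'el-mock)

;; Assumed helper definitions, for illustration only.
(defun f1-add (a b) (+ a b))
(defun my-add (a b) (f1-add a b))

(ert-deftest my-add-mock-sketch ()
  "`my-add' must call `f1-add' with exactly 2 and 3."
  (with-mock
    ;; Other arguments, or no call at all, make the test fail.
    (mock (f1-add 2 3) => 10)
    (should (= (my-add 2 3) 10))))

(ert-deftest my-add-cl-letf-sketch ()
  "Same idea with a full replacement function bound by `cl-letf'."
  (let ((calls 0))
    (cl-letf (((symbol-function 'f1-add)
               (lambda (a b)
                 (setq calls (1+ calls))
                 (* 2 (+ a b)))))
      (should (= (my-add 2 3) 10)))
    ;; Nothing verifies the call automatically, so count it by hand.
    (should (= calls 1))))
```

The replacement function can contain any logic, which is the extra freedom `cl-letf` gives over a static mock.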
00:14:55.440 --> 00:14:57.879 However, you need to handle the case yourself 00:14:57.880 --> 00:15:00.099 if the function was not called. 00:15:00.100 --> 00:15:03.519 Looking through the test cases where I have used `cl-letf`, 00:15:03.520 --> 00:15:06.119 I think most can be implemented using plain mocking. 00:15:06.120 --> 00:15:11.239 The cases left are where the args to the mock might be different 00:15:11.240 --> 00:15:13.739 due to environment issues. 00:15:13.740 --> 00:15:24.099 In that case, a static mock will not work. NOTE Hooks 00:15:24.100 --> 00:15:30.719 Another trick is for functions that use hooks. 00:15:30.720 --> 00:15:35.639 You can overload or replace the hooks to do the testing. 00:15:35.640 --> 00:15:40.759 So you can use the hook function just to do the verification 00:15:40.760 --> 00:15:43.119 and not do anything useful in the hook. 00:15:43.120 --> 00:15:45.079 Also, here you need to be careful 00:15:45.080 --> 00:15:55.719 to make sure the test handler is called and nothing else. NOTE Side effects and initial buffer state 00:15:55.720 --> 00:15:57.679 So far we have been talking about testing 00:15:57.680 --> 00:15:59.039 and what the function returns. 00:15:59.040 --> 00:16:01.119 In the best of worlds, we have a pure function 00:16:01.120 --> 00:16:02.959 that only depends on its arguments 00:16:02.960 --> 00:16:04.939 and produces no side effects. 00:16:04.940 --> 00:16:06.899 Many operations produce side effects 00:16:06.900 --> 00:16:09.479 or operate on the contents of buffers 00:16:09.480 --> 00:16:12.379 such as writing a message in the message buffer, 00:16:12.380 --> 00:16:15.659 changing the state of a buffer, moving point etc. 00:16:15.660 --> 00:16:18.859 Hyperbole is not an exception. Quite the contrary. 00:16:18.860 --> 00:16:20.839 Many of the functions creating links 00:16:20.840 --> 00:16:24.420 are just about updating buffers. 00:16:24.421 --> 00:16:28.559 This poses a special problem for tests.
00:16:28.560 --> 00:16:29.839 The test gets longer 00:16:29.840 --> 00:16:31.919 since you need to create buffers and files, 00:16:31.920 --> 00:16:33.279 initialize the contents. 00:16:33.280 --> 00:16:35.159 Verifying the outcome becomes trickier 00:16:35.160 --> 00:16:39.019 since you need to make sure you look at the right place. 00:16:39.020 --> 00:16:41.039 At the end of the test, you need to clean up, 00:16:41.040 --> 00:16:43.439 both for not leaving a lot of garbage 00:16:43.440 --> 00:16:45.279 in buffers and files around, 00:16:45.280 --> 00:16:48.479 and even worse, not cause later tests 00:16:48.480 --> 00:16:50.959 to depend on the leftovers from the other tests. 00:16:50.960 --> 00:16:53.079 Here are some functions and variables 00:16:53.080 --> 00:17:05.099 I have found useful for this. NOTE with-temp-buffer 00:17:05.100 --> 00:17:09.199 For creating tests: `with-temp-buffer`: 00:17:09.200 --> 00:17:11.919 it provides you a temp buffer that you visit, 00:17:11.920 --> 00:17:13.719 and afterwards, there is no need to clean up. 00:17:13.720 --> 00:17:16.519 This is the first choice if that is all you need. NOTE make-temp-file 00:17:16.520 --> 00:17:20.519 `make-temp-file`: If you need a file, 00:17:20.520 --> 00:17:21.959 this is the function to use. 00:17:21.960 --> 00:17:24.279 It creates a temp file or a directory. 00:17:24.280 --> 00:17:26.959 The file can be filled with initial contents. 00:17:26.960 --> 00:17:31.019 This needs to be cleaned up after a test. 00:17:31.020 --> 00:17:33.287 Moving on to verifying and debugging: NOTE buffer-string 00:17:33.288 --> 00:17:38.247 `buffer-string`: returns the full contents 00:17:38.248 --> 00:17:39.499 of the buffer as a string. 00:17:39.500 --> 00:17:41.399 That can sound a bit voluminous, 00:17:41.400 --> 00:17:46.139 but since tests are normally small, this often works well. 
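NOTE Example sketch: temporary buffers and files. This is a small illustration of `with-temp-buffer`, `make-temp-file` and `buffer-string` as described above, using built-in functions only. The test names, file prefix and contents are made up, and the fourth argument to `make-temp-file` (initial text) assumes Emacs 26 or later.

```elisp
;;; Temp buffer and temp file sketch, built-ins only.
(require 'ert)

(ert-deftest temp-buffer-sketch ()
  "Check buffer contents after an edit in a temporary buffer."
  (with-temp-buffer
    (insert "hello")
    (upcase-region (point-min) (point-max))
    ;; The whole buffer is small enough to compare as one string.
    (should (string= (buffer-string) "HELLO"))))

(ert-deftest temp-file-sketch ()
  "Create a temp file with initial contents and remove it afterwards."
  (let ((file (make-temp-file "sketch-" nil ".txt" "initial contents")))
    (unwind-protect
        (with-temp-buffer
          (insert-file-contents file)
          (should (string= (buffer-string) "initial contents")))
      ;; Temp files are not removed automatically.
      (delete-file file))))
```

The temporary buffer disappears when `with-temp-buffer` returns, while the file has to be deleted explicitly even if an assertion fails.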
00:17:46.140 --> 00:17:48.439 I have in particular found good use of comparing 00:17:48.440 --> 00:17:50.399 the contents of buffers with the empty string. 00:17:50.400 --> 00:17:53.359 That would give an error, but as we have seen 00:17:53.360 --> 00:17:56.079 with the output produced by the `should` assertion, 00:17:56.080 --> 00:17:58.079 this is almost like a print statement 00:17:58.080 --> 00:18:01.199 and can be compared with the good old technique 00:18:01.200 --> 00:18:04.399 of debugging with print statements. 00:18:04.400 --> 00:18:06.247 There might be other ways to do the same 00:18:06.248 --> 00:18:09.919 as we saw with debugging. NOTE buffer-name 00:18:09.920 --> 00:18:13.719 `buffer-name`: Getting the buffer name is good 00:18:13.720 --> 00:18:16.239 to verify what buffer we are looking at. 00:18:16.240 --> 00:18:18.359 I often found it useful to check 00:18:18.360 --> 00:18:21.119 that my assumptions on what buffer I am acting on 00:18:21.120 --> 00:18:23.399 are correct by adding `should` clauses 00:18:23.400 --> 00:18:25.399 in the middle of the test execution 00:18:25.400 --> 00:18:27.399 or after preparing the test input. 00:18:27.400 --> 00:18:31.679 Sometimes Emacs can switch buffers in strange ways, 00:18:31.680 --> 00:18:34.199 maybe because the test case is badly written, 00:18:34.200 --> 00:18:37.239 and making sure your assumptions are correct 00:18:37.240 --> 00:18:40.339 is a good sanity check. 00:18:40.340 --> 00:18:42.239 Even the ERT package does 00:18:42.240 --> 00:18:44.879 some buffer and window manipulation for its reporting 00:18:44.880 --> 00:18:47.487 that I have not fully learned how to master, 00:18:47.488 --> 00:18:51.979 so assertions for checking the sanity of the test are good. NOTE major-mode 00:18:51.980 --> 00:18:55.679 Finally, `major-mode`: Verify the buffer has the proper mode. 00:18:55.680 --> 00:19:02.679 Can also be very useful and is a good sanity check.
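NOTE Example sketch: sanity checks on buffer and mode. This illustrates the `buffer-name` and `major-mode` checks described above with built-in functions only. The buffer name `sketch-buffer` and the test name are hypothetical.

```elisp
;;; Sanity check sketch, built-ins only.
(require 'ert)

(ert-deftest buffer-sanity-sketch ()
  "Assert which buffer and mode the test is acting on."
  (let ((buf (generate-new-buffer "sketch-buffer")))
    (unwind-protect
        (with-current-buffer buf
          (text-mode)
          (insert "some text")
          ;; Sanity checks before the real verification.
          ;; The name may get a <2> suffix, so only the prefix is checked.
          (should (string-prefix-p "sketch-buffer" (buffer-name)))
          (should (eq major-mode 'text-mode))
          (should (string= (buffer-string) "some text")))
      ;; The buffer visits no file, so killing it does not prompt.
      (kill-buffer buf))))
```

If the code under test switches buffers unexpectedly, the first `should` fails and points at the real cause instead of a confusing content mismatch later on.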
NOTE unwind-protect 00:19:02.680 --> 00:19:06.599 Finally, cleaning up. `unwind-protect`. 00:19:06.600 --> 00:19:09.039 The tool for cleaning up is the `unwind-protect` form 00:19:09.040 --> 00:19:12.479 which ensures that the unwind forms 00:19:12.480 --> 00:19:15.439 always are executed regardless of the outcome of the body. 00:19:15.440 --> 00:19:20.419 So if your test fails, you are sure the cleanup is executed. 00:19:20.420 --> 00:19:22.759 Let's look at `unwind-protect` together with 00:19:22.760 --> 00:19:30.519 the temporary file example. Many tests look like this. 00:19:30.520 --> 00:19:35.279 You create some resource, you call `unwind-protect`, 00:19:35.280 --> 00:19:42.759 you do the test, and then afterwards you do the cleanup. 00:19:42.760 --> 00:19:46.359 The cleanup for a file and a buffer is so common 00:19:46.360 --> 00:19:50.999 that I have created a helper for that. 00:19:51.000 --> 00:19:56.559 It looks like this. 00:19:56.560 --> 00:19:59.179 The trick with the `buffer-modified` flag 00:19:59.180 --> 00:20:00.719 is to avoid getting prompted 00:20:00.720 --> 00:20:03.219 for killing a buffer that is not saved. 00:20:03.220 --> 00:20:05.439 The test buffers are often in the state 00:20:05.440 --> 00:20:15.099 where they have not been saved but modified. NOTE Input, with-simulated-input 00:20:15.100 --> 00:20:19.679 Another problem for tests is input. 00:20:19.680 --> 00:20:21.559 In the middle of execution, a function 00:20:21.560 --> 00:20:24.039 might want to have some interaction with the user. 00:20:24.040 --> 00:20:26.959 Testing this poses a problem, not only in that 00:20:26.960 --> 00:20:31.199 the input matters, but also as how even to get the test case 00:20:31.200 --> 00:20:34.079 to recognize the input!? 00:20:34.080 --> 00:20:36.039 Ideally the tests are run in batch mode, 00:20:36.040 --> 00:20:38.919 which in some sense means no user interaction. 00:20:38.920 --> 00:20:42.999 In batch mode, there is no event loop running.
00:20:43.000 --> 00:20:47.179 Fortunately, there is a package `with-simulated-input` 00:20:47.180 --> 00:20:53.259 that gets you around these issues. 00:20:53.260 --> 00:20:55.399 This is a macro that allows us 00:20:55.400 --> 00:20:56.999 to define a set of characters 00:20:57.000 --> 00:20:59.079 that will be read by the function under test, 00:20:59.080 --> 00:21:02.579 and all of this works in batch mode. It looks like this. 00:21:02.580 --> 00:21:04.159 We have `with-simulated-input`, 00:21:04.160 --> 00:21:09.839 and then a string of characters, and then a body. 00:21:09.840 --> 00:21:11.647 The form takes a string of keys 00:21:11.648 --> 00:21:13.119 and runs the rest of the body, 00:21:13.120 --> 00:21:15.439 and if there is input required, 00:21:15.440 --> 00:21:18.119 it is picked from the string of keys. 00:21:18.120 --> 00:21:20.421 In our example, the `read-string` call 00:21:20.422 --> 00:21:21.719 will read up until RET, 00:21:21.720 --> 00:21:26.119 and then return the characters read. 00:21:26.120 --> 00:21:29.639 As you see in the example, space needs to be provided 00:21:29.640 --> 00:21:38.459 by the string SPC, and return by the string RET. NOTE Running all tests 00:21:38.460 --> 00:21:40.799 So now we have seen ways to create test cases 00:21:40.800 --> 00:21:43.219 and even make it possible to run some of them 00:21:43.220 --> 00:21:44.679 that have I/O in batch mode. 00:21:44.680 --> 00:21:47.279 But the initial goal was to run them all at once. 00:21:47.280 --> 00:21:48.919 How do you do that? 00:21:48.920 --> 00:21:51.759 Let's go back to the `ert` command. 00:21:51.760 --> 00:21:53.799 It prompts for a test selector. 00:21:53.800 --> 00:21:56.279 If we give it the selector `t`, 00:21:56.280 --> 00:21:59.259 it will run all tests we have currently defined. 00:21:59.260 --> 00:22:05.779 Let's try that with a subset of the Hyperbole tests. 00:22:05.780 --> 00:22:09.559 Here is the test folder in the Hyperbole directory.
00:22:09.560 --> 00:22:18.819 Let's go up here and load all the demo tests. 00:22:18.820 --> 00:22:21.207 And then try to run `ert`. 00:22:21.208 --> 00:22:26.119 Now we see that we have a bunch of test cases. 00:22:26.120 --> 00:22:27.919 We can run them all individually, 00:22:27.920 --> 00:22:31.719 but we can run them with `t` instead. 00:22:31.720 --> 00:22:35.459 We will run them all at once. 00:22:35.460 --> 00:22:51.419 So now, ERT is executing all our test cases. 00:22:51.420 --> 00:22:57.079 So here we have a nice green display 00:22:57.080 --> 00:23:03.219 with all the test cases. NOTE Batch mode 00:23:03.220 --> 00:23:08.159 So that was fine, but we were still running it manually 00:23:08.160 --> 00:23:11.980 by calling `ert`. How could we run it from the command line? 00:23:17.180 --> 00:23:21.499 ERT comes with functions for running it in batch mode. 00:23:21.500 --> 00:23:25.639 For Hyperbole, we use `make` for repetitive tasks. 00:23:25.640 --> 00:23:27.119 So we have a make target 00:23:27.120 --> 00:23:29.279 that uses the ERT batch functionality, 00:23:29.280 --> 00:23:33.259 and this is the line from the Makefile. 00:23:33.260 --> 00:23:35.479 This is a bit detailed, 00:23:35.480 --> 00:23:37.539 but you see that we have a part here 00:23:37.540 --> 00:23:40.779 where we load the test dependencies, 00:23:40.780 --> 00:23:43.520 for getting the packages 00:23:43.521 --> 00:23:48.459 such as `el-mock` and `with-simulated-input` etc. loaded. 00:23:48.460 --> 00:23:53.559 We also have... I also want to point out here the call to 00:23:53.560 --> 00:23:58.159 or the setting of `auto-save-default` to `nil` 00:23:58.160 --> 00:24:02.439 to get rid of the prompt for excessive backup files 00:24:02.440 --> 00:24:05.059 that can pile up after running the tests a few times. NOTE Skipping tests 00:24:05.060 --> 00:24:06.879 Even with the help of simulated input, 00:24:06.880 --> 00:24:08.919 not all tests can be run in batch mode.
00:24:08.920 --> 00:24:10.559 They would simply not work there 00:24:10.560 --> 00:24:12.439 and have to be run in an interactive Emacs 00:24:12.440 --> 00:24:14.179 with the running event loop. 00:24:14.180 --> 00:24:17.919 One trick still to be able to use batch mode for automation 00:24:17.920 --> 00:24:20.319 is to put a guard at the top of each test case 00:24:20.320 --> 00:24:22.559 as the first thing to be executed, 00:24:22.560 --> 00:24:25.719 so that it kicks in before anything else and stops Emacs 00:24:25.720 --> 00:24:27.199 from trying to run the test case. 00:24:27.200 --> 00:24:35.519 Now, it looks like this: `(skip-unless (not noninteractive))`. 00:24:35.520 --> 00:24:38.639 So when ERT sees that the test should be skipped, it skips it 00:24:38.640 --> 00:24:40.439 and makes a note of that, 00:24:40.440 --> 00:24:44.579 so you will see how many tests have been skipped. 00:24:44.580 --> 00:24:47.559 Too bad. We have a number of test cases defined, 00:24:47.560 --> 00:24:51.359 and to run them, we need to run them manually. Well sort of. 00:24:51.360 --> 00:24:53.807 Not being able to run all tests easily 00:24:53.808 --> 00:24:58.419 is a bit counterproductive 00:24:58.420 --> 00:25:00.999 since our goal is to run all tests. 00:25:01.000 --> 00:25:04.719 There is however no ERT function to run tests in batch mode 00:25:04.720 --> 00:25:06.779 with an interactive Emacs. 00:25:06.780 --> 00:25:08.479 The closest I have got is either 00:25:08.480 --> 00:25:10.079 to start Emacs from the command line 00:25:10.080 --> 00:25:12.439 calling the `ert` function as we just have seen, 00:25:12.440 --> 00:25:14.799 and then killing it manually when done; 00:25:14.800 --> 00:25:19.599 or add a function to extract the contents of the ERT buffer 00:25:19.600 --> 00:25:24.599 when done and echo it to standard output.
00:25:24.600 --> 00:25:27.800 This is how it looks in the Makefile 00:25:27.801 --> 00:25:31.207 to get the behavior of cutting and paste, 00:25:31.208 --> 00:25:34.580 getting the ERT output into a file 00:25:34.581 --> 00:25:36.239 so we can then kill Emacs 00:25:36.240 --> 00:25:44.799 and spit out the content of the ERT buffer. 00:25:44.800 --> 00:25:47.739 One final word here is that 00:25:47.740 --> 00:25:54.559 when you run this in a continuous integration pipeline, 00:25:54.560 --> 00:25:59.399 you might not have a TTY for getting Emacs to start, 00:25:59.400 --> 00:26:03.200 and that is then another problem 00:26:03.201 --> 00:26:05.160 with getting the interactive mode. NOTE Conclusion 00:26:08.460 --> 00:26:11.120 We have reached the end of the talk. 00:26:11.121 --> 00:26:14.159 If you have any new ideas 00:26:14.160 --> 00:26:16.759 or have some suggestions for improvements, 00:26:16.760 --> 00:26:18.239 feel free to reach out 00:26:18.240 --> 00:26:21.100 because I am still on the learning curve of writing, 00:26:21.101 --> 00:26:25.299 how to write good test cases. 00:26:25.300 --> 00:26:27.639 If you look at the test cases we have in Hyperbole 00:26:27.640 --> 00:26:29.799 and you think they might contradict what I am saying here, 00:26:29.800 --> 00:26:32.579 it is OK. It is probably right. 00:26:32.580 --> 00:26:34.599 I have changed the style as I go 00:26:34.600 --> 00:26:36.639 and we have not yet refactored all tests 00:26:36.640 --> 00:26:38.579 to benefit from new designs. 00:26:38.580 --> 00:26:40.599 That is also the beauty of the test case. 00:26:40.600 --> 00:26:43.319 As long as it serves its purpose, it is not terrible 00:26:43.320 --> 00:26:47.799 if it is not optimal or not having the best style. 00:26:47.800 --> 00:26:55.240 And yes, thanks for listening. Bye.