WEBVTT captioned by sachac, checked by sachac
NOTE Introduction
00:00:03.120 --> 00:00:07.439
Hi everyone! I'm Mats Lidell.
00:00:07.440 --> 00:00:09.879
I'm going to talk about my journey
00:00:09.880 --> 00:00:12.480
writing test cases for GNU Hyperbole
00:00:12.481 --> 00:00:19.399
and what I learned on the way.
00:00:19.400 --> 00:00:24.079
So, why write tests for GNU Hyperbole?
00:00:24.080 --> 00:00:25.679
There is some background.
00:00:25.680 --> 00:00:27.959
I'm the co-maintainer of GNU Hyperbole
00:00:27.960 --> 00:00:33.479
together with Bob Weiner. Bob is the author of the package.
00:00:33.480 --> 00:00:34.680
The package is available through
00:00:34.681 --> 00:00:38.799
the Emacs package manager and GNU ELPA
00:00:38.800 --> 00:00:42.599
if you would want to try it out.
00:00:42.600 --> 00:00:46.359
The package has some age. I think it dates back to
00:00:46.360 --> 00:00:50.119
a first release around 1993, which is also
00:00:50.120 --> 00:00:54.799
when I got in contact with the package the first time.
00:00:54.800 --> 00:00:58.239
I was a user of the package for many years.
00:00:58.240 --> 00:01:03.119
Later, I became the maintainer of the package for the FSF.
00:01:03.120 --> 00:01:04.679
That was although I did not have
00:01:04.680 --> 00:01:09.039
much knowledge of Emacs Lisp,
00:01:09.040 --> 00:01:12.679
and I still have a lot to learn.
00:01:12.680 --> 00:01:15.959
A few years ago, we started to work actively on the package,
00:01:15.960 --> 00:01:20.839
with setting up goals and having meetings.
00:01:20.840 --> 00:01:24.959
So my starting point is that I had experience
00:01:24.960 --> 00:01:27.439
with test automation from development
00:01:27.440 --> 00:01:30.599
in C++, Java and Python
00:01:30.600 --> 00:01:37.239
using different xUnit frameworks like CppUnit and JUnit.
00:01:37.240 --> 00:01:40.039
That was in my daytime work where
00:01:40.040 --> 00:01:41.959
the technique of using pull requests
00:01:41.960 --> 00:01:46.719
with changes backed up by tests was the daily routine.
00:01:46.720 --> 00:01:49.199
It was really a requirement for a change to go in
00:01:49.200 --> 00:01:52.159
to have supporting test cases.
00:01:52.160 --> 00:01:58.559
I believe, a quite common setup and requirement these days.
00:01:58.560 --> 00:02:02.039
I also had been an Emacs user for many years,
00:02:02.040 --> 00:02:04.279
but with focus on being a user.
00:02:04.280 --> 00:02:09.839
So as I mentioned, I have limited Emacs Lisp knowledge.
00:02:09.840 --> 00:02:11.359
When we decided to start
00:02:11.360 --> 00:02:13.959
to work actively on Hyperbole again,
00:02:13.960 --> 00:02:15.519
it was natural for me to look into
00:02:15.520 --> 00:02:18.679
raising the quality by adding unit tests.
00:02:18.680 --> 00:02:20.679
This also goes hand in hand
00:02:20.680 --> 00:02:25.239
with running these regularly as part of a build process.
00:02:25.240 --> 00:02:28.439
All in all, following the current best practice
00:02:28.440 --> 00:02:31.359
of software development.
00:02:31.360 --> 00:02:36.479
But since Hyperbole had no tests at all,
00:02:36.480 --> 00:02:38.719
it would not be enough just to add tests
00:02:38.720 --> 00:02:41.799
for new or changed functionality.
00:02:41.800 --> 00:02:44.639
We wanted to add it even broader; ideally, everywhere.
00:02:44.640 --> 00:02:48.399
So work started with adding tests here and there
00:02:48.400 --> 00:02:52.039
based on our gut feeling where it would be most useful.
00:02:52.040 --> 00:02:55.799
This work is still ongoing.
00:02:55.800 --> 00:02:58.119
So this is where my journey starts
00:02:58.120 --> 00:03:00.759
with much functionality to test,
00:03:00.760 --> 00:03:03.359
no knowledge of what testing frameworks existed,
00:03:03.360 --> 00:03:11.159
and not really knowing a lot about Emacs Lisp at all.
NOTE ERT: Emacs Lisp Regression Testing
00:03:11.160 --> 00:03:13.799
Luckily there is a package for writing tests in Emacs.
00:03:13.800 --> 00:03:17.919
It is called ERT: Emacs Lisp Regression Testing.
00:03:17.920 --> 00:03:20.959
It contains both support for defining tests and running them.
00:03:20.960 --> 00:03:24.639
Defining a test is done with the macro `ert-deftest`.
00:03:24.640 --> 00:03:28.919
In its simplest form, a test has a name, a doc string, and a body.
00:03:28.920 --> 00:03:31.439
The doc string is where you typically can give
00:03:31.440 --> 00:03:33.799
a detailed description of the test
00:03:33.800 --> 00:03:35.559
and has space for more info
00:03:35.560 --> 00:03:42.279
than what can be given in the test name.
00:03:42.280 --> 00:03:45.239
The body is where all the interesting things happen.
00:03:45.240 --> 00:03:51.959
It is here you prepare the test, run it and verify the outcome.
00:03:51.960 --> 00:03:54.239
Schematically, it looks like this.
00:03:54.240 --> 00:04:00.239
You have the ert-deftest, you have the test name,
00:04:00.240 --> 00:04:02.799
and the doc string, and then the body.
00:04:02.800 --> 00:04:06.559
It is in the body where everything interesting happens.
00:04:06.560 --> 00:04:09.759
The test is prepared, the function under test is executed,
00:04:09.760 --> 00:04:13.119
and the outcome of the test is evaluated.
00:04:13.120 --> 00:04:14.359
Did the test succeed or not?
NOTE Assertions with `should`
00:04:14.360 --> 00:04:18.479
The verification of a test is performed with
00:04:18.480 --> 00:04:21.479
one or more so-called assertions.
00:04:21.480 --> 00:04:24.999
In ERT, they are implemented
00:04:25.000 --> 00:04:26.599
with the macro `should`
00:04:26.600 --> 00:04:33.559
together with a set of related macros.
00:04:33.560 --> 00:04:35.519
`should` takes a form as argument,
00:04:35.520 --> 00:04:37.839
and if the form evaluates to nil,
00:04:37.840 --> 00:04:48.580
the test has failed. So let's look at an example.
00:04:48.581 --> 00:04:51.919
This simple test verifies that the function `+`
00:04:51.920 --> 00:04:56.919
can add the numbers 2 and 3 and get the result 5.
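A minimal sketch of the test just described. The test name `example-0` and the doc string wording are assumptions based on the narration, since the slide itself is not part of the captions.

```elisp
(require 'ert)

(ert-deftest example-0 ()
  "Verify that `+' adds 2 and 3 and returns 5."
  ;; `should' fails the test if its form evaluates to nil.
  (should (= (+ 2 3) 5)))
```

Evaluating the form defines the test; `M-x ert RET example-0 RET` then runs it.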
NOTE Running a test case
00:04:56.920 --> 00:05:01.959
So now we have defined a test case. How do we run it?
00:05:01.960 --> 00:05:03.919
The ERT package has the function (or
00:05:03.920 --> 00:05:09.519
rather convenience alias) `ert`. It takes a test selector.
00:05:09.520 --> 00:05:19.759
The test name works as a selector for running just one test.
00:05:19.760 --> 00:05:27.900
So here we have the example. Let's evaluate it.
00:05:27.901 --> 00:05:34.519
We define it and then we run it using ERT.
00:05:34.520 --> 00:05:42.399
As you see, we get prompted for a test selector
00:05:42.400 --> 00:05:46.319
but we only have one test case defined at the moment.
00:05:46.320 --> 00:05:55.919
It's the example 0. So let's hit RET.
00:05:55.920 --> 00:05:58.959
As you see here, we get some output
00:05:58.960 --> 00:06:01.359
describing what we have just done.
00:06:01.360 --> 00:06:04.839
There is one test case; it has passed, zero failed,
00:06:04.840 --> 00:06:07.839
zero skipped, total 1 of 1 test case
00:06:07.840 --> 00:06:14.439
and some time stamps for the execution.
00:06:14.440 --> 00:06:18.519
We also see this green mark here indicating one test case
00:06:18.520 --> 00:06:23.039
and that it was successful.
00:06:23.040 --> 00:06:29.659
For inspecting the test, we can hit the letter `l`
00:06:29.660 --> 00:06:32.839
which shows all the `should` forms
00:06:32.840 --> 00:06:37.779
that were executed during this test case.
00:06:37.780 --> 00:06:39.919
So here we see that we have the `should`,
00:06:39.920 --> 00:06:47.999
one `should` executed, and we see the form equals to 2,
00:06:48.000 --> 00:06:49.799
and it was 5 equals to 5.
00:06:49.800 --> 00:06:54.559
So a good example of a successful test case.
NOTE Debug a test
00:06:54.560 --> 00:06:57.919
So now we've seen how we can run a test case.
00:06:57.920 --> 00:07:03.799
Can we debug it? Yes. For debugging a test case,
00:07:03.800 --> 00:07:07.939
the `ert-deftest` can be set up using `edebug-defun`,
00:07:07.940 --> 00:07:10.319
just as a function or macro is set up
00:07:10.320 --> 00:07:18.819
or instrumented for debugging. So let's try that.
00:07:18.820 --> 00:07:24.119
So we try `edebug-defun` here.
00:07:24.120 --> 00:07:28.279
Now it's instrumented for debugging.
00:07:28.280 --> 00:07:35.659
And we run it, `ert`, and we're inside the debugger,
00:07:35.660 --> 00:07:40.679
and we can inspect here what's happening.
00:07:40.680 --> 00:07:46.960
Step through it and yes it succeeded just as before.
NOTE Commercial break: Hyperbole
00:07:50.380 --> 00:07:56.879
It's time for a commercial break!
00:07:56.880 --> 00:08:00.079
Hyperbole itself can help with running tests
00:08:00.080 --> 00:08:03.639
and also help with running them in debug mode.
00:08:03.640 --> 00:08:08.519
That is because hyperbole identifies the `ert-deftest`
00:08:08.520 --> 00:08:12.679
as an implicit button. An implicit button is basically
00:08:12.680 --> 00:08:13.759
a string or pattern
00:08:13.760 --> 00:08:16.799
that Hyperbole has assigned some meaning to.
00:08:16.800 --> 00:08:19.959
For the string `ert-deftest`, it is to run the test case.
00:08:19.960 --> 00:08:24.559
You activate the button with the action-key.
00:08:24.560 --> 00:08:27.079
The standard binding is the middle mouse button,
00:08:27.080 --> 00:08:33.040
or from the keyboard, M-RET.
00:08:33.041 --> 00:08:34.799
So let's try that.
00:08:34.800 --> 00:08:42.219
We move the cursor here and then we type M-RET.
00:08:42.220 --> 00:08:47.959
And boom, the test case was executed.
00:08:47.960 --> 00:08:54.479
And to run it in debug mode we type C-u M-RET
00:08:54.480 --> 00:08:57.719
to get the assist key, and then we're in the debugger.
00:08:57.720 --> 00:09:10.479
So that's pretty useful and convenient.
NOTE Instrument function on the fly
00:09:10.480 --> 00:09:13.719
A related useful feature here is the step-in functionality
00:09:13.720 --> 00:09:16.399
bound to the letter `i` in `edebug-mode`.
00:09:16.400 --> 00:09:18.119
It allows you to step into a function
00:09:18.120 --> 00:09:20.479
and continue debugging from there.
00:09:20.480 --> 00:09:22.839
For the cases where your test does not do what you want,
00:09:22.840 --> 00:09:25.119
looking at what happens in the function under test
00:09:25.120 --> 00:09:37.259
can be really useful. Let's try that with another example.
00:09:37.260 --> 00:09:43.359
So here we have two helper functions, one `f1-add`,
00:09:43.360 --> 00:09:47.439
that uses the built-in `+` function
00:09:47.440 --> 00:09:52.239
and then we have `my-add` that uses that function.
00:09:52.240 --> 00:09:59.399
So we're going to test `my-add`.
00:09:59.400 --> 00:10:02.919
And then let's run this.
00:10:02.920 --> 00:10:05.959
Let's run this using hyperbole in debug mode
00:10:05.960 --> 00:10:10.079
C-u M-RET. We're in the debugger again,
00:10:10.080 --> 00:10:15.639
and let's step up front to my function under test
00:10:15.640 --> 00:10:19.359
and then press `i` for getting it instrumented
00:10:19.360 --> 00:10:23.019
and going into it for debugging.
00:10:23.020 --> 00:10:25.139
And here we can inspect that it's getting
00:10:25.140 --> 00:10:26.559
the arguments 1 and 3,
00:10:26.560 --> 00:10:30.999
and it returns the result 4 as expected.
00:10:31.000 --> 00:10:39.119
And yes, of course, our test case will then succeed.
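A sketch of what the two helpers and their test might look like. The function names `f1-add` and `my-add` come from the talk; the bodies, the argument names and the test name `my-add-test` are assumptions.

```elisp
(require 'ert)

;; Hypothetical reconstruction of the two helpers named in the talk.
(defun f1-add (a b)
  "Return the sum of A and B using the built-in `+'."
  (+ a b))

(defun my-add (a b)
  "Return the sum of A and B by delegating to `f1-add'."
  (f1-add a b))

(ert-deftest my-add-test ()
  "Verify that `my-add' adds 1 and 3 and returns 4."
  (should (= (my-add 1 3) 4)))
```

With the test instrumented by `edebug-defun`, pressing `i` while point is in front of the `(my-add 1 3)` call steps into `my-add`.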
NOTE Mocking
00:10:39.120 --> 00:10:41.839
The next tool in our toolbox is mocking.
00:10:41.840 --> 00:10:46.239
Mocking is needed when we want to simulate the response
00:10:46.240 --> 00:10:49.279
from a function used by the function under test.
00:10:49.280 --> 00:10:53.139
That is the implementation of the function.
00:10:53.140 --> 00:10:56.119
This could be for various reasons.
00:10:56.120 --> 00:11:00.879
One example could be because it would be hard or impossible
00:11:00.880 --> 00:11:04.199
in the test setup to get the behavior you want to test for,
00:11:04.200 --> 00:11:06.279
like an external error case.
00:11:06.280 --> 00:11:08.679
But the mock can also be used to verify
00:11:08.680 --> 00:11:11.619
that the function is called with a specific argument.
00:11:11.620 --> 00:11:14.559
We can view it as a way to isolate the function under test
00:11:14.560 --> 00:11:16.719
from its dependencies.
00:11:16.720 --> 00:11:18.959
So in order to test the function in isolation,
00:11:18.960 --> 00:11:22.079
we need to cut out any dependencies to external behavior.
00:11:22.080 --> 00:11:25.839
Most obvious would be dependencies to external resources,
00:11:25.840 --> 00:11:27.639
such as web pages. As an example:
00:11:27.640 --> 00:11:30.639
Hyperbole contains functionality to link you to
00:11:30.640 --> 00:11:34.239
social media resources and other resources on the net.
00:11:34.240 --> 00:11:37.899
Testing that would require the test system to call out
00:11:37.900 --> 00:11:39.639
to the social media resources
00:11:39.640 --> 00:11:43.539
and would depend on it being available, etc.
00:11:43.540 --> 00:11:45.479
Nothing technically stops a test case
00:11:45.480 --> 00:11:47.239
from depending on the external resources,
00:11:47.240 --> 00:11:51.319
but it would, if nothing else, be flaky or slow.
00:11:51.320 --> 00:11:53.759
It could be part of an end-to-end suite
00:11:53.760 --> 00:11:57.179
where we want to test that it works all the way.
00:11:57.180 --> 00:11:59.719
In this case, we want to look at the isolated case
00:11:59.720 --> 00:12:04.099
that can be run with no dependency on external resources.
00:12:04.100 --> 00:12:06.679
What you want to do is to replace the function with a mock
00:12:06.680 --> 00:12:10.339
that behaves as the real function would do.
00:12:10.340 --> 00:12:11.639
The package I have found
00:12:11.640 --> 00:12:14.319
and have used for mocking is `el-mock`.
00:12:14.320 --> 00:12:21.839
The workhorse in this package is the `with-mock` macro.
00:12:21.840 --> 00:12:26.519
It looks like this: `with-mock` followed by a body.
00:12:26.520 --> 00:12:30.439
In the execution of the body, stubs and mocks
00:12:30.440 --> 00:12:32.899
defined in the body are respected.
00:12:32.900 --> 00:12:39.199
Let's look at some examples to make that clearer.
00:12:39.200 --> 00:12:42.079
In this case, we have the macro `with-mock`.
00:12:42.080 --> 00:12:43.959
It works so that the expression
00:12:43.960 --> 00:12:48.639
`stub + => 10` is interpreted
00:12:48.640 --> 00:12:51.919
so that the function `+` will be replaced with the stub.
00:12:51.920 --> 00:12:56.779
The stub will return 10 regardless of how it is called.
00:12:56.780 --> 00:12:58.119
Note that the stub function
00:12:58.120 --> 00:13:00.199
does not have to be called at this level
00:13:00.200 --> 00:13:02.799
but could be called at any level in the call chain.
00:13:02.800 --> 00:13:07.479
By knowing how the function under test is implemented
00:13:07.480 --> 00:13:09.319
and how the implementation works,
00:13:09.320 --> 00:13:11.959
you can find function calls you want to mock
00:13:11.960 --> 00:13:14.999
to force certain behavior that you want to test,
00:13:15.000 --> 00:13:18.999
or to avoid calls to external resources, slow calls, etc.
00:13:19.000 --> 00:13:21.959
Simply isolate the function under test
00:13:21.960 --> 00:13:26.119
and simulate its environment.
00:13:26.120 --> 00:13:28.639
Mock is a little bit more sophisticated
00:13:28.640 --> 00:13:30.079
and depends on the arguments
00:13:30.080 --> 00:13:31.479
that the mock function is called with.
00:13:31.480 --> 00:13:33.847
Or more precisely, it is checked
00:13:33.848 --> 00:13:35.519
after the `with-mock` clause
00:13:35.520 --> 00:13:38.079
that the arguments match the arguments it was called with
00:13:38.080 --> 00:13:39.759
or even if it was called at all.
00:13:39.760 --> 00:13:41.839
If it is called with other arguments
00:13:41.840 --> 00:13:43.719
there will be an error,
00:13:43.720 --> 00:13:46.479
and if it's not called, it is also an error.
00:13:46.480 --> 00:13:48.359
So this way, we are sure that the function
00:13:48.360 --> 00:13:51.319
we expected to be called actually was called.
00:13:51.320 --> 00:13:53.399
An important piece of the testing.
00:13:53.400 --> 00:13:56.239
So we are sure that the mock we have provided
00:13:56.240 --> 00:14:03.999
actually is triggered by the test case.
00:14:04.000 --> 00:14:08.159
So here we have an example of `with-mock`
00:14:08.160 --> 00:14:18.879
where the `f1-add` function is mocked,
00:14:18.880 --> 00:14:21.999
so that if it's called with 2 and 3 as arguments,
00:14:22.000 --> 00:14:24.919
it will return 10. Then we have a test case
00:14:24.920 --> 00:14:27.999
where we try the `my-add` function,
00:14:28.000 --> 00:14:30.319
as you might remember, and call that with 2 and 3
00:14:30.320 --> 00:14:32.799
and see that it should also then return 10
00:14:32.800 --> 00:14:41.239
because it's using `f1-add`.
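A sketch of a stub and a mock with `el-mock`, assuming the third-party `el-mock` package is installed. The helper bodies and the test names are assumptions; the example stubs `f1-add` rather than the built-in `+`.

```elisp
(require 'ert)
(require 'el-mock) ; third-party package; assumed to be installed

;; Hypothetical helpers, matching the names used in the talk.
(defun f1-add (a b) (+ a b))
(defun my-add (a b) (f1-add a b))

(ert-deftest my-add-stub-test ()
  "A stub returns its value regardless of the arguments."
  (with-mock
    (stub f1-add => 10)
    (should (= (my-add 1 1) 10))))

(ert-deftest my-add-mock-test ()
  "A mock also checks the arguments and that the call happened."
  (with-mock
    ;; Signals an error if `f1-add' is called with other arguments,
    ;; or if it is never called inside `with-mock'.
    (mock (f1-add 2 3) => 10)
    (should (= (my-add 2 3) 10))))
```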
NOTE cl-letf
00:14:41.240 --> 00:14:44.559
Moving over to `cl-letf`.
00:14:44.560 --> 00:14:47.679
On rare occasions, the limitations of `el-mock` mean
00:14:47.680 --> 00:14:50.239
you would want to implement a full-fledged function
00:14:50.240 --> 00:14:52.979
to be used under test.
00:14:52.980 --> 00:14:55.439
Then the macro `cl-letf` can be useful.
00:14:55.440 --> 00:14:57.879
However, you need to handle the case yourself
00:14:57.880 --> 00:15:00.099
if the function was not called.
00:15:00.100 --> 00:15:03.519
Looking through the test cases where I have used `cl-letf`,
00:15:03.520 --> 00:15:06.119
I think most can be implemented using plain mocking.
00:15:06.120 --> 00:15:11.239
The cases left are where the args to the mock might be different
00:15:11.240 --> 00:15:13.739
due to environment issues.
00:15:13.740 --> 00:15:24.099
In that case, a static mock will not work.
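A sketch of replacing a function with `cl-letf`, using the same hypothetical helpers. The replacement body and the `called` flag are illustrative assumptions; they show the manual bookkeeping needed to detect that the replacement was actually called.

```elisp
(require 'ert)
(require 'cl-lib)

;; Hypothetical helpers, matching the names used in the talk.
(defun f1-add (a b) (+ a b))
(defun my-add (a b) (f1-add a b))

(ert-deftest my-add-cl-letf-test ()
  "Replace `f1-add' with a hand-written function during the test."
  (let ((called nil))
    (cl-letf (((symbol-function 'f1-add)
               (lambda (a b)
                 ;; Record the call; unlike a mock, nothing checks this for us.
                 (setq called t)
                 (* 10 (+ a b)))))
      (should (= (my-add 2 3) 50)))
    ;; Handle the "was it called at all?" case explicitly.
    (should called)))
```

The original definition of `f1-add` is restored when the `cl-letf` form exits.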
NOTE Hooks
00:15:24.100 --> 00:15:30.719
Another trick is for functions that use hooks.
00:15:30.720 --> 00:15:35.639
You can overload or replace the hooks to do the testing.
00:15:35.640 --> 00:15:40.759
So you can use the hook function just to do the verification
00:15:40.760 --> 00:15:43.119
and not do anything useful in the hook.
00:15:43.120 --> 00:15:45.079
Also, here you need to be careful
00:15:45.080 --> 00:15:55.719
to make sure the test handler is called and nothing else.
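A sketch of the hook technique. The hook `my-demo-hook` and the function `my-demo-run` are hypothetical names invented for illustration; they are not part of Hyperbole.

```elisp
(require 'ert)

;; Hypothetical hook and function under test, for illustration only.
(defvar my-demo-hook nil
  "Hook run by `my-demo-run'.")

(defun my-demo-run ()
  "Do some work and then run `my-demo-hook'."
  (run-hooks 'my-demo-hook))

(ert-deftest my-demo-hook-test ()
  "Verify that `my-demo-run' runs the hook exactly once."
  (let* ((calls 0)
         ;; Rebind the hook so only the test handler is on it.
         (my-demo-hook (list (lambda () (setq calls (1+ calls))))))
    (my-demo-run)
    (should (= calls 1))))
```

Binding the hook variable with `let` keeps any globally installed handlers out of the test and restores the hook afterwards.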
NOTE Side effects and initial buffer state
00:15:55.720 --> 00:15:57.679
So far we have been talking about testing
00:15:57.680 --> 00:15:59.039
and what the function returns.
00:15:59.040 --> 00:16:01.119
In the best of worlds, we have a pure function
00:16:01.120 --> 00:16:02.959
that only depends on its arguments
00:16:02.960 --> 00:16:04.939
and produces no side effects.
00:16:04.940 --> 00:16:06.899
Many operations produce side effects
00:16:06.900 --> 00:16:09.479
or operate on the contents of buffers
00:16:09.480 --> 00:16:12.379
such as writing a message in the message buffer,
00:16:12.380 --> 00:16:15.659
changing the state of a buffer, moving point, etc.
00:16:15.660 --> 00:16:18.859
Hyperbole is not an exception. Quite the contrary.
00:16:18.860 --> 00:16:20.839
Many of the functions creating links
00:16:20.840 --> 00:16:24.420
are just about updating buffers.
00:16:24.421 --> 00:16:28.559
This poses a special problem for tests.
00:16:28.560 --> 00:16:29.839
The test gets longer
00:16:29.840 --> 00:16:31.919
since you need to create buffers and files,
00:16:31.920 --> 00:16:33.279
initialize the contents.
00:16:33.280 --> 00:16:35.159
Verifying the outcome becomes trickier
00:16:35.160 --> 00:16:39.019
since you need to make sure you look at the right place.
00:16:39.020 --> 00:16:41.039
At the end of the test, you need to clean up,
00:16:41.040 --> 00:16:43.439
both for not leaving a lot of garbage
00:16:43.440 --> 00:16:45.279
in buffers and files around,
00:16:45.280 --> 00:16:48.479
and even worse, not cause later tests
00:16:48.480 --> 00:16:50.959
to depend on the leftovers from the other tests.
00:16:50.960 --> 00:16:53.079
Here are some functions and variables
00:16:53.080 --> 00:17:05.099
I have found useful for this.
NOTE with-temp-buffer
00:17:05.100 --> 00:17:09.199
For creating tests: `with-temp-buffer`:
00:17:09.200 --> 00:17:11.919
it provides you a temp buffer that you visit,
00:17:11.920 --> 00:17:13.719
and afterwards, there is no need to clean up.
00:17:13.720 --> 00:17:16.519
This is the first choice if that is all you need.
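A small sketch of a test that works in a temporary buffer. The test name and the buffer contents are assumptions chosen for illustration.

```elisp
(require 'ert)

(ert-deftest temp-buffer-test ()
  "Operate on a scratch buffer that is killed automatically."
  (with-temp-buffer
    (insert "hello world")
    (goto-char (point-min))
    (forward-word 1)
    ;; Point is now right after "hello".
    (should (= (point) 6))
    (should (looking-at-p " world"))))
```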
NOTE make-temp-file
00:17:16.520 --> 00:17:20.519
`make-temp-file`: If you need a file,
00:17:20.520 --> 00:17:21.959
this is the function to use.
00:17:21.960 --> 00:17:24.279
It creates a temp file or a directory.
00:17:24.280 --> 00:17:26.959
The file can be filled with initial contents.
00:17:26.960 --> 00:17:31.019
This needs to be cleaned up after a test.
00:17:31.020 --> 00:17:33.287
Moving on to verifying and debugging:
NOTE buffer-string
00:17:33.288 --> 00:17:38.247
`buffer-string`: returns the full contents
00:17:38.248 --> 00:17:39.499
of the buffer as a string.
00:17:39.500 --> 00:17:41.399
That can sound a bit voluminous,
00:17:41.400 --> 00:17:46.139
but since tests are normally small, this often works well.
00:17:46.140 --> 00:17:48.439
I have in particular found good use of comparing
00:17:48.440 --> 00:17:50.399
the contents of buffers with the empty string.
00:17:50.400 --> 00:17:53.359
That would give an error, but as we have seen
00:17:53.360 --> 00:17:56.079
with the output produced by the `should` assertion,
00:17:56.080 --> 00:17:58.079
this is almost like a print statement
00:17:58.080 --> 00:18:01.199
and can be compared with the good old technique
00:18:01.200 --> 00:18:04.399
of debugging with print statements.
00:18:04.400 --> 00:18:06.247
There might be other ways to do the same
00:18:06.248 --> 00:18:09.919
as we saw with debugging.
NOTE buffer-name
00:18:09.920 --> 00:18:13.719
`buffer-name`: Getting the buffer name is good
00:18:13.720 --> 00:18:16.239
to verify what buffer we are looking at.
00:18:16.240 --> 00:18:18.359
I often found it useful to check
00:18:18.360 --> 00:18:21.119
that my assumptions on what buffer I am acting on
00:18:21.120 --> 00:18:23.399
are correct by adding `should` clauses
00:18:23.400 --> 00:18:25.399
in the middle of the test execution
00:18:25.400 --> 00:18:27.399
or after preparing the test input.
00:18:27.400 --> 00:18:31.679
Sometimes Emacs can switch buffers in strange ways,
00:18:31.680 --> 00:18:34.199
maybe because the test case is badly written,
00:18:34.200 --> 00:18:37.239
and making sure your assumptions are correct
00:18:37.240 --> 00:18:40.339
is a good sanity check.
00:18:40.340 --> 00:18:42.239
Even the ert package does
00:18:42.240 --> 00:18:44.879
some buffer and windows manipulation for its reporting
00:18:44.880 --> 00:18:47.487
that I have not fully learned how to master,
00:18:47.488 --> 00:18:51.979
so assertions for checking the sanity of the test are good.
NOTE major-mode
00:18:51.980 --> 00:18:55.679
Finally, `major-mode`: Verify the buffer has the proper mode.
00:18:55.680 --> 00:19:02.679
Can also be very useful and is a good sanity check.
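A sketch combining the three checks, assuming a temporary buffer in `text-mode`; the test name and contents are illustrative. Temporarily changing the expected string in the last assertion to `""` makes ERT report the actual buffer contents, which is the print-statement trick mentioned earlier.

```elisp
(require 'ert)

(ert-deftest buffer-sanity-test ()
  "Check buffer identity, mode and contents around an edit."
  (with-temp-buffer
    (let ((buf (current-buffer)))
      (text-mode)
      (insert "abc")
      (upcase-region (point-min) (point-max))
      ;; Sanity checks: still in the buffer we think we are in.
      (should (eq (current-buffer) buf))
      (should (string-prefix-p " *temp*" (buffer-name)))
      (should (eq major-mode 'text-mode))
      ;; The actual verification of the outcome.
      (should (string= (buffer-string) "ABC")))))
```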
NOTE unwind-protect
00:19:02.680 --> 00:19:06.599
Finally, cleaning up. `unwind-protect`.
00:19:06.600 --> 00:19:09.039
The tool for cleaning up is the `unwind-protect` form
00:19:09.040 --> 00:19:12.479
which ensures that the unwind forms
00:19:12.480 --> 00:19:15.439
always are executed regardless of the outcome of the body.
00:19:15.440 --> 00:19:20.419
So if your test fails, you are sure the cleanup is executed.
00:19:20.420 --> 00:19:22.759
Let's look at `unwind-protect` together with
00:19:22.760 --> 00:19:30.519
the temporary file example. Many tests look like this.
00:19:30.520 --> 00:19:35.279
You create some resource, you call `unwind-protect`,
00:19:35.280 --> 00:19:42.759
you do the test, and then afterwards you do the cleanup.
00:19:42.760 --> 00:19:46.359
The cleanup for a file and a buffer is so common,
00:19:46.360 --> 00:19:50.999
so I have created a helper for that.
00:19:51.000 --> 00:19:56.559
It looks like this.
00:19:56.560 --> 00:19:59.179
The trick with the `buffer-modified` flag
00:19:59.180 --> 00:20:00.719
is to avoid getting prompted
00:20:00.720 --> 00:20:03.219
for killing a buffer that is not saved.
00:20:03.220 --> 00:20:05.439
The test buffers are often in the state
00:20:05.440 --> 00:20:15.099
where they have not been saved but modified.
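A sketch of the pattern with a cleanup helper. The helper name `my-test-delete-file-and-buffer` and the test name are hypothetical; this is not the helper shown in the talk, only an illustration of the same idea.

```elisp
(require 'ert)

(defun my-test-delete-file-and-buffer (file)
  "Kill the buffer visiting FILE, if any, and delete FILE."
  (let ((buf (find-buffer-visiting file)))
    (when buf
      (with-current-buffer buf
        ;; Clear the modified flag to avoid a prompt when killing.
        (set-buffer-modified-p nil))
      (kill-buffer buf)))
  (when (file-exists-p file)
    (delete-file file)))

(ert-deftest temp-file-test ()
  "Create a temp file, test against it, and always clean up."
  (let ((file (make-temp-file "my-test-" nil ".txt" "hello")))
    (unwind-protect
        (with-current-buffer (find-file-noselect file)
          (goto-char (point-max))
          (insert " world")     ; buffer is now modified but not saved
          (should (string= (buffer-string) "hello world")))
      ;; Runs whether the body passed, failed or signaled an error.
      (my-test-delete-file-and-buffer file))))
```

The fourth argument to `make-temp-file`, the initial text, requires Emacs 26 or later.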
NOTE Input, with-simulated-input
00:20:15.100 --> 00:20:19.679
Another problem for tests is input.
00:20:19.680 --> 00:20:21.559
In the middle of execution a function
00:20:21.560 --> 00:20:24.039
might want to have some interaction with the user.
00:20:24.040 --> 00:20:26.959
Testing this poses a problem, not only in that
00:20:26.960 --> 00:20:31.199
the input matters, but also as how even to get the test case
00:20:31.200 --> 00:20:34.079
to recognize the input!?
00:20:34.080 --> 00:20:36.039
Ideally the tests are run in batch mode,
00:20:36.040 --> 00:20:38.919
which in some sense means no user interaction.
00:20:38.920 --> 00:20:42.999
In batch mode, there is no event loop running.
00:20:43.000 --> 00:20:47.179
Fortunately, there is a package `with-simulated-input`
00:20:47.180 --> 00:20:53.259
that gets you around these issues.
00:20:53.260 --> 00:20:55.399
This is a macro that allows us
00:20:55.400 --> 00:20:56.999
to define a set of characters
00:20:57.000 --> 00:20:59.079
that will be read by the function under the test,
00:20:59.080 --> 00:21:02.579
and all of this works in batch mode. It looks like this.
00:21:02.580 --> 00:21:04.159
We have `with-simulated-input`,
00:21:04.160 --> 00:21:09.839
and then a string of characters, and then a body.
00:21:09.840 --> 00:21:11.647
The form takes a string of keys
00:21:11.648 --> 00:21:13.119
and runs the rest of the body,
00:21:13.120 --> 00:21:15.439
and if there is input required,
00:21:15.440 --> 00:21:18.119
it is picked from the string of keys.
00:21:18.120 --> 00:21:20.421
In our example, the `read-string` call
00:21:20.422 --> 00:21:21.719
will read up until RET,
00:21:21.720 --> 00:21:26.119
and then return the characters read.
00:21:26.120 --> 00:21:29.639
As you see in the example, space needs to be provided
00:21:29.640 --> 00:21:38.459
by the string SPC, and return by the string RET.
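A sketch of such a test, assuming the third-party `with-simulated-input` package is installed. The prompt text, the key string and the test name are assumptions.

```elisp
(require 'ert)
(require 'with-simulated-input) ; third-party package; assumed to be installed

(ert-deftest simulated-input-test ()
  "Feed keys to `read-string' without a user at the keyboard."
  (should (string= (with-simulated-input "hello SPC world RET"
                     (read-string "Say something: "))
                   "hello world")))
```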
NOTE Running all tests
00:21:38.460 --> 00:21:40.799
So now we have seen ways to create test cases
00:21:40.800 --> 00:21:43.219
and even make it possible to run some of them
00:21:43.220 --> 00:21:44.679
that have I/O in batch mode.
00:21:44.680 --> 00:21:47.279
But the initial goal was to run them all at once.
00:21:47.280 --> 00:21:48.919
How do you do that?
00:21:48.920 --> 00:21:51.759
Let's go back to the `ert` command.
00:21:51.760 --> 00:21:53.799
It prompts for a test selector.
00:21:53.800 --> 00:21:56.279
If we give it the selector `t`,
00:21:56.280 --> 00:21:59.259
it will run all tests we have currently defined.
00:21:59.260 --> 00:22:05.779
Let's try that with the subset of the Hyperbole tests.
00:22:05.780 --> 00:22:09.559
Here is the test folder in the Hyperbole directory.
00:22:09.560 --> 00:22:18.819
Let's go up here and load all the demo tests.
00:22:18.820 --> 00:22:21.207
And then try to run `ert`.
00:22:21.208 --> 00:22:26.119
Now we see that we have a bunch of test cases.
00:22:26.120 --> 00:22:27.919
We can run them all individually,
00:22:27.920 --> 00:22:31.719
but we can run them with `t` instead.
00:22:31.720 --> 00:22:35.459
We will run them all at once.
00:22:35.460 --> 00:22:51.419
So now, ert is executing all our test cases.
00:22:51.420 --> 00:22:57.079
So here we have a nice green display
00:22:57.080 --> 00:23:03.219
with all the test cases.
NOTE Batch mode
00:23:03.220 --> 00:23:08.159
So that was fine, but we were still running it manually
00:23:08.160 --> 00:23:11.980
by calling ert. How could we run it from the command line?
00:23:17.180 --> 00:23:21.499
ERT comes with functions for running it in batch mode.
00:23:21.500 --> 00:23:25.639
For Hyperbole, we use `make` for repetitive tasks.
00:23:25.640 --> 00:23:27.119
So we have a make target
00:23:27.120 --> 00:23:29.279
that uses the ert batch functionality,
00:23:29.280 --> 00:23:33.259
and this is the line from the Makefile.
00:23:33.260 --> 00:23:35.479
This is a bit detailed,
00:23:35.480 --> 00:23:37.539
but you see that we have a part here
00:23:37.540 --> 00:23:40.779
where we load the test dependencies.
00:23:40.780 --> 00:23:43.520
For getting the packages
00:23:43.521 --> 00:23:48.459
such as `el-mock` and `with-simulated-input` etc. loaded.
00:23:48.460 --> 00:23:53.559
We also have... I also want to point out here the call to
00:23:53.560 --> 00:23:58.159
or the setting of `auto-save-default` to `nil`
00:23:58.160 --> 00:24:02.439
to get rid of the prompt for excessive backup files
00:24:02.440 --> 00:24:05.059
that can pile up after running the tests a few times.
NOTE Skipping tests
00:24:05.060 --> 00:24:06.879
Even with the help of simulated input,
00:24:06.880 --> 00:24:08.919
not all tests can be run in batch mode.
00:24:08.920 --> 00:24:10.559
They would simply not work there
00:24:10.560 --> 00:24:12.439
and have to be run in an interactive Emacs
00:24:12.440 --> 00:24:14.179
with the running event loop.
00:24:14.180 --> 00:24:17.919
One trick still to be able to use batch mode for automation
00:24:17.920 --> 00:24:20.319
is to put the guard at the top of each test case
00:24:20.320 --> 00:24:22.559
as the first thing to be executed,
00:24:22.560 --> 00:24:25.719
so that it kicks in before anything else and stops Emacs
00:24:25.720 --> 00:24:27.199
from trying to run the test case.
00:24:27.200 --> 00:24:35.519
Now, it looks like this: `(skip-unless (not noninteractive))`.
00:24:35.520 --> 00:24:38.639
So when ert sees that the test should be skipped, it skips it
00:24:38.640 --> 00:24:40.439
and makes a note of that,
00:24:40.440 --> 00:24:44.579
so you will see how many tests have been skipped.
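A sketch of a guarded test. The test name and body are assumptions; only the `skip-unless` guard is taken from the talk.

```elisp
(require 'ert)

(ert-deftest interactive-only-test ()
  "Run only in an interactive Emacs; skipped in batch mode."
  ;; The guard comes first so nothing else runs in batch mode.
  (skip-unless (not noninteractive))
  (should (= (+ 2 3) 5)))
```

In batch mode `noninteractive` is non-nil, so ERT records the test as skipped rather than passed or failed.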
00:24:44.580 --> 00:24:47.559
Too bad. We have a number of test cases defined,
00:24:47.560 --> 00:24:51.359
and to run them, we need to run them manually. Well sort of.
00:24:51.360 --> 00:24:53.807
Not being able to run all tests easily
00:24:53.808 --> 00:24:58.419
is a bit counterproductive
00:24:58.420 --> 00:25:00.999
since our goal is to run all tests.
00:25:01.000 --> 00:25:04.719
There is however no ert function to run tests in batch mode
00:25:04.720 --> 00:25:06.779
with an interactive Emacs.
00:25:06.780 --> 00:25:08.479
The closest I have got is either
00:25:08.480 --> 00:25:10.079
to start the Emacs from the command line
00:25:10.080 --> 00:25:12.439
calling the ert function as we just have seen,
00:25:12.440 --> 00:25:14.799
and then killing it manually when done;
00:25:14.800 --> 00:25:19.599
or add a function to extract the contents of the ERT buffer
00:25:19.600 --> 00:25:24.599
when done and echo it to standard output.
00:25:24.600 --> 00:25:27.800
This is how it looks in the Makefile
00:25:27.801 --> 00:25:31.207
to get the behavior of cut and paste,
00:25:31.208 --> 00:25:34.580
getting the ERT output into a file
00:25:34.581 --> 00:25:36.239
so we can then kill Emacs
00:25:36.240 --> 00:25:44.799
and spit out the content of the ERT buffer.
00:25:44.800 --> 00:25:47.739
One final word here is that
00:25:47.740 --> 00:25:54.559
when you run this in a continuous integration pipeline,
00:25:54.560 --> 00:25:59.399
you might not have a TTY for getting Emacs to start,
00:25:59.400 --> 00:26:03.200
and that is then another problem
00:26:03.201 --> 00:26:05.160
with getting the interactive mode.
NOTE Conclusion
00:26:08.460 --> 00:26:11.120
We have reached the end of the talk.
00:26:11.121 --> 00:26:14.159
If you have any new ideas
00:26:14.160 --> 00:26:16.759
or have some suggestions for improvements,
00:26:16.760 --> 00:26:18.239
feel free to reach out
00:26:18.240 --> 00:26:21.100
because I am still on the learning curve of writing,
00:26:21.101 --> 00:26:25.299
how to write good test cases.
00:26:25.300 --> 00:26:27.639
If you look at the test cases we have in Hyperbole
00:26:27.640 --> 00:26:29.799
and you think they might contradict what I am saying here,
00:26:29.800 --> 00:26:32.579
it is OK. It is probably right.
00:26:32.580 --> 00:26:34.599
I have changed the style as I go
00:26:34.600 --> 00:26:36.639
and we have not yet refactored all tests
00:26:36.640 --> 00:26:38.579
to benefit from new designs.
00:26:38.580 --> 00:26:40.599
That is also the beauty of the test case.
00:26:40.600 --> 00:26:43.319
As long as it serves its purpose, it is not terrible
00:26:43.320 --> 00:26:47.799
if it is not optimal or does not have the best style.
00:26:47.800 --> 00:26:55.240
And yes, thanks for listening. Bye.