During a mentoring session recently, I had a conversation about using mocks in unit tests: when and how they are appropriate, when I feel they enhance tests, and what I don't like about them. Here are my thoughts in writing for future reference.
First, I use several mocking-related terms here, and I go by George Meszaros's definitions for those words. Here they are in a snappier form as they appear on Martin Fowler's article on this very same topic (which is a recommended read on this subject):
Meszaros uses the term Test Double as the generic term for any kind of pretend object used in place of a real object for testing purposes (…) [He] defined five particular kinds of double:
- Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
- Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
- Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test.
- Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages it was sent.
- Mocks are what we are talking about here: objects pre-programmed with expectations which form a specification of the calls they are expected to receive.
Once that's out of the way, let's get to the code that was subject of the conversation. It included a test that I wasn't convinced about, and looked a bit like this:
1describe QueueFlushWorker do 2 it "clears half of the dead jobs" do 3 some_dead_jobs_stub = double(clear!: true) 4 all_dead_jobs_stub = double(first: some_dead_jobs_stub) 5 dead_queue_stub = double(size: 4, jobs: all_dead_jobs_stub) 6 allow(Queue).to receive(:read).and_return(dead_queue_stub) 7 8 worker = QueueFlushWorker.new 9 worker.process 10 11 expect(all_dead_jobs_stub).to have_received(:first).with(2) 12 expect(some_dead_jobs_stub).to have_received(:clear!) 13 end 14end
And tested code that looked a bit like this:
The problem I had with that test is that it knows too much about what the subject of the test is doing under the hood. It knows of every method call being made, and almost the order in which the code will run. I feel it's essentially writing the same code again and, as a result, if the code is wrong the test will pass happily anyway. It doesn't test the behaviour as much as specifics of the implementation, which are mocked and do nothing.
In this scenario, I would have written the test like follows:
This is called state verification. We set up the test so that the state of the system is A, we exercise the code (ie: run the code we are testing), and then we check that the state of the system is now B. This doesn't involve knowing what exactly happened inside.
(Unfortunately, in the specific case we were working on,
count would not return an updated count straightaway, so this test didn't work.)
Now, this doesn't mean that doubles should not be used. There are perfectly valid use cases for them and they can be a great tool. Consider the following spec:
1describe FileBrowserWithInjection do 2 describe "#files_with_extension" do 3 it "returns the files with the given extension" do 4 f1 = double(name: "file1.rb") 5 f2 = double(name: "file2.py") 6 f3 = double(name: "file3.rb") 7 files = [f1, f2, f3] 8 filesystem = double(files: files) 9 browser = FileBrowserWithInjection.new(filesystem) 10 11 result = browser.files_with_extension("rb") 12 expect(result).to match_array([f1, f3]) 13 end 14 end 15end
That's a lot of stubs, and it could feel suspicious. However I'm ok with this code because it's using dependency injection, and it's only using stubs to fake the thing that will be injected. Ultimately, it's the final result that is being checked, which still falls under state verification.
A quick break to explain Dependency Injection if you are not familiar: it means that we tell object A to use object B internally, by providing B as an argument or similar. This way, A doesn't need to know what class is B, and can be fed anything that implements a pre-arranged interface. It's particularly handy for having several implementations of the same abstract concept. For example, a notifier class (the "client") can receive a Twitter adapter or a Slack adapter (the "service"), with the same interfaces, and the notifier doesn't need to know which one it is using to send a notification.
In this example, A is
FileBrowserWithInjection, while B is something we call
filesystem and is completely faked with stubs. We use doubles to create a fake filesystem that will be used internally. We don't care about the order methods are called, which arguments they receive, etc. This could be done instead with fake classes, like follows:
1describe FileBrowserWithInjection do 2 FakeFile = Struct.new(:name) 3 FakeFilesystem = Struct.new(:files) 4 5 describe "#files_with_extension" do 6 it "returns the files with the given extension" do 7 f1 = FakeFile.new("file1.rb") 8 f2 = FakeFile.new("file2.py") 9 f3 = FakeFile.new("file3.rb") 10 filesystem = FakeFilesystem.new([f1, f2, f3]) 11 browser = FileBrowserWithInjection.new(filesystem) 12 13 result = browser.files_with_extension("rb") 14 expect(result).to match_array([f1, f3]) 15 end 16 end 17end
The doubles in the previous example just provided a handy shortcut to the fake classes used in this latest example.
Now have a look at this variant:
1describe FileBrowserWithGlobals do 2 describe "#files_with_extension" do 3 it "returns the files with the given extension" do 4 f1 = double(name: "file1.rb") 5 f2 = double(name: "file2.py") 6 f3 = double(name: "file3.rb") 7 files = [f1, f2, f3] 8 allow(FileSystem).to receive(:files).and_return(files) 9 browser = FileBrowserWithGlobals.new 10 11 result = browser.files_with_extension("rb") 12 expect(result).to match_array([f1, f3]) 13 end 14 end 15end
It's pretty much the same, but in this case the class under test (
FileBrowserWithGlobals) knows that it's using
FileSystem internally, and there's no changing that. As a result, we have to stub
allow. This means our test knows a bit more about what's going on behind the hood than in the previous example.
Often there's no escaping this. I don't mean to say that we should use dependency injection everywhere. For example, you'll have code that calls ActiveRecord models directly, and that's fine. Dependency Injection is just another useful tool in your belt.
As for testing that things have been called, perhaps with specific arguments and in a specific order, that can be useful too sometimes. For example when testing a cache. Here's an example:
Using state verification, we can't tell easily if the source was hit once or twice. It's much easier to use a mock (a spy here actually!) and verify it at the end.
(OK, you could have a fake class that behaves as the source, and counts the number of hits to
foobar, and do state verification on that; but that would be more work).
Having said all that, there's still a school of thought that approves of doing more mocking than I normally like. What I'm explaining here is not gospel, but it's the sort of things we should be thinking about when using these tools, and the trade-offs to consider.