by

To mock, or not to mock

During a mentoring session recently, I had a conversation about using mocks in unit tests: when and how they are appropriate, when I feel they enhance tests, and what I don't like about them. Here are my thoughts in writing for future reference.

First, I use several mocking-related terms here, and I go by George Meszaros's definitions for those words. Here they are in a snappier form as they appear on Martin Fowler's article on this very same topic (which is a recommended read on this subject):

Meszaros uses the term Test Double as the generic term for any kind of pretend object used in place of a real object for testing purposes (…) [He] defined five particular kinds of double:

Once that's out of the way, let's get to the code that was subject of the conversation. It included a test that I wasn't convinced about, and looked a bit like this:

 1describe QueueFlushWorker do
 2  it "clears half of the dead jobs" do
 3    some_dead_jobs_stub = double(clear!: true)
 4    all_dead_jobs_stub = double(first: some_dead_jobs_stub)
 5    dead_queue_stub = double(size: 4, jobs: all_dead_jobs_stub)
 6    allow(Queue).to receive(:read).and_return(dead_queue_stub)
 7
 8    worker = QueueFlushWorker.new
 9    worker.process
10
11    expect(all_dead_jobs_stub).to have_received(:first).with(2)
12    expect(some_dead_jobs_stub).to have_received(:clear!)
13  end
14end

And tested code that looked a bit like this:

1class QueueFlushWorker
2  def process
3    dead_queue = Queue.read(:dead)
4    num_dead = dead_queue.size
5    dead_jobs = dead_queue.jobs.first(num_dead / 2)
6    dead_jobs.clear!
7  end
8end

The problem I had with that test is that it knows too much about what the subject of the test is doing under the hood. It knows of every method call being made, and almost the order in which the code will run. I feel it's essentially writing the same code again and, as a result, if the code is wrong the test will pass happily anyway. It doesn't test the behaviour as much as specifics of the implementation, which are mocked and do nothing.

In this scenario, I would have written the test like follows:

 1describe QueueFlushWorker do
 2  it "clears half the dead jobs" do
 3    4.times.each { Queue::Dead.add(Queue::Job.new) }
 4    expect(Queue::Dead.count).to eq(4)
 5
 6    worker = QueueFlushWorker.new
 7    worker.process
 8
 9    expect(Queue::Dead.count).to eq(2)
10  end
11end

This is called state verification. We set up the test so that the state of the system is A, we exercise the code (ie: run the code we are testing), and then we check that the state of the system is now B. This doesn't involve knowing what exactly happened inside.

(Unfortunately, in the specific case we were working on, count would not return an updated count straightaway, so this test didn't work.)

Now, this doesn't mean that doubles should not be used. There are perfectly valid use cases for them and they can be a great tool. Consider the following spec:

 1describe FileBrowserWithInjection do
 2  describe "#files_with_extension" do
 3    it "returns the files with the given extension" do
 4      f1 = double(name: "file1.rb")
 5      f2 = double(name: "file2.py")
 6      f3 = double(name: "file3.rb")
 7      files = [f1, f2, f3]
 8      filesystem = double(files: files)
 9      browser = FileBrowserWithInjection.new(filesystem)
10
11      result = browser.files_with_extension("rb")
12      expect(result).to match_array([f1, f3])
13    end
14  end
15end

That's a lot of stubs, and it could feel suspicious. However I'm ok with this code because it's using dependency injection, and it's only using stubs to fake the thing that will be injected. Ultimately, it's the final result that is being checked, which still falls under state verification.

A quick break to explain Dependency Injection if you are not familiar: it means that we tell object A to use object B internally, by providing B as an argument or similar. This way, A doesn't need to know what class is B, and can be fed anything that implements a pre-arranged interface. It's particularly handy for having several implementations of the same abstract concept. For example, a notifier class (the "client") can receive a Twitter adapter or a Slack adapter (the "service"), with the same interfaces, and the notifier doesn't need to know which one it is using to send a notification.

In this example, A is FileBrowserWithInjection, while B is something we call filesystem and is completely faked with stubs. We use doubles to create a fake filesystem that will be used internally. We don't care about the order methods are called, which arguments they receive, etc. This could be done instead with fake classes, like follows:

 1describe FileBrowserWithInjection do
 2  FakeFile = Struct.new(:name)
 3  FakeFilesystem = Struct.new(:files)
 4
 5  describe "#files_with_extension" do
 6    it "returns the files with the given extension" do
 7      f1 = FakeFile.new("file1.rb")
 8      f2 = FakeFile.new("file2.py")
 9      f3 = FakeFile.new("file3.rb")
10      filesystem = FakeFilesystem.new([f1, f2, f3])
11      browser = FileBrowserWithInjection.new(filesystem)
12
13      result = browser.files_with_extension("rb")
14      expect(result).to match_array([f1, f3])
15    end
16  end
17end

The doubles in the previous example just provided a handy shortcut to the fake classes used in this latest example.

Now have a look at this variant:

 1describe FileBrowserWithGlobals do
 2  describe "#files_with_extension" do
 3    it "returns the files with the given extension" do
 4      f1 = double(name: "file1.rb")
 5      f2 = double(name: "file2.py")
 6      f3 = double(name: "file3.rb")
 7      files = [f1, f2, f3]
 8      allow(FileSystem).to receive(:files).and_return(files)
 9      browser = FileBrowserWithGlobals.new
10
11      result = browser.files_with_extension("rb")
12      expect(result).to match_array([f1, f3])
13    end
14  end
15end

It's pretty much the same, but in this case the class under test (FileBrowserWithGlobals) knows that it's using FileSystem internally, and there's no changing that. As a result, we have to stub FileSystem with allow. This means our test knows a bit more about what's going on behind the hood than in the previous example.

Often there's no escaping this. I don't mean to say that we should use dependency injection everywhere. For example, you'll have code that calls ActiveRecord models directly, and that's fine. Dependency Injection is just another useful tool in your belt.

As for testing that things have been called, perhaps with specific arguments and in a specific order, that can be useful too sometimes. For example when testing a cache. Here's an example:

 1describe MemoryCache do
 2  it "only accesses the source once" do
 3    source = double(foobar: nil)
 4    cache = MemoryCache.new(source)
 5
 6    cache.read(:foobar)
 7    cache.read(:foobar)
 8
 9    expect(source).to have_received(:foobar).once
10  end
11end

Using state verification, we can't tell easily if the source was hit once or twice. It's much easier to use a mock (a spy here actually!) and verify it at the end.

(OK, you could have a fake class that behaves as the source, and counts the number of hits to foobar, and do state verification on that; but that would be more work).

Having said all that, there's still a school of thought that approves of doing more mocking than I normally like. What I'm explaining here is not gospel, but it's the sort of things we should be thinking about when using these tools, and the trade-offs to consider.