Fast test, slow test by Gary Bernhardt from destroyallsoftware.com

Presenter: Gary Bernhardt (http://blog.extracheese.org/ / https://www.destroyallsoftware.com/) (@garybernhardt)

PyCon 2012 presentation page: https://us.pycon.org/2012/schedule/presentation/429/

Slides: ???

Video: http://pyvideo.org/video/631/fast-test-slow-test

Video running time: 31:50

Goals of tests

  1. Prevent regressions

The weakest of the goals. Doesn’t change the way you build the software minute to minute. At best it changes how you release the software. You don’t release broken things.

  2. Prevent fear (00:19)

Prevent fear minute to minute or second to second, so speed is important here. Enable refactoring.

  3. Prevent bad design (00:52)

Very subtle topic. Sort of out of bounds. The holy grail of testing.

(01:32) Video of a large system test running.

(02:06) A smaller, synthetic example to dissect for this talk – a test for a Django app: a discussion board.

Writing tests from the bottom up is often easier

def test_that_spam_posts_are_hidden(self):
    assert 'Spammy!' not in resp.content

def test_that_spam_posts_are_hidden(self):
    resp = self.client.get("/discussion/%s" % disc.pk)
    assert 'Spammy!' not in resp.content

def test_that_spam_posts_are_hidden(self):
    self.client.post("/mark_post_as_spam", {'post_id': post.pk})
    resp = self.client.get("/discussion/%s" % disc.pk)
    assert 'Spammy!' not in resp.content

def test_that_spam_posts_are_hidden(self):
    log_in(alice)
    self.client.post("/mark_post_as_spam", {'post_id': post.pk})
    resp = self.client.get("/discussion/%s" % disc.pk)
    assert 'Spammy!' not in resp.content

def test_that_spam_posts_are_hidden(self):
    disc = Discussion()
    disc.save()
    log_in(alice)
    self.client.post("/mark_post_as_spam", {'post_id': post.pk})
    resp = self.client.get("/discussion/%s" % disc.pk)
    assert 'Spammy!' not in resp.content

def test_that_spam_posts_are_hidden(self):
    disc = Discussion()
    disc.posts.append(Post("Spammy!", poster=bob))
    disc.save()
    log_in(alice)
    self.client.post("/mark_post_as_spam", {'post_id': post.pk})
    resp = self.client.get("/discussion/%s" % disc.pk)
    assert 'Spammy!' not in resp.content

def test_that_spam_posts_are_hidden(self):
    alice, bob = User(admin=True), User(admin=False)
    disc = Discussion()
    disc.posts.append(Post("Spammy!", poster=bob))
    disc.save()
    log_in(alice)
    self.client.post("/mark_post_as_spam", {'post_id': post.pk})
    resp = self.client.get("/discussion/%s" % disc.pk)
    assert 'Spammy!' not in resp.content

(03:47) Why is this a system test?

What does it depend on? What thing in this test could cause it to break?

def test_that_spam_posts_are_hidden(self):
    alice, bob = User(admin=True), User(admin=False)
                      FLAG
    disc = Discussion()
                      SIGNATURE
    disc.posts.append(Post("Spammy!", poster=bob))
         RELATIONSHIP      SIGNATURE
    disc.save()
         VALIDITY
    log_in(alice)
    AUTH
    self.client.post("/mark_post_as_spam", {'post_id': post.pk})
                     URL, SIGNATURE, PRECONDITIONS
    resp = self.client.get("/discussion/%s" % disc.pk)
                       URL, SIGNATURE, PRECONDITIONS
    assert 'Spammy!' not in resp.content
           REPRESENTATION (E.G., NOT AJAX)

Note that:

assert 'Spammy!' not in resp.content

is a negative assertion, which is dangerous: if we change the view to render a skeleton page and fill in the details later with AJAX requests, this assertion will always succeed, whether or not spam is actually hidden. That vacuous pass is the danger of negative assertions.
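One mitigation (not from the talk) is to pair the negative assertion with a positive one that proves the content actually rendered. The helper and post text below are hypothetical:

```python
# Hypothetical sketch: pair the dangerous negative assertion with a
# positive assertion, so the test fails if the page stops rendering posts.
def check_spam_hidden(content):
    # Positive assertion: the legitimate post must actually appear.
    # A skeleton-plus-AJAX page would fail here instead of passing vacuously.
    assert "A legitimate reply" in content
    # Negative assertion: the spam post must not appear.
    assert "Spammy!" not in content

# Passes: the legitimate post rendered and the spam post did not.
check_spam_hidden("<ul><li>A legitimate reply</li></ul>")
```

An empty skeleton response now trips the positive assertion instead of passing silently.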

(05:59) We are also dependent on:

  • Django test client
  • Django router
  • Django request object
  • Django response object
  • Third party middleware (!!!)
  • App middleware (!!!)
  • Context managers (!!!)

(06:41) The result of these dependencies is that we end up with a binary test suite: it tells you whether or not your code is broken, but gives no clue as to what is broken. Good tests show you exactly what’s broken.

(07:10) Test fragility – “Every time we change the code, we have to update all the tests!”

(07:32) We primarily get regression protection (and only of specific kinds: the layers integrating incorrectly).

It’s very difficult to test fine-grained edge cases from the outside.

It’s not fast so it won’t help with refactoring.

No feedback on design since we’re not interacting with the smaller objects.

System tests have value but also have problems.

How To Fail

(08:29) 3 ways to fail:

  1. Selenium as primary testing tool
  2. “Unit tests” are too big
  3. Fine-grained tests around legacy code

Selenium as primary testing tool

(08:34)

  • Tests can’t be run locally
  • Tests too slow
  • Tests break often
  • No fine-grained feedback

“Unit tests” are too big

(09:20)

Testing time tends to grow super-linearly.

100 ms = 240,000,000 instructions (assuming roughly 2.4 billion instructions per second) – far more work than one object’s behavior should need.

Fine-grained tests around legacy code

(10:44) Fine-grained tests around legacy code are a way to fail: tight tests solidify the interface and bake all of the badness in. :-)

Unit tests

(11:20) Unit tests - what are they and why do we care?

Showed two videos of very fast test suites.

(12:29) We will test at the model layer instead of the view layer.

def test_that_spam_posts_are_hidden(self):
    post = Post(mark_post_as_spam=True)
    discussion = Discussion(posts=[post])
    assert discussion.visible_posts == []

This is a complete test at the model layer. This does not replace system tests; does not test views.
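The talk doesn’t show the model code behind this test; a minimal plain-Python sketch (no Django involved, names mirrored from the test above) that would satisfy it might look like:

```python
# Hypothetical sketch of models that would satisfy the unit test above.
# Names mirror the test; these are plain objects, not real Django models.
class Post:
    def __init__(self, mark_post_as_spam=False):
        self.mark_post_as_spam = mark_post_as_spam

class Discussion:
    def __init__(self, posts=None):
        self.posts = posts or []

    @property
    def visible_posts(self):
        # Hiding spam is pure object behavior: no views, no database.
        return [p for p in self.posts if not p.mark_post_as_spam]

post = Post(mark_post_as_spam=True)
discussion = Discussion(posts=[post])
assert discussion.visible_posts == []
```

Because nothing here touches the request/response cycle, only a change to `Post` or `Discussion` themselves can break the test.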

(13:18) Why is this a unit test?

(13:26) 1. Unit tests test only one object behavior.

(13:59) 2. Other classes can’t break it.

And in particular, no dependencies on:

  • Django test client
  • Django router
  • Django request object
  • Django response object
  • Third party middleware (!!!)
  • App middleware (!!!)
  • Context managers (!!!)

What are the advantages of unit tests?

(15:12) Test failures are much more isolated and tell you which object or method is broken.

(15:26) Tests are much faster. You can avoid fear. You can refactor. The difference between 400 milliseconds and 40 seconds – at 40 seconds, you can’t do the thing called TDD.

(15:40) System tests test the boundaries better than unit tests. Unit tests test the fine-grained behavior of individual objects, which is most of the behavior of your system, hopefully.

(16:10) Unit tests enable refactoring and let you avoid fear.

(16:24) Gives you design feedback. I have conditioned myself to be repulsed by an 8-line test for a model. Why do I need to set up so much of the world to test this one small piece of behavior? Makes me think about refactoring, which leads to better system design.

(16:46) Guidelines for ratio of unit tests to system tests – 90% unit tests, 10% system/acceptance tests

(17:19) I have not mentioned test doubles or mocking. You may not need these when testing the low levels like models. You may need them when testing higher level objects like views.

The End

(18:00)

@garybernhardt

destroyallsoftware.com

Screencasts for Serious Developers

  • OO design
  • Unix
  • TDD
  • Smaller, faster tests

Questions

(18:26) Q: 90/10 ratio - was that in time or lines of code or what?

A: Number of tests

(18:50) Q: Does 90/10 apply to every kind of project or does it vary?

(19:04) A: That applies mostly to object-heavy systems like web apps, that have lots of logic and not a lot of boundaries.

(19:44) Question (from Carl Meyer): Pain point in unit testing Django apps is the database. Slows down your tests. Django models objects are very tied to the database. Trying to mock out the persistence layer seems like a way to fail.

(20:36) Answer: Should you mock the model objects in a Django app? No; that boundary is too wide, and it’s one you don’t control. A better approach is to create a service layer that interacts with the model objects, then mock that service layer.
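A minimal sketch of that idea, with hypothetical names (`DiscussionService` and `render_discussion` are not from the talk): code above the models depends on a narrow service interface you control, and unit tests mock the service rather than the ORM.

```python
from unittest.mock import Mock

# Hypothetical sketch: a narrow service layer that owns all ORM access.
# Higher-level code depends on this interface, not on Django model classes,
# so its unit tests can mock the service instead of the whole ORM boundary.
class DiscussionService:
    def visible_posts(self, discussion_id):
        # The real implementation would query the database here.
        raise NotImplementedError

def render_discussion(service, discussion_id):
    # View-ish code that knows nothing about persistence.
    posts = service.visible_posts(discussion_id)
    return "\n".join(post.body for post in posts)

# In a unit test, mock only the boundary we control:
service = Mock(spec=DiscussionService)
service.visible_posts.return_value = []
assert render_discussion(service, discussion_id=1) == ""
```

Using `spec=DiscussionService` means the mock rejects attributes the real service doesn’t have, keeping the double honest about the interface.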

(22:00) Question: How do you enforce that mock objects have the same behavior as the real object?

(22:19) Answer: System tests. Or in Ruby, rspec-fire from Xavier Shay

(23:36) Mock by Michael Foord does interface checks.
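Mock has since been absorbed into the standard library as `unittest.mock`; its `create_autospec` helper builds doubles that enforce the real object’s call signatures. A small illustration (the `Mailer` class is a hypothetical example, not from the talk):

```python
from unittest.mock import create_autospec

# Hypothetical class used to demonstrate autospec'd interface checks.
class Mailer:
    def send(self, to, subject):
        ...

mock_mailer = create_autospec(Mailer, instance=True)

# Matches the real signature: fine.
mock_mailer.send("alice@example.com", "hi")

# Missing a required argument: the autospec'd mock raises TypeError,
# instead of silently accepting a call the real object would reject.
signature_checked = False
try:
    mock_mailer.send("alice@example.com")
except TypeError:
    signature_checked = True
assert signature_checked
```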

(24:32) Question: Why is it such a bad idea to unit test legacy code?

(24:37) Answer: It is good to unit test legacy code. It’s not good to write fine-grained tests for legacy code, because it solidifies the edges. I may be the worst offender, because my mocking library Dingus can magically mock everything on the outside layer of your class, so if you’re doing this, stop it. :-) You want to read “Working Effectively With Legacy Code” by Michael Feathers.

(26:05) Question: Integration tests == system tests?

(26:09) Answer: Oops, sorry. The main distinction is unit tests (which test one thing) vs. any kind of integrated test that tests multiple things.

(27:39) Question: When is Selenium appropriate?

(27:55) Answer: Selenium is not evil. If you use Selenium to test everything and especially fine-grained behavior, that’s where you’ll run into problems.

(28:20) He mostly works in Ruby these days and uses Cucumber with Capybara driving a headless WebKit browser.

(28:38) Don’t pay non-programmers to build large Selenium test suites.

(29:15) Question: How do I convert a system test suite to unit tests and make sure that I’m covering everything the system tests covered?

(29:25) Answer: Ask Michael Feathers? :-) Sometimes it’s obvious...