Blind Refactoring, Guided by Tests

Posted on: February 7, 2013

“How can you write code if you don’t understand what you’re doing?” my wife asked me. It was a reasonable question, and I had a hard time answering it. But I had just discovered, to my surprise, that I could.

I’d been working on a very important piece of code for my company. It dealt with some complex financial issues, and it was difficult for me to grasp.

Because I wanted to be very sure my code would be correct, I started by working through every possible circumstance under which the code could be called. With help from a coworker, I drew a branching tree of possibilities until I was satisfied that we had covered them all.

Next, I wrote tests to cover all of those scenarios. Together, we were able to answer, in each scenario, what the code should do. With these tests, though I couldn’t see the big picture, I had a target to hit: make these tests pass, one at a time.
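To give a rough idea of what "a test per scenario" can look like, here is a minimal sketch. Everything in it is invented for illustration: the balance-and-amount domain, the scenario names, and the methods apply_amount and failing_scenarios are assumptions, not the real code. Each row stands for one leaf of the whiteboard tree, with the answer we agreed on.

```ruby
# Hypothetical example: the real domain and names are not shown in this post.
# Each scenario is one leaf of the decision tree: inputs plus the agreed result.
SCENARIOS = [
  { name: "credit on a positive balance", balance: 100, amount: 25,   expected: 125 },
  { name: "debit within the balance",     balance: 100, amount: -25,  expected: 75 },
  { name: "debit beyond the balance",     balance: 100, amount: -150, expected: 0 },
  { name: "zero amount",                  balance: 100, amount: 0,    expected: 100 },
  { name: "no amount at all",             balance: 100, amount: nil,  expected: 100 }
].freeze

# Deliberately naive first pass: the conditionals mirror the scenarios
# one for one, with no attempt to understand the bigger picture.
def apply_amount(balance, amount)
  if amount
    if amount > 0
      balance + amount
    elsif amount < 0
      if amount.abs > balance
        0
      else
        balance + amount
      end
    else
      balance
    end
  else
    balance
  end
end

# Runs every scenario through the given block and returns the names of
# the ones that fail. An empty array means the whole suite is green.
def failing_scenarios(scenarios)
  scenarios.reject { |s| yield(s[:balance], s[:amount]) == s[:expected] }
           .map { |s| s[:name] }
end

failures = failing_scenarios(SCENARIOS) { |balance, amount| apply_amount(balance, amount) }
puts failures.empty? ? "all scenarios pass" : "failing: #{failures.join(', ')}"
```

In a real project these rows would more likely be individual test cases in whatever test framework the codebase already uses; the table form just makes the one-test-per-branch idea easy to see.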

So I started coding, blindly. I’d write a bit of code, get a test passing, and move to the next test. Since I couldn’t understand the overall problem, I just wrote code that paralleled the structure of my tests: a giant pile of conditionals.

Eventually, the tests all passed. But egads, the code! It made my eyes bleed! Ignoring the method and variable names, the structure was something like this:

if thing 

  if thing > thing

    if thing > 0
      self.thing += thing

    elsif thing < 0
      self.thing -= thing
    end

  elsif thing == thing

    if thing > 0
      self.thing += thing

    elsif thing < 0
      self.thing += thing
    end

  elsif thing < thing

    if thing == 0

      if thing > 0
        self.thing += thing

      elsif thing < 0
        if thing.abs > thing
          self.thing += thing
          self.thing = thing
          self.thing = 0
        else
          self.thing += thing
        end
      end

    else
      if thing > 0
        self.thing += thing
      else
        self.thing += thing
      end

    end

  end
else
  self.thing += thing
end

This was not acceptable. I wanted my code not just to solve the problem, but to explain it to the reader – and that included me! There were a few comments, but they didn’t clarify much, because I didn’t understand the code myself.

All I knew was that, after carefully reasoning through every use case and writing a test for each one, I had written code that was correct. My test suite passed.

Now I wanted to improve the code, so I started refactoring. I was still walking blindly, but I had the tests to guide me. I made one tiny change at a time, backing up if the tests ever failed. Ever so slowly, the code shrank: fewer conditionals, less duplication.
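The loop described above can be sketched like this. The names before_refactor and after_refactor and the balance-and-amount logic are hypothetical stand-ins, not the real method. In this sketch the untouched old version plays the role of the test suite: if the two ever disagree on an input, the change gets reverted. In practice the passing tests do that job.

```ruby
# Hypothetical stand-in for the tangled method, one branch per scenario.
def before_refactor(balance, amount)
  if amount
    if amount > 0
      balance + amount
    elsif amount < 0
      if amount.abs > balance
        0
      else
        balance + amount
      end
    else
      balance
    end
  else
    balance
  end
end

# The result of a few tiny steps, each checked before moving on:
# 1. "amount == 0" returns balance, which equals balance + 0, so it can
#    share the balance + amount branch.
# 2. The "amount > 0" branch and the in-range debit branch are identical,
#    so they merge.
# 3. The remaining special cases become guard clauses.
def after_refactor(balance, amount)
  return balance unless amount
  return 0 if amount < 0 && amount.abs > balance

  balance + amount
end

# Every combination of a few balances and amounts, including the edges.
inputs = [-50, 0, 100].product([nil, -150, -25, 0, 25])

# Any pair on which the two versions disagree means the last step was wrong.
mismatches = inputs.reject do |balance, amount|
  before_refactor(balance, amount) == after_refactor(balance, amount)
end

puts mismatches.empty? ? "still green, keep going" : "revert: #{mismatches.inspect}"
```

The value is in the rhythm rather than in this particular result: one small change, run the check, and undo the change if anything differs.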

More than once, I thought I saw where I was going and tried to skip to the answer, but each time, I broke the tests, so I went back to groping.

Finally, I came to a point where the code was so short that I actually could reason through it. And something looked fishy: there was a conditional that didn’t make sense. The tests failed if I removed it, but now that I could see the business logic behind it, it seemed wrong.

So I went back to the whiteboard and graphed that part of the logic again. I stared, I reasoned, and I talked with my colleague again. Sure enough, one of my tests was wrong! I fixed it, and I finished my refactoring.

When I was done, I had something like this:

if thing.to_i == 0
  self.thing += thing

  do_something if thing < 0

else
  self.thing += thing
end

Finally, I could see what was really happening! Looking at this code, I could articulate the principles that drove all the test cases we had written. I could understand what was happening well enough to give the methods and variables very clear names, and I could explain why the code did what it did in the comments.

If I were smarter, I might have gotten there by reasoning. But I didn’t. I got there by blind refactoring.

How can you write code if you don’t understand what you’re doing? With a thorough test suite, you can. Very, very slowly.