Refactoring: Replace Side Effects with Return Values

Currently at Gusto, we’re undertaking a sizable refactoring to pay off some tech debt around a core component of our payroll system. Running a payroll is one of the more complicated things software can do: it’s a combination of time, money, and geographic logic.

We use a few different subsystems to handle this, and one in particular is a service that you load up with state through method calls, then query it for results. We use vendors for certain components of the system, and we don’t have much say in how they design their software. In the meantime, we are trying to make our own system more pure and functional.

In any system with synchronous and asynchronous operations, we can change their appearance with a layer of abstraction. We can make something synchronous feel asynchronous with a little window dressing. Vice-versa we can always make an asynchronous operation feel like a synchronous, blocking call.

We can also achieve something similar with imperative and functional blocks of code. We can make imperative code feel functional with a layer of abstraction.¹ Should we want to, we can also go in the opposite direction and make functional code feel imperative.

Pure as the Test-Driven Snow

Let’s dive in to how the code was when we started, and where we ended up. Here’s how the code originally looked:

class PayrollRunner
  def run!
    federal_info = build_federal_info()
    state_info = build_state_info()

    tax_engine.set_federal_info(federal_info)
    tax_engine.set_state_info(state_info)

    computed_taxes = tax_engine.calculate_taxes
    payroll.assign_taxes(computed_taxes)
  end
end

Not too bad, but this is also a highly simplified version of the class. The code itself was very imperative and order-dependent. It was difficult to make a change and be sure that you were not clobbering some state that a subsequent method call relied on.

So we switched this to collect the methods to call against tax_engine and apply a variant of the Command pattern:

class PayrollRunner
  def run!
    federal_info = build_federal_info()
    state_info = build_state_info()

    commands = [
      SetFederalInfoCommand.new(federal_info),
      SetStateInfoCommand.new(state_info),
    ]

    TaxEngineWrapper.new(tax_engine).apply(commands)

    computed_taxes = tax_engine.calculate_taxes
    payroll.assign_taxes(computed_taxes)
  end
end

# This wrapper is designed to not change the API of something
# we don't control (the original tax engine), but allows
# us to apply our custom commands as normal method calls.
class TaxEngineWrapper
  def initialize(tax_engine)
    @tax_engine = tax_engine
  end

  def apply(commands)
    commands.each do |command|
      @tax_engine.public_send(
        command.method_name, *command.arguments
      )
    end
  end
end

We increased the number of lines of code doing this, but we also set the stage for something important: we separated defining the data requirements for these imperative methods calls against tax_engine and actually calling those methods.² Separating the data construction from the imperative bits in the code can greatly increase its testability and readability.

But we hit a wrinkle when doing this. The imperative calls against tax_engine spidered out into hundreds of classes. Our hopes and dreams of converting this method into a pure function within our original estimates were dashed: We had discovered that there was a lot more work than we thought with this refactoring.

Here’s something closer to how the code actually looked:

class PayrollRunner
  def run!
    federal_info = build_federal_info()
    state_info = build_state_info()

    tax_engine.set_federal_info(federal_info)

    # With 50 states plus Washington, D.C., there are a lot of
    # classes on the other end of this factory.
    #
    # All of them expect to be interacting with the `tax_engine`
    # in an imperative fashion.
    TaxEngineSetter::Factory.for(state).apply_state_info(
      tax_engine,
      state_info
    )

    computed_taxes = tax_engine.calculate_taxes
    payroll.assign_taxes(computed_taxes)
  end
end

So converting everything under the factory to be a pure function that returned commands would have likely taken months.³

So we reached for a concept commonly seen in our collaboration tests: a spy object. We wanted to know every method called on tax_engine and be able to replay those in order. This strategy allowed us to take some imperative code with side effects sprinkled throughout many files, and shoe horn it into our functional world in a safe way.

Here’s how the code looked when we finished up:

class PayrollRunner
  def run!
    TaxEngineWrapper.new(tax_engine).apply(commands)
    computed_taxes = tax_engine.calculate_taxes
    payroll.assign_taxes(computed_taxes)
  end

  # This is now a pure function from the view of its callers.
  # Sweet!
  def commands
    tax_engine_spy = TaxEngineSpy.new

    # Commands are recorded against our tax_engine_spy,
    # which pretends to be a perfectly normal `tax_engine`.
    TaxEngineSetter::Factory.for(state).apply_state_info(
      tax_engine_spy,
      build_state_info()
    )

    [SetFederalInfoCommand.new(build_federal_info())] +
      tax_engine_spy.generated_commands
  end
end

class TaxEngineSpy
  def initialize
    @commands = []
  end

  # We did not end up using `method_missing` in on our code,
  # but rather listening for each specific method.
  # `method_missing` is difficult to test and didn’t provide
  # the level of safety we were looking for.
  def method_missing(name, *args)
    @commands << Command.new(name, *args)
  end

  def generated_commands
    @commands
  end
end

The resulting code turns imperative, side-effect heavy code into a more functional interface. This greatly helped in our own experience, because generating these commands requires many, many classes. Rather than pass around a tax_engine and hope the correct side-effects are inflicted, we instead ask each class or method to return the commands it would like to apply.

Testing these classes and methods in much easier. Furthermore, any order-dependency in the commands is now able to be inspected at a single point: right before we apply the commands. A normal payroll will accumulate dozens of commands, and just missing a single command results in an unfriendly letter from the IRS for one of our customers.

Because tax code constantly changes, so too does our computer code. When making a change, we need to be sure that our generated commands change in the exact places we expect and nowhere else. The inspection of these instructions is crucial to operating correct payroll software.

This strategy also gives us an effective bridge to building more functional code. We know that our TaxEngineSpy should be a “long-term temporary solution.” Over time, we should make the code it plays with more functional. Eventually, we can delete the TaxEngineSpy entirely.

This was a simple little strategy to build a bridge toward more functional, pure, and delcarative code. After searching around a bit, I couldn’t find a blog post that covered this topic directly so I decided to put this together. This strategy is most easily applied in dynamic languages with powerful metaprogramming features, like Ruby.

Special thanks to Iheanyi Ekechukwu, Eddie Kim, and Justin Duke for reading early drafts of this post.

Even in functional programming languages, our instructions must eventually be translated to assembly instructions: imperative by definition. ↩
Also interesting is that the “gross” parts (the metaprogramming and the imperative calls) are now grouped together. ↩
We still may undertake this refactoring, but at a later date. For now, we have a safe boundary built around the code and we know all of the side effects. ↩