Wednesday, April 9, 2014

Are enumerations evil?

Or, How to refactor an enumeration to multiple classes

I ran across this article about how to refactor a switch statement to a data structure in JavaScript a while ago. This is not a terrible refactoring technique. I've used something similar with dictionaries in C# where the key was an enumeration and the value was an Action (or Func). It can be a minor refactoring to a single method that can help clean-up the code. Then this week I had a conversation with a fellow developer about the appropriateness of an enumeration in existing code from a couple different libraries he was reading.

This led me to wonder afresh if perhaps enumerated types aren't something that we as the development community should treat as an anti-pattern. In the days before class types, they were a means for the compiler to force integers to known values and provide more readable code for the developer. The compiler could check values better at compile time and catch usage errors earlier. I have worked on languages without them and embraced them when they became available. At that point they were a good thing.

Many times enumerations are used to change behavior of some code based on a variable's value. This is usually done through switch or if-then-else statements. Or, as discussed above, using some sort of dictionary like structure to store behavior associated with a specific enumerated value.

The problem is this pattern tends to get replicated within the class containing the enumerated variable. The class using the enumeration needs to do different things based on the various values the variable can have and these different things spread throughout the class (or classes) using the enumeration. Then, when a value is added to the enumeration, each place the variable is tested needs to be updated to handle the new value.

This can lead to a number of problems. With behavior based on the enum scattered throughout classes, the intent can be both obfuscated and duplicated. When adding a value to the enumeration, it's easy to miss a place that needs new behavior, introducing hard to detect bugs. The code is fragile in the face of changes. And because the state and associated behavior associated with the enumeration is mixed in with the class (or classes) using it, violation of the Single Responsibility Principle frequently is seen.

Refactoring to a data structure as mentioned above is a good first step. It helps address a number of the problems. But in today's object oriented world, there is a better way. In many (perhaps most) cases it's better to refactor to multiple classes, one for each value in the enumeration. Fortunately, this is fairly easily done.

Here's the original starting code:
enum SomeItemEnum = { one, two, three };
class A
{
  SomeItemEnum someItemValue;
  void SomeItemUser()
  {
    switch(someItemValue)
    {
      case one:
        // Some complicated code for case one
        break;
      case two:
        // Some complicated code for case two
        break;
      case three:
        // Some complicated code for case three
        break;
    }
  }
}
First, a base class is created as an ancestor for the enumeration.
enum SomeItemEnum = { one, two, three };

class SomeItemEnumBase
{
}

class A
{
  SomeItemEnum someItemValue;
  void SomeItemUser()
  {
    switch(someItemValue)
    {
      case one:
        // Some complicated code for case one
        break;
      case two:
        // Some complicated code for case two
        break;
      case three:
        // Some complicated code for case three
        break;
    }
  }
}
Then, each place where something needs to be done based on the value of the enumeration, a method is added to the class. If there is no default behavior for a method, it should be abstract, otherwise it should be virtual.
enum SomeItemEnum = { one, two, three };

abstract class SomeItemEnumBase
{
  abstract void ComplicatedCode();
}

class A
{
  SomeItemEnum someItemValue;
  void SomeItemUser()
  {
    switch(someItemValue)
    {
      case one:
        // Some complicated code for case one
        break;
      case two:
        // Some complicated code for case two
        break;
      case three:
        // Some complicated code for case three
        break;
    }
  }
}
A child class should be made for each value of the enumeration with the value specific behavior moved to the appropriate overridden method.
enum SomeItemEnum = { one, two, three };

abstract class SomeItemEnumBase
{
  abstract void ComplicatedCode();
}

class SomeItemEnumOne : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case one
  }
}

class SomeItemEnumTwo : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case two
  }
}

class SomeItemEnumThree : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case three
  }
}

class A
{
  SomeItemEnum someItemValue;
  void SomeItemUser()
  {
    switch(someItemValue)
    {
      case one:
        // Some complicated code for case one
        break;
      case two:
        // Some complicated code for case two
        break;
      case three:
        // Some complicated code for case three
        break;
    }
  }
}
Change the type of the class' variable from the enumeration to the base type.
enum SomeItemEnum = { one, two, three };

abstract class SomeItemEnumBase
{
  abstract void ComplicatedCode();
}

class SomeItemEnumOne : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case one
  }
}

class SomeItemEnumTwo : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case two
  }
}

class SomeItemEnumThree : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case three
  }
}

class A
{
  SomeItemEnumBase someItemValue;
  void SomeItemUser()
  {
    switch(someItemValue)
    {
      case one:
        // Some complicated code for case one
        break;
      case two:
        // Some complicated code for case two
        break;
      case three:
        // Some complicated code for case three
        break;
    }
  }
}
Instead of setting discrete values on the enumerated variable, now a class instance of the appropriate type can be set.

Old method:
someItemValue = SomeItemEnum.one;
someItemValue = SomeItemEnum.two;
New method:
someItemValue = new SomeItemEnumOne();
someItemValue = new SomeItemEnumTwo();
All the switch/if-else statements can now be changed to simple method calls that have polymorphic behavior based on the class type.
enum SomeItemEnum = { one, two, three };

abstract class SomeItemBase
{
  abstract void ComplicatedCode();
}

class SomeItemOne : SomeItemBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case one
  }
}

class SomeItemTwo : SomeItemBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case two
  }
}

class SomeItemThree : SomeItemBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case three
  }
}

class A
{
  SomeItemEnumBase someItemValue;
  void SomeItemUser()
  {
    someItemValue.ComplicatedCode();
  }
}
And finally, the unused enumeration declaration can be removed.
enum SomeItemEnum = { one, two, three };

abstract class SomeItemEnumBase
{
  abstract void ComplicatedCode();
}

class SomeItemEnumOne : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case one
  }
}

class SomeItemEnumTwo : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case two
  }
}

class SomeItemEnumThree : SomeItemEnumBase
{
  override void ComplicatedCode()
  {
    // Some complicated code for case three
  }
}

class A
{
  SomeItemEnumBase someItemValue;
  void SomeItemUser()
  {
    someItemValue.ComplicatedCode();
  }
}
Now, when a new value is added, only places that actually care about the new value specifically (e.g. where its value is set) need to be touched. As a bonus, the compiler will complain about any unimplemented abstract functions, ensuring all required behavior is implemented.

Finally, admittedly, in this toy example, the final result is more complicated and obtuse than the original. If production code is as simple as this example, then it doesn't make sense to make this change. However, when class A is larger and more complex, the increased clarity and robustness of the code can be significant. In my experience, the latter is more typical than the former and so in general, my conclusion is enumerations border on evil.