Rebuilding vs Refactoring, and the dangers that come with each
This post covers a situation that most technology teams will face sooner rather than later: deciding what to do with a codebase that has reached the stage where every change destabilises the system so badly, and takes so long, that the team's ability to keep up with customers' requirements has dropped to near zero. I'll focus on the pros and cons of the two solutions most commonly considered: a significant refactor, and a total rebuild.
First, let’s take the rebuild option. It’s usually the one most popular with developers. On the plus side it’s a fresh start and therefore very motivating; it’s a chance to correct mistakes fundamental to the architecture, to select new technologies, and to add new features or remove old ones. All but the first, however, carry significant risk. Large architectural changes are inherently risky because their effects on scalability, maintainability and so on are unknown until the system is under real load. New technologies are risky because the team may lack experience with them, or because something at the untested bleeding edge gets picked. Adding more features adds time, and again adds risk through a more complex end result; removing features risks upsetting clients who thought they didn’t mind losing a feature until it’s actually gone (or who simply weren’t consulted).
A rebuild will inevitably take longer than planned, partly because the team’s familiarity with the existing product biases them into underestimating how much hidden behaviour must be reproduced. During that period they also need to maintain the existing codebase, and make tough decisions about which new features, if any, to add to the new one. Finally, running one team to maintain the old system and another to rebuild is very expensive, with no payoff until the work is completely done. Having said all of this, rebuilding should be seriously considered in two cases: when it’s part of a deliberate strategy for a fast-growing user base that needs a new approach to scalability every year or two (Facebook did this very successfully), or when the product is in the very early stages of customer adoption and the codebase is still small and running at low volumes.
Looking at the refactor option, on the plus side, with a well-managed process it’s perfectly possible to keep servicing client requests while refactoring the same codebase – for example by alternating sprints, one for refactoring and one for client requests. This is a huge advantage, as it ensures the refactored product stays up to date. Refactoring is also inherently less risky because you’re dealing with the known, however bad that may be. Furthermore, it still allows you to add and remove features, and even to make architectural changes, albeit with limits on how sweeping they can be.
All of this may sound positive, but the downside of refactoring is that if the system is badly broken enough, refactoring may simply not be enough to fix it – and worst of all, you may not discover that until you’ve shipped a few new versions and seen little improvement. Refactoring is also easy for inexperienced programmers to get wrong; a common temptation is to rewrite entire functions in one go rather than working in small, testable pieces. There are many good books on the subject, e.g. “Refactoring: Improving the Design of Existing Code” by Martin Fowler & Kent Beck, or “Clean Code: A Handbook of Agile Software Craftsmanship” by Robert C. Martin. These books help with the challenge of breaking the work into chunks small enough to achieve testable results within one sprint.
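To make the “small, testable pieces” point concrete, here’s a minimal sketch (the function names and the invoice example are purely illustrative, not from any real codebase): instead of rewriting a tangled function wholesale, you extract one responsibility at a time, keeping a characterisation test green after every step.

```python
# Before: one tangled function doing parsing, validation and totalling.
def invoice_total_legacy(lines):
    total = 0.0
    for line in lines:
        qty, price = line.split(",")
        qty, price = int(qty), float(price)
        if qty < 0 or price < 0:
            raise ValueError("negative quantity or price")
        total += qty * price
    return total

# Step 1: extract just the parsing into its own small, testable function.
def parse_line(line):
    qty_str, price_str = line.split(",")
    return int(qty_str), float(price_str)

# Step 2: extract the validation; behaviour is unchanged, tests still pass.
def validate(qty, price):
    if qty < 0 or price < 0:
        raise ValueError("negative quantity or price")

# After: the original function becomes a thin, readable composition.
def invoice_total(lines):
    total = 0.0
    for line in lines:
        qty, price = parse_line(line)
        validate(qty, price)
        total += qty * price
    return total

# A characterisation test pins down existing behaviour before and after
# each extraction, so a regression is caught within the same sprint.
assert invoice_total(["2,3.50", "1,10.00"]) == invoice_total_legacy(["2,3.50", "1,10.00"])
```

Each extraction here is small enough to review, test and ship inside a single sprint, which is exactly the discipline the books above teach.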
Quote: Harry Doyle