Introduction – The extended definition of Refactoring also contains this part: “Its heart is a series of small behavior preserving transformations. Each transformation (called a “refactoring”) does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it’s less likely to go wrong. The system is kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring.” (Martin Fowler, at refactoring.com)
Note: this imaginary dialog is inspired by intensive and extensive practice.
What if we have to perform a Big Refactoring?
Refactoring cannot be big; it is a special kind of redesign, performed in small steps. See the above definition again.
Let me rephrase: I need to do a lot of refactorings to clean up legacy code. Is there any best practice?
You need a strategy?
Well, agile and software engineering offer a lot of tactics, including Martin Fowler’s sets of refactorings, but almost no strategy, except … you should work clean from the first time. Follow Martin Fowler’s guidance from the start (or early enough), along with Uncle Bob’s Clean Code and Clean Architecture.
Hey! I already have a lot of legacy code debt to solve!
Ok! Let’s build a strategy: how do we start?
This was my question!
Refactoring is supposed to improve the design while preserving the functionality. Tests included. Do you have good requirements specifications or a good set of automated tests?
Not in this case.
Then you should recover knowledge of the functionality from the code and incrementally add and run some tests. Better: functionality should be explicitly represented in the code and should be testable. And remember: there are two kinds of functionality…
Yes: first, the one that is application-independent and represents the target domain (domain business rules), and second, the one that is application-dependent, aka flow of control.
I remember: that sounds like Uncle Bob’s Clean Architecture.
Yes. You will need to be able to apply distinct tests to them, without mixing them with other concerns such as UI, persistence, network and others. Anyway, where do I usually start? I try to make the running scenarios very clear in the code, and that means the flow of control.
In English, please?
I want to clearly see two things: where the events triggered by the system actors start, and the full end-to-end path until return. Moreover, I want to refactor to make this path clear enough.
How could it not be clear?
Global context. If the functionality path accesses the global context chaotically, we can get undesired intersections with other paths/scenarios that will compromise both data and function. At the same time, we can decouple flow/orchestration from specialized concerns.
What do we get?
We will have an explicit representation of the functionality (with no undesired contact with other flows), which is needed for tests (we can apply auto-tests to it). We will also have the first entry points to the specialized parts, which can in turn be “decorated” with some tests. Then we can apply tactical refactorings as needed.
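As a minimal sketch of that idea, assume a hypothetical "place order" scenario. All names here (PlaceOrder, OrderStore, Notifier and the in-memory stand-ins) are invented for illustration and come from no real codebase. The flow receives everything it needs explicitly instead of reaching into global context, and the specialized concerns sit behind small interfaces that a test can replace.

```python
from dataclasses import dataclass, field
from typing import Protocol


class OrderStore(Protocol):
    """Persistence concern behind an interface (hypothetical)."""
    def save(self, customer_id: str, total: float) -> int: ...


class Notifier(Protocol):
    """Notification concern behind an interface (hypothetical)."""
    def order_placed(self, customer_id: str, order_id: int) -> None: ...


@dataclass
class PlaceOrder:
    """One scenario, from the actor's event to the returned result.

    The flow touches no module-level state: its collaborators are passed in,
    so it cannot silently intersect with other scenarios.
    """
    store: OrderStore
    notifier: Notifier

    def run(self, customer_id: str, prices: list[float]) -> int:
        if not prices:
            raise ValueError("an order needs at least one item")
        total = sum(prices)
        order_id = self.store.save(customer_id, total)
        self.notifier.order_placed(customer_id, order_id)
        return order_id


@dataclass
class InMemoryStore:
    """Test double standing in for the real persistence code."""
    rows: list[tuple[str, float]] = field(default_factory=list)

    def save(self, customer_id: str, total: float) -> int:
        self.rows.append((customer_id, total))
        return len(self.rows)  # order ids start at 1


@dataclass
class RecordingNotifier:
    """Test double that only records what it was asked to send."""
    sent: list[tuple[str, int]] = field(default_factory=list)

    def order_placed(self, customer_id: str, order_id: int) -> None:
        self.sent.append((customer_id, order_id))


store, notifier = InMemoryStore(), RecordingNotifier()
order_id = PlaceOrder(store, notifier).run("c-42", [10.0, 5.5])
```

With the doubles in place, an auto-test can run the whole scenario end to end and inspect store.rows and notifier.sent, without any UI, database or network.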
And …the domain business rules?
They must be decoupled from other concerns, and you have to dedicate specialized design and test elements to them.
Almost. You need to test any redesign. Tests need knowledge about functionality. If some parts are missing, now is the time to recover them in auto-tests (preferably) or in another form of specification.
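A hedged illustration: suppose the legacy code buries a discount rule inside UI or persistence code. The rule and its thresholds below are invented for the example. The point is only that a domain rule extracted into a pure function gets its own dedicated test, with no other concern involved.

```python
from decimal import Decimal


def loyalty_discount(order_total: Decimal, years_as_customer: int) -> Decimal:
    """Hypothetical domain rule: 5% off after 2 years, 10% off after 5 years.

    Pure function: no I/O and no global state, so it can be tested in isolation.
    """
    if order_total < 0 or years_as_customer < 0:
        raise ValueError("total and years must be non-negative")
    if years_as_customer >= 5:
        rate = Decimal("0.10")
    elif years_as_customer >= 2:
        rate = Decimal("0.05")
    else:
        rate = Decimal("0")
    # Round the discount to whole cents.
    return (order_total * rate).quantize(Decimal("0.01"))


def test_loyalty_discount() -> None:
    """Dedicated test element for the rule above."""
    assert loyalty_discount(Decimal("200.00"), 0) == Decimal("0.00")
    assert loyalty_discount(Decimal("200.00"), 2) == Decimal("10.00")
    assert loyalty_discount(Decimal("200.00"), 5) == Decimal("20.00")


test_loyalty_discount()
```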
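One common way to recover missing functional knowledge as auto-tests is the characterization test: record what the code does today and pin it down before touching the design. The sketch below assumes a hypothetical function, legacy_shipping_cost, standing in for real legacy code. The recorded values are simply what it currently returns, not values known to be correct.

```python
def legacy_shipping_cost(weight_kg, express):
    # Stand-in for untouched legacy code (hypothetical).
    cost = 4 + weight_kg * 2
    if express:
        cost = cost * 2
    if weight_kg > 10:
        cost = cost - 3
    return cost


# Observed behavior, recorded by running the legacy code as it is today.
# Each entry is a question for the domain experts, not an established fact.
RECORDED = {
    (1, False): 6,
    (1, True): 12,
    (11, False): 23,
    (11, True): 49,
}


def check_characterization(func):
    """Return the inputs whose current output differs from the recording."""
    return [args for args, expected in RECORDED.items() if func(*args) != expected]


mismatches = check_characterization(legacy_shipping_cost)
```

An empty mismatches list means the code still behaves like the recording. Running the same check against each refactored version flags any input whose behavior changed.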
How do I know that recovered requirements are correct?
You don’t. Moreover, you should always suspect that spaghetti-like legacy code includes many unobserved bugs. You should validate these functional requirements through intensive collaboration with your colleagues, domain experts, customers and other stakeholders.
Do you have any idea about how to do that?
Start with Pair Programming (refactor in pairs). Pairing is not enough, and you will probably need more people involved: use Model Storming to discuss the resulting functionality with more colleagues.
Model Storming is an agile practice, part of Agile Modeling (and Disciplined Agile), created to complement core practices from XP. Also, you should actively involve your stakeholders in validating the recovered functionality: Active Stakeholder Participation is another practice recommended by Agile Modeling. And in the end you will get some free bonuses.
Functionality becomes easy to read accurately from the code (in seconds!), and your colleagues and stakeholders will already have acquired the recovered functional knowledge.
Summary – Refactoring significant spaghetti legacy code needs tests/testing. Usually, the knowledge about functionality needed for testing is insufficient, so it must be recovered from the code. An effective and proven way to do that is to apply Clean Architecture principles: decouple both the domain rules and the application-specific flow of control (aka use cases). Anyway, legacy code with too much technical debt will contain a lot of bugs, so the recovered functionality is inaccurate and needs to be validated. The knowledge and expertise needed for validation are distributed among team members, domain experts, customers and other stakeholders, so you need to work collaboratively with all of them. Some outstanding software engineering and agile practices can help here, including the ones discussed above: Pair Programming, Model Storming and Active Stakeholder Participation.
Note: “need” and “necessary” are used often in the above text, simply because we have followed the logical path of what is necessary for testing redesigned legacy code.
Remember: a lot of technical debt ~ inaccurate functionality. To refactor and test, you must restart the process and the collaborative work from functional requirements acquisition.