Make Your Code Agile: Refactoring
July 2, 2009 10:30 AM
First, my definition of refactoring:
Refactoring is improving code without changing the features it implements.
That's all.
If you're refactoring, you're not fixing bugs, you're not improving performance and you're not increasing robustness. Refactoring is simply improving the design of the code, while ensuring that it still works the same, warts and all.
Pointy-haired bosses the world over froth at the mouth to hear such things. Veins pop out in their temples. You mean the business value of the software stays the same but the cost to the business goes up? I didn't say that.
If you measure business value only in terms of features you have today, then you can end up deep in technical debt; you can add features today in such a way that features tomorrow cost more and more. The value of refactoring is wholly contained in the future ability of programmers to comprehend and modify the code. It's called maintainability, but that's a boring word, so let's call it Agility. Mmmm, sexy. Well-factored code is agile code because it's better able to change.
In economic terms, refactoring is an investment, or the repayment of a debt. It's only worth doing over a time frame when the interest payments or repayments (in the form of ongoing productivity gains) compound to exceed the time invested. Fingers crossed the business or project sponsor is also planning over such time frames.
The term refactoring comes from mathematics. You may remember your high school algebra:
2x^2 + 10x
Stay with me! No glazing over! If you refactor the expression, extracting the common factor, 2x, you get:
2x · (x + 5)
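If the jump feels too quick, here is the same step spelled out, with each term first written as a multiple of the common factor:

```latex
\begin{align*}
2x^2 + 10x &= 2x \cdot x + 2x \cdot 5 \\
           &= 2x\,(x + 5)
\end{align*}
```

The value of the expression is the same for every x; only its shape has changed. That is the property we want from refactored code too.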
Sometimes it's hard to spot the common factors, in both mathematics and programming, and it can certainly be done poorly. More on that later.
Like most powerful techniques in software development, the purpose of refactoring is controlling complexity.
Complexity is bad, mmmkay. Complexity is evil.
I was fortunate enough to be chatting about project complexity with the legendary Dave Thomas (OTI, Eclipse) at JAOO Sydney in May. He nailed it: "kLOC kills". Complexity and scale in codebases are major contributors to schedule blowouts, poor velocity and excessive development cost. Complexity is a kitten killer from way back.
Fred Brooks discusses two categories of complexity: essential complexity and accidental complexity.
Essential complexity is the complexity of the domain. In NASA software, there's no escaping rocket science. You can isolate and divide essential complexity but you can never remove it. Essential complexity belongs to the problem. By contrast, accidental complexity is an artefact of the systems, languages, frameworks you're using. In principle it can be reduced by changing the system. Accidental complexity belongs to the solution. Refactoring reduces accidental complexity.
If you don't have much experience with it and you're looking for some concrete tutorials on refactoring, I suggest you start with Martin Fowler's seminal book Refactoring. Fowler also maintains a catalog of refactoring recipes with an Object Oriented flavour.
One of the most basic techniques is Extract Method, which all decent IDEs can do automatically. You know you need Extract Method if you have a multi-page method with a sequence of comment blocks which look like this: "Now that we have the InductionActuator, look up the FluxCapacitor...". Doing it manually means snipping out a logical sequence of code and pasting it into a new, small, well-named method, stitching the local variables used from the originating context into parameters to the method. If this is hard due to sloppy scoping or too many variables, you may consider Introducing a Field from a local variable.
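Here is a minimal sketch of a manual Extract Method in Java. Everything in it (OrderReport, the method names, the prices-in-cents convention) is made up for illustration; the point is only that each "now that we have X" comment marks a seam where a small method can be cut out, with the locals it needs passed in as parameters.

```java
import java.util.List;

// Hypothetical example: none of these names come from a real codebase.
class OrderReport {

    // Before: one method, with comment blocks marking each logical step.
    static String summaryBefore(List<Integer> pricesInCents, int taxPercent) {
        // Add up the prices.
        int subtotal = 0;
        for (int price : pricesInCents) {
            subtotal += price;
        }
        // Now that we have the subtotal, work out the tax.
        int tax = subtotal * taxPercent / 100;
        // Now that we have the tax, build the summary line.
        return "subtotal=" + subtotal + " tax=" + tax + " total=" + (subtotal + tax);
    }

    // After: each commented step is a small, well-named method, and the
    // locals each step used have become its parameters.
    static String summaryAfter(List<Integer> pricesInCents, int taxPercent) {
        int subtotal = sumOf(pricesInCents);
        int tax = taxOn(subtotal, taxPercent);
        return summaryLine(subtotal, tax);
    }

    static int sumOf(List<Integer> pricesInCents) {
        int subtotal = 0;
        for (int price : pricesInCents) {
            subtotal += price;
        }
        return subtotal;
    }

    static int taxOn(int subtotal, int taxPercent) {
        return subtotal * taxPercent / 100;
    }

    static String summaryLine(int subtotal, int tax) {
        return "subtotal=" + subtotal + " tax=" + tax + " total=" + (subtotal + tax);
    }
}
```

Both versions return the same string for the same input, integer rounding warts and all, which is what makes this a refactoring rather than a fix.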
The inner loop of agile development should go like this: Red, Green, Refactor. Red means you have a test which is not passing. Getting the test to pass is the next step. Green means you are passing all tests. Refactor means... refactor.
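As a rough sketch of one trip around that loop, assume a made-up Greeter class and a plain main method standing in for a real test framework. The checks are written first and fail (Red), greetFirstPass is the quick hack that passes them (Green), and greet is the same behaviour with the duplication squeezed out (Refactor).

```java
// Hypothetical example: Greeter and GreeterTest are illustrative names only.
class Greeter {

    // Green: the first version, hacked together just to get the tests passing.
    static String greetFirstPass(String name) {
        if (name == null) {
            return "Hello, " + "stranger" + "!";
        }
        return "Hello, " + name + "!";
    }

    // Refactor: identical behaviour, duplication removed once the bar is green.
    static String greet(String name) {
        String who = (name == null) ? "stranger" : name;
        return "Hello, " + who + "!";
    }
}

class GreeterTest {

    // Red: these checks exist before the code does, so they fail first.
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        check("Hello, Ada!", Greeter.greet("Ada"));
        check("Hello, stranger!", Greeter.greet(null));
        System.out.println("green");
    }
}
```

The tests do not change between the Green and Refactor steps. If they had to, that would be a hint the change was more than a refactoring.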
Even if you're a good agile developer, doing things as simply as possible, complexity and duplication of common factors creep in while you're trying to pass tests. Everybody hacks. Everyone copies and pastes. This is fine as long as you go back and refactor when you've got the green bar. Sometimes you may need to avoid mentioning this to PHBs, for their own good. Shhh!
Unit tests are really important for refactoring. If you're not doing unit testing you've got a long way to go. A good unit test suite is a necessary precondition for confident, aggressive refactoring. And IMHO a good type system is a necessary precondition for confident, aggressive, automated refactoring. These preconditions can present a quandary for some developers. Legacy systems often have no effective automated tests. And since they're often composed entirely of spaghetti, they need to be refactored. It's a chicken-and-egg situation: where do you start? All I can say here is you start small.
Refactoritis
Can you have too much refactoring? Absolutely. If you're somewhere around middle-stage zealotry for this refactoring stuff, you may not be in danger of copy+pasting your way to a big ball of mud, but you may be prone to exceeding the safe working abstraction load of your language, or to going too far beyond the idioms of your team's codebase or comfort.
Every language has limits imposed by its design and implementation. In Java and C#, for example, the limits are seen by many in the dynamic languages camps to be too much to bear. For example, say you're refactoring some Java or C# code. You might create a new interface with a few alternate concrete implementations and, whereas before you had two methods on a concrete class and a few big if-else blocks, after refactoring you might have three files and more actual lines of source code. It can be somewhat subjective but sometimes you may have more complexity even though you've removed duplication!
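To make that concrete, here is a deliberately tiny and entirely hypothetical shipping-cost example. The "before" is one class with an if-else; the "after" is the interface-plus-implementations shape, which in a real project would typically be spread over several files. Whether the second form is an improvement at this size is exactly the judgment call in question.

```java
// Hypothetical example: all names and numbers here are made up.

// Before: one concrete class, one if-else.
class ShippingBefore {
    static int costInCents(String method, int weightKg) {
        if (method.equals("express")) {
            return 1000 + 200 * weightKg;
        } else {
            return 500 + 100 * weightKg;
        }
    }
}

// After: an interface, one class per branch, and a lookup to choose between them.
interface ShippingRule {
    int costInCents(int weightKg);
}

class ExpressShipping implements ShippingRule {
    public int costInCents(int weightKg) {
        return 1000 + 200 * weightKg;
    }
}

class StandardShipping implements ShippingRule {
    public int costInCents(int weightKg) {
        return 500 + 100 * weightKg;
    }
}

class ShippingAfter {
    static ShippingRule ruleFor(String method) {
        return method.equals("express") ? new ExpressShipping() : new StandardShipping();
    }
}
```

The behaviour is identical, but the "after" has roughly twice the lines and four types to navigate instead of one. With a dozen shipping rules that trade might pay off; with two it probably does not.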
If this happens you have fallen asleep on the refactoring train and missed your station. Often you should just roll back the code and go write a feature. Some duplication is easy to see and cope with, especially if it can fit on one screen and any reader can see the pattern. In other languages (Lisp comes to mind) there are constructs, like macros, which allow you to encapsulate expressions that cannot be elegantly factored in, say, Java. Disclaimer: IANALN; I Am Not A Lisp Nerd.
So the expressiveness of the language can constrain refactorability. Another way of saying this is that the language contains accidental complexity and only factoring out the language can remove that complexity. I should say here that I have recently found Groovy to be a great candidate for doing this on Java projects.
As a more concrete example, lexical closures are a great way of implementing things like the new for loop (for each) introduced in Java 1.5. Functor frameworks that employ anonymous inner classes in Java for similar purposes (e.g. composable transformers, ad hoc iterator delegators instead of explicit looping) often feel too cumbersome compared to most closure implementations. So you just have to suffer the duplication and code bulk.
So in summary: Red, Green, Refactor; don't go overboard; and be aware when your language makes capturing the factors you see in your system worse. Kill complexity before it kills you.
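A small sketch of what that bulk looks like, in Java 5 style. The Transformer interface and map helper below are hand-rolled stand-ins for a functor framework, not any particular library's API, and all the names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a functor framework's transformer type.
interface Transformer<A, B> {
    B transform(A input);
}

class Functors {

    // The looping is factored out once, here.
    static <A, B> List<B> map(List<A> items, Transformer<A, B> transformer) {
        List<B> out = new ArrayList<B>();
        for (A item : items) {
            out.add(transformer.transform(item));
        }
        return out;
    }

    // Functor style: the loop is gone, but the anonymous inner class
    // costs more lines than the loop it replaced.
    static List<Integer> lengthsWithFunctor(List<String> words) {
        return map(words, new Transformer<String, Integer>() {
            public Integer transform(String word) {
                return word.length();
            }
        });
    }

    // Plain for-each: the looping is duplicated wherever it is needed,
    // yet it is shorter and any reader can see the pattern.
    static List<Integer> lengthsWithLoop(List<String> words) {
        List<Integer> lengths = new ArrayList<Integer>();
        for (String word : words) {
            lengths.add(word.length());
        }
        return lengths;
    }
}
```

Both methods return the same list. In a language with lightweight closures the functor version would shrink to about a line, which is the gap being described.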
If you're interested I'll be telling war stories and going into some side issues over on my personal blog.



Copyright © 2009 Atlassian Pty Ltd.

5 Comment(s)
Are there any ways to plan and reduce the complexity, or at least break the problem at hand into smaller chunks of work and do it step by step, so as to have some control and reduce the overall impact of complex refactoring work that has to be completed on a strict timeline?
Any thoughts on this, please?
By guddu at October 10, 2009 9:42 AM
It's a very open-ended question!
Automated test coverage is essential for refactoring. Make sure that you write tests that are not "brittle", which means the implementation of production code can be changed with minimal disruption to the tests.
Remember the business impact of refactoring should be to increase your long-term average speed. Refactoring is not likely to have a "strict timeline" because the business never asks for it. They ask for features. Developers should fit in refactoring as a part of every day work. Some "big refactorings" may require larger time windows. That's a harder question and it depends on your project situation.
Apart from that, baby steps is the way.
By Chris Mountford at October 11, 2009 11:11 PM
Refactoring can be a huge waste of resources. A great deal of it is waste due to a lack of knowledge or experience, especially when code is quickly slapped together without consulting the team on its design. Perhaps someone on the team has some valuable experience to design it better in the first place. For example, I (a lowly bachelors of computer science) was being lectured on the high productivity possible with an advanced degree by such an obviously bright individual (he was good). I asked him to show me an example of what he meant, how he was able to achieve this much greater productivity. He was working on a device driver, that he had no business working on since he did not understand the subtle basics about it! He was going to create a massive set of unnecessary software (but do it superficially quickly using agile techniques). So I let him do a bit of show and tell and then shared my insights into the subject matter and let him benefit from insight that comes from subject matter expertise. After the shock wore off, he abandoned the effort, finding that my already existing solution was more efficiently designed and coded. Understand the domain first, design, code and THEN refactor if necessary. But avoid refactoring due to poor understanding. Some folks are regressing to cowboy programming and skipping the "time-consuming" research and development phase on seemingly easy yet deceptively complex issues.
By Nelson Perez at November 2, 2009 3:18 AM
Hi Nelson,
I don't mind publishing your comment, but it doesn't seem to have anything to do with refactoring. In future try to make your comments relevant to the topic.
By Chris Mountford at November 2, 2009 2:58 PM
The point was that this individual took great pride in the cut/paste code production method of producing massive amounts of code versus producing more streamlined code of a higher level of quality. I was just echoing the sentiment of "Refactoritis" above with a real-world example as one supposed super hero programmer ended up with a "big ball of mud" instead of something much tighter in a much tighter frame of time, which he could have achieved by more careful thought and design to begin with. I counseled him to abandon the cut/paste method mid-way thru his mud ball. He ended up seeing that producing tens of thousands of lines of code and refactoring is not necessarily all that impressive when one can spend a few hours designing and coding a much tighter solution in a much shorter timeframe.
By Nelson at November 17, 2009 8:10 PM