Many software developers want to know techniques that they can immediately apply to make their code readable. A function easily gets bigger when the application grows. We often tend to think “this small modification doesn’t make the function unreadable” at the time when we add a new feature or change the behavior of the function. However, a reviewer may leave a comment on the function where we modify or the author may find its code unreadable 2 months later. I will show 5 techniques that you can immediately apply to your code.
Extract the logic into a function
The proper name explains what it does. Some developers add comments to explain what the logic does without refactoring. Let’s say it’s a bad habit if they often write such a comment. Comments should be written to explain why we choose the way to solve the problem or concerns. In many cases, comments are not necessary if we give a proper name to the logic. It’s time to refactor the code if we want to write a comment on the logic.
Don’t you often see the following code to check something for example?
function startCooking(){
...
if( !existEgg ||
!existBacon ||
!existFryingPan
) {
goShopping();
}
...
}
Multiple conditions are often in if-clause. When there are multiple conditions it’s less readable than one condition. This example is easy to understand but sometimes those conditions are not clear what they check when a numerical value or bitwise data is used. If we extract the conditions into a function it will be more readable. Refactored code is the following.
function startCooking(){
...
if(!isReadyToCookBaconEgg()) {
goShopping();
}
...
function isReadyToCookBaconEgg(){
return !existEgg ||
!existBacon ||
!existFryingPan;
}
}
It’s more readable now. The extracted logic is still in the same function because it is called only in startCooking
function. You can of course move it outside if you want. It’s a personal preference. The intention of the main function is clear and we don’t have to know which items to check for cooking bacon egg. We should only look into the details if something goes wrong. Its details should be hidden until then.
The higher hierarchy the function is, the more abstract the contents should be.
In this case above, foodstuff and kitchenware are the details. We may add oil and black pepper to the function isReadyToCookBaconEgg
in the future but the contents of startCooking
doesn’t change.
Extract function is I think the easiest technique to refactor code. Actually, it’s the most often used technique in my work.
Extract functions into a class
Those who just learned the OOP principle tend to create a class that has only one public function and many private functions called in the public function. Afterward, the class gets bigger and becomes a god class that does everything. Let’s refactor the code before promoting the class to a god.
Let’s assume that the Communicator class needs to create 3 different data. Each format of them is different each other. I think you’ve experienced a similar situation in your work. For example, changing format or adding additional information to adapt the data to a certain protocol. This Communicator class does it. This is the source code for it.
export class Communicator {
public execute() {
const originalData = this.fetchData();
const data1 = this.createSendData1(originalData);
const sendData1 = this.adaptData1(data1);
const data2 = this.createSendData2(originalData);
const sendData2 = this.adaptData1(data2);
const data3 = this.createSendData3(originalData);
const sendData3 = this.adaptData1(data3);
this.sendData([
sendData1,
sendData2,
sendData3,
]);
}
private createSendData1(data: Data): string {
...
}
private adaptData1(data: string): string {
...
}
...
}
This class generates 3 different data and the logic is written in the same class. It means that this class has many chances to be modified. execute
function gets longer when we add a new format and the function gets more complicated. This class shouldn’t know the detail of each function, so let’s refactor it. We can respectively treat these 3 createSendData
and adaptData
functions as one concern if we raise the abstract level of them because it has only one meaning. How can we treat it as one concern? The answer is to make it abstract. We can use either interface or abstract class for that but let’s use interface here. The reason to choose interface is the following.
- there is no common variable used by all derived classes
- there is no common process used by all derived classes
I suggest that you always use interface first. Change it to abstract class if you find code duplication or if you need a common variable for all derived classes. Let’s see the refactoring result.
interface DataCreator {
createData(data: Data): string;
adaptData(data: string): string;
}
class Communicator {
private dataCreators: DataCreator[] = [
new DataCreator1(),
new DataCreator2(),
new DataCreator3(),
];
public execute() {
const originalData = this.fetchData();
const data = this.dataCreators.map(x => {
const createdData = x.createData(originalData);
return x.adaptData(createdData);
});
this.sendData(data);
}
private fetchData(): string {
return "test-data";
}
private sendData(data: string[]) {
console.log(data);
}
}
class DataCreator1 implements DataCreator {
public createData(data: Data) {
return `creator1-${data}`;
}
public adaptData(data: string) {
return `***${data}***`;
}
}
...
Those 3 functions are basically the same. All of them create data, so I defined interface in order to call all functions on the same code. We don’t have to touch the function when we need to add a new format in the future. What we need to do in this case is to add a new class that implements the interface and add the instance to dataCreators . The role of the 3 classes is clear. The communicator class has still some roles that are not only knowing how to communicate but also knowing how to fetch and send data. Its refactoring is the next step but it’s not for this article, so I skip it.
Replace loop with pipeline
Many languages support pipelines. In my opinion, for-loop and while-loop shouldn’t be the first choice because they are less meaningful than a pipeline. Let’s assume that we want to calculate the average score.
const scores = [88, 31, 95, 55, 63];
let sum = 0;
for (let i = 0; i < scores.length; i++) {
sum += scores[i];
}
const average = sum / scores.length;
console.log(`for: ${average}`);
This is a very simple for-loop so it may be clear what it does but I don’t like it. If its process in the loop gets bigger we need to check everything and it’s harder to follow. We can write everything in it. The bad point of this loop is there is room that we can inject different concerns into this loop because the for-loop doesn’t have a certain meaning. It just does loop. If I implement the same functionality I would write in the following way.
const scores = [88, 31, 95, 55, 63];
const sum = scores.reduce((acc: number, cur: number) => acc + cur);
const average = sum / scores.length;
console.log(`pipeline: ${average}`);
reduce
has a clear meaning. It returns a single output value by executing a callback function on each element. There is no room that another concern is in it.
Less assumption leads to less read time.
Check the following article if you think “when can we use such a loop?”
Difference between for, while, forEach, for-in and for-of
Split pipeline to give a proper name
As mentioned above, a pipeline has better than a loop but it sometimes makes code unreadable when it has long chains. It may be data.filter().map().filter().reduce()… . If its calculation in each callback is common to all people it may not be necessary to split it but I recommend splitting its pipeline to make the meaning clear. The following code is to calculate the square sum by using a pipeline.
const array = [1, 2, 3, 4, 5, 6, 7, 8, 9];
const result = array
.filter((x) => x % 2 === 0)
.map((x) => x * x)
.reduce((acc, cur) => acc + cur);
console.log(`square sum: ${result}`);
In this simple example, it’s clear what each line does because we know the concept of even numbers and squares. However, the actual problem that we have to solve in our development is harder than this simple concept. It may be simple but new colleagues may not understand what it is. Even worse, we can’t find a bug here even if one of the callbacks is wrong because its intention is unclear without comment. My solution for this is the following.
const array = [1, 2, 3, 4, 5, 6, 7, 8, 9];
const evenNumber = array.filter((x) => x % 2 === 0);
const evenSquares = evenNumber.map((x) => x * x);
const sum = evenSquares.reduce((acc, cur) => acc + cur);
console.log(`square sum: ${sum}`);
We don’t have to know the detail of each line because the variable names indicate what they are. A follower can understand the logic soon and find a point where he wants to modify it. If the logic in one of the callbacks is wrong we can easily detect the error because it’s clear what the callback wants to return.
Split loop to focus on one thing
The key factor for clean code is to focus on one thing, only one thing at a time. We really often see a loop clause. There are two types of developers. One always considers performance. One considers readability. I think the difference is easily reflected to a loop. Which one do you prefer in the following code?
function calc1() {
let sum = 0;
let sumOfSquare = 0;
for(let i = 1; i <= 10; i++){
sum += i;
sumOfSquare += i * i;
}
console.log(`sum: ${sum}`);
console.log(`sum of square: ${sumOfSquare}`);
}
function calc2() {
let sum = 0;
for(let i = 1; i <= 10; i++){
sum += i;
}
console.log(`sum: ${sum}`);
let sumOfSquare = 0;
for(let i = 1; i <= 10; i++){
sumOfSquare += i * i;
}
console.log(`sum of square: ${sumOfSquare}`);
}
I guess many people say calc1 is better because it’s smaller in this case. calc2 has the same loop twice and its performance is worse than calc1. I agree in this case. How about the following case?
function calc1() {
let sumOfEven = 0;
let sumOfOdd = 0;
let oddSumInMultipleOfThree = 0;
for (let i = 1; i <= 10; i++) {
if (i % 2 === 0) {
sumOfEven += i;
} else {
sumOfOdd += i;
if (i % 3 === 0) {
oddSumInMultipleOfThree += i;
}
}
}
console.log(`even sum: ${sumOfEven}`);
console.log(`odd sum : ${sumOfOdd}`);
console.log(`odd sum in multiple of three : ${oddSumInMultipleOfThree}`);
}
function calc2() {
let sumOfEven = 0;
for (let i = 1; i <= 10; i++) {
if (i % 2 === 0) {
sumOfEven += i;
}
}
console.log(`even sum: ${sumOfEven}`);
let sumOfOdd = 0;
for (let i = 1; i <= 10; i++) {
if (i % 2 !== 0) {
sumOfOdd += i;
}
}
console.log(`odd sum : ${sumOfOdd}`);
let oddSumInMultipleOfThree = 0;
for (let i = 1; i <= 10; i++) {
if (i % 2 !== 0 && i % 3 === 0) {
oddSumInMultipleOfThree += i;
}
}
console.log(`odd sum in multiple of three : ${oddSumInMultipleOfThree}`);
}
I can imagine that the calc1 will get complicated in the future because all calculations are done in a loop. It may overwhelm at a first look even though this is very simple logic. When we need to modify or add new functionality we have to understand everything at the same time. On the other hand, calc2 splits the logic into multiple loops and we can easily understand what they do because one loop has only one concern. It’s important for writing clean code not to use a brain’s working memory because its size is limited. Working memory size depends on person but it is 5 – 7 as far as I know. Our performance increases when there are fewer things to remember at the same time.
When we want to add a new feature we just need to add a new loop without considering the existing code carefully. Some want to say that the performance is worse than one loop! Yes, I know but how much of a performance difference does it make? It doesn’t cause performance issues in most cases. If we find the function slow we can easily detect and improve it if the code is well split because what we need to do is just move the logic into the original loop. It’s better to write a comment in this case to explain why all logic in the loop.
The refactoring above causes code duplication. If you can’t tolerate it you can create a callback function. The example is the following.
function calc3() {
const startLoop = (cb: (i: number) => void) => {
for (let i = 1; i <= 10; i++) {
cb(i);
}
}
let sumOfEven = 0;
startLoop((i) => {
if (i % 2 === 0) {
sumOfEven += i;
}
});
console.log(`even sum: ${sumOfEven}`);
let sumOfOdd = 0;
startLoop((i) => {
if (i % 2 !== 0) {
sumOfOdd += i;
}
});
console.log(`odd sum : ${sumOfOdd}`);
let oddSumInMultipleOfThree = 0;
startLoop((i) => {
if (i % 2 !== 0 && i % 3 === 0) {
oddSumInMultipleOfThree += i;
}
});
console.log(`odd sum in multiple of three : ${oddSumInMultipleOfThree}`);
}
End
We can make our code readable if we consider small points. The software consists of tons of small things. The source code that we write will be read by other people and they may interpret the code in a different way and add new code into improper statements. It’s important to prevent such a thing by using an interface, abstract class, and a function whose purpose or meaning is clear. Let’s start small refactoring on daily basis. If you are looking for a nice book for refactoring I highly recommend the following books.
If you want to go further I recommend this book. It shows a lot of architectural things that we can apply to make our code or overall structure clear. Many topics written in this book are short but they can be a good starting point.
Comments