How to write clean code units (functions and methods)

Code units at the edges of your application should be small, independent and only do one thing.

This is a direct application of many established programming principles, including KISS, separation of concerns, the single responsibility principle and many more.

So, to have "good code", apply this principle where possible.

This article will show you how to apply it. It will also examine why this principle is important and how it makes your code better.

To illustrate the idea, we'll use units at the edge of an application, because they make the point most clearly. After you learn the guidelines, you can apply them to any kind of code unit.

Overall, the main point of this article can be summarized in the following example. Example 1 has a large makeRequest function, which is worse than example 2. In example 2, that function has been separated into two smaller and more specific functions.

Example 1, the bad version:

function main() {
  const data = getData();
  makeRequest(data);
}

function makeRequest(data) {
  if (isValid(data)) {
    fetch('https://myfakeapi.com/', {
      method: 'POST', body: JSON.stringify(data)
    });
  } else {
    fetch('https://errormonitoringservice.com/', {
      method: 'POST', body: JSON.stringify(data)
    });
  }
}

Example 2, the good version:

function main() {
  const data = getData();
  if (isValid(data)) {
    makeRequest(data);
  } else {
    reportError(data);
  }
}

function makeRequest(data) {
  fetch('https://myfakeapi.com/', {method: 'POST', body: JSON.stringify(data)});
}
function reportError(data) {
  fetch('https://errormonitoringservice.com/', {method: 'POST', body: JSON.stringify(data)});
}

Let's examine why example 1 is worse.

Note: In this article, a unit refers to a function / method / module / class. We'll be using functions, but any of them can be used.

Small, independent units

An "edge" unit of code is a fairly small piece of functionality that doesn't have any dependencies. It does some fairly low-level stuff and it doesn't call any other functions to help it. It's at the extremities, the very edges, of your application.

It's safe code that you call to help you do something.

When you call it, you know what it's going to do and you know that it's not going to break anything.

It should be like a well-tested library that you've imported into your project. It does something small and specific and you expect it to work 100% of the time.

To do that, these kinds of units:

  • should be small
  • should only do one small, specific thing
  • should be independent
  • shouldn't have side effects, unless the only purpose of the unit is to perform a side effect

Examples of good code units

Here are some examples of these kinds of good units:

function add(a, b) {
  return a + b;
}

function getProperty(object, propertyName) {
  return object[propertyName];
}

function appendElementToBody(element) {
  document.body.append(element);
}

function doubleSpeed(gameObject) {
  gameObject.speed = gameObject.speed * 2;
}

function incrementSpeedDamaged(gameObject) {
  gameObject.speed = gameObject.speed + 0.5;
}
function incrementSpeed(gameObject) {
  gameObject.speed = gameObject.speed + 1;
}

Notice that these units:

  • don't have conditionals (if / else statements)
  • do very little
  • don't read / write to anything except their parameters (except for appendElementToBody, because the document object is a global singleton)
  • only have side effects if they do nothing else

In comparison, here are some units that don't follow these guidelines:

const valueToAdd = 5;
function add(x) {
  return valueToAdd + x;
}

const object = {/* has some properties here*/};
function getProperty(propertyName) {
  return object[propertyName];
}

function appendElementToBody(element) {
  if (element.id === 'foo') {
    return; // do nothing
  }
  document.body.append(element);
}

let shouldDouble = true;
function doubleSpeed(gameObject) {
  if (shouldDouble) {
    gameObject.speed *= 2;
  }
}

function incrementSpeed(gameObject, isDamaged) {
  if (isDamaged) {
    gameObject.speed += 0.5;
  } else {
    gameObject.speed += 1;
  }
}

We'll examine each of them in detail, including what makes them good or bad.

But first, let's examine the advantages and disadvantages of the guidelines in general. What are the benefits that you gain from the good code examples, rather than the bad ones?

Benefits of good code units

If you follow the guidelines, you get the benefits of good code. Things such as:

  • code that is easy to understand
  • code that works correctly, predictably, without unintended consequences
  • code that is easy to reuse
  • code that is easy to change
  • code that is easy to test

If you use the bad versions, you get the opposite. Things such as:

  • code that is harder to understand
  • code that is not predictable, can have unintended consequences, is harder to track and easier to get wrong
  • code that is not reusable
  • code that is brittle and hard to change
  • code that's much harder to test

Next, let's see how the examples given affect these benefits / disadvantages.

Examining examples of code units and their benefits

Let's go through each example one by one. Some will be more trivial and faster than others.

Example: add

The add function is trivially simple.

function add(a, b) {
  return a + b;
}

However, it showcases the point of good units well. This function is:

  • extremely simple to understand
  • reusable every time you need it
  • extremely easy to test

One thing you may be wondering is "so what?" Why should you have an add function when you can just add things inline when you need to?

Well, let's just say that there are many valid reasons to have one. For example, you may need to pass it into a higher order function like map, or to use partial application.
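As a minimal sketch of those two uses, the snippet below passes add to reduce and builds a partially applied version with bind. The names total, addFive and shifted are made up for this illustration.

```javascript
function add(a, b) {
  return a + b;
}

// Pass add directly to a higher-order function.
const total = [1, 2, 3].reduce(add, 0);

// Partial application: fix the first argument to 5.
// addFive is a hypothetical name used only for this sketch.
const addFive = add.bind(null, 5);

// map also passes the index and the array, but add ignores extra arguments.
const shifted = [1, 2, 3].map(addFive);

console.log(total);   // 6
console.log(shifted); // [ 6, 7, 8 ]
```

Neither use would be possible with an inline a + b expression, which is one reason a tiny named function can still earn its place.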

In addition, add just showcases the principle. Instead of add you might have some real functionality which works exactly like add internally. For example, you may have a function formUserGreeting(username, userFlair), which may concatenate (add) the username and userFlair together.

Here is the bad version of the add code:

const valueToAdd = 5;
function add(x) {
  return valueToAdd + x;
}

This version is much worse.

For starters, it has a weird signature which you might not expect. If you were working in some file foo and you imported this function to use it, you probably wouldn't remember or expect it to work the way that it does. It would confuse you for a moment until you examined the function closer.

This breaks the principle of least astonishment (one of the fundamental principles). When something works differently to how you expect, it's easy to create bugs.
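Here is a quick sketch of that surprise, assuming a caller who expects add to sum its two arguments. The variable result is only there for the illustration.

```javascript
const valueToAdd = 5;
function add(x) {
  return valueToAdd + x;
}

// A caller who assumes add(a, b) sums both arguments gets a surprise:
// the second argument is silently ignored and 5 is added instead.
const result = add(2, 3);

console.log(result); // 7, not 5
```

JavaScript doesn't complain about the extra argument, so nothing warns you that the call is wrong.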

This function is also more difficult to understand. You have to spend additional time to read the source code of this function before you understand how it works.

Also, it's not reusable. It always adds 5 to the number you provide. This means that you can never reuse it unless you want to add 5.

So overall, it's much worse.

To create the good version, make sure that the function only accesses its local scope. It should receive everything it needs to work as an argument. It shouldn't access anything else.

Finally, it takes no effort to have the better version, so you might as well have it.

Example: getProperty

Next is the getProperty example.

Here is the code for the good version:

function getProperty(object, propertyName) {
  return object[propertyName];
}

Here is the code for the bad version:

const object = {/* has some properties here*/};
function getProperty(propertyName) {
  return object[propertyName];
}

The benefits / disadvantages are the same as the add example.

The good version is:

  • 100% predictable
  • easy to understand
  • easy to reuse
  • easy to test

The bad version has a signature that a developer may not expect until they look at the code. It's also not reusable if you want to work with a different object.

To get the good version, write the function in a way where it doesn't read anything outside of its local scope.
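To make the reuse point concrete, here is a small sketch that calls the good version with two unrelated objects. The user and settings objects are hypothetical and exist only for this example.

```javascript
function getProperty(object, propertyName) {
  return object[propertyName];
}

// Hypothetical objects, used only for this sketch.
const user = { name: 'Ada' };
const settings = { theme: 'dark' };

console.log(getProperty(user, 'name'));      // Ada
console.log(getProperty(settings, 'theme')); // dark
console.log(getProperty(user, 'theme'));     // undefined
```

Because the object is an argument, the same function works for any object you hand it, and a test can pass in whatever object it likes.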

Example: appendElementToBody

Now we're starting to examine functions that may seem more realistic. These are functions that you probably have in your codebase (or something similar to them).

Here is the good version:

function appendElementToBody(element) {
  document.body.append(element);
}

Here is the bad version:

function appendElementToBody(element) {
  if (element.id === 'foo') {
    return; // do nothing
  }
  document.body.append(element);
}

The second version of the code is concerning. It has a conditional that's not obvious to a user of the function unless they look at its source code.

Consider, if you use a function named appendElementToBody, what would you expect it to do?

You would probably expect it to append an HTML element to the body element, 100% of the time, not just some of the time.

Also consider, when you import a library to use in a project, you expect it to do what it says on the tin. You don't expect it to have hidden conditions where it sometimes does what you expect, other times it doesn't do anything and other times it does something different altogether.

The problem with this code is the following scenario:

Tomorrow, you realise that you have a bug in your program. It turns out that whenever a user creates a particular todo list item, it doesn't get added to the DOM. Maybe it doesn't get added to the database either (you may have a similar condition there).

In this situation, unless you specifically remember how appendElementToBody works (read: you already know where the bug is), it will probably take you a few hours to find the bug.

Most likely, you're going to trace the code from the start, from where the user clicks "submit" for the new todo. appendElementToBody is the last function that gets run, so you might not examine it for a long time.

Now, this example is very small and trivial. It's unlikely that you'll run into trouble for checking if an element has an ID of foo.

But it's not difficult to see how something like this can become a problem under different circumstances. You may have more complicated conditions. You may also have conditions in many functions all over your codebase.

Something, at some point, will cause a bug. In the meantime, there could already be bugs without anyone realising.

Anyway, that's enough of a rant. The point is, don't do this.

Possible improvements

Your unit functions should be 100% predictable and do one small thing. They shouldn't have conditionals in them. That's not their responsibility or where that conditional logic should be.

Most of all, they shouldn't have implicit (unexpected and non-obvious) conditions like this.

Explicit conditionals are at least predictable. Something like this would be better:

function appendElementToBody(element, excludedSelectors) {
  for (let i = 0; i < excludedSelectors.length; i++) {
    const selector = excludedSelectors[i];
    // check the element being appended, not the whole document
    if (element.matches(selector)) {
      return; // exit the function and do nothing
    }
  }
  document.body.append(element);
}

A better option may be to change the name of the function so its functionality is obvious:

function maybeAppendElementToBody(element, excludedSelectors) {
  for (let i = 0; i < excludedSelectors.length; i++) {
    const selector = excludedSelectors[i];
    // check the element being appended, not the whole document
    if (element.matches(selector)) {
      return; // exit the function and do nothing
    }
  }
  document.body.append(element);
}

In this version, the function acts predictably. It does nothing for particular selectors, but at least you expect that.

But, for the best improvements, consider:

  • rethinking your program design so you don't need the condition
  • putting the condition in a higher-level function. "Move the logic up", so to speak, to a more appropriate place.

For example, you could have something like this:

// Extremely simple TODO creator with very basic code

const todos = [];

function handleNewTodoSubmit(event) {
  event.preventDefault();

  // read the DOM to see what the user has typed as the TODO title
  const title = document.querySelector('#todo-input').value;

  // condition is checked here (albeit slightly altered to the original)
  if (!doesTodoTitleAlreadyExist(todos, title)) {
    const todo = createTodoObject(title);
    todos.push(todo);
    displayTodo(todo);
  }
}

function doesTodoTitleAlreadyExist(todos, title) {
  function hasTargetTitle(todo) {
    return todo.title === title;
  }
  return todos.some(hasTargetTitle); // returns true if any todo in the array has the same title
}

function createTodoObject(title) {
  return { title };
}

function displayTodo(todo) {
  const todoElement = createTodoElement(todo);
  appendElementToBody(todoElement);
}

function createTodoElement(todo) {
  const todoElement = document.createElement('div');
  todoElement.id = todo.title;
  todoElement.textContent = todo.title;
  return todoElement;
}

function appendElementToBody(element) {
  document.body.append(element);
}

const todoForm = document.querySelector('#todo-form');
todoForm.addEventListener('submit', handleNewTodoSubmit);

In this example code, every function, including appendElementToBody, does what you expect 100% of the time.

The validation of the todo was moved from appendElementToBody to handleNewTodoSubmit. This is a much more appropriate place for it.

The correct way to think about it is that the todo should not be created if it already exists. That's the domain of the handleNewTodoSubmit function, not of the appendElementToBody function.

In other words, the check is now at a place where you would expect it to be. This means that debugging will be easier if there is a problem, because you'll find the relevant code faster.

Example: doubleSpeed

Code for the good version of doubleSpeed:

function doubleSpeed(gameObject) {
  gameObject.speed = gameObject.speed * 2;
}

Code for the bad version of doubleSpeed:

let shouldDouble = true;
function doubleSpeed(gameObject) {
  if (shouldDouble) {
    const currentSpeed = gameObject.speed;
    gameObject.speed = currentSpeed * 2;
  }
}

This example is the same as the appendElementToBody example.

doubleSpeed should do what it says on the tin. It shouldn't have implicit conditions where it does what you expect sometimes and nothing at other times. That's unexpected and can only lead to trouble.

Instead, some code higher up should decide if it needs to call it in the first place. Then it can either call it or not call it.
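A minimal sketch of that idea follows. The handlePowerUpCollected function, the powerUp object and its type values are all hypothetical; they only stand in for "some code higher up".

```javascript
function doubleSpeed(gameObject) {
  gameObject.speed = gameObject.speed * 2;
}

// Hypothetical higher-level function: it owns the decision,
// so doubleSpeed can stay unconditional.
function handlePowerUpCollected(gameObject, powerUp) {
  if (powerUp.type === 'speed-boost') {
    doubleSpeed(gameObject);
  }
}

const player = { speed: 10 };

handlePowerUpCollected(player, { type: 'shield' });
console.log(player.speed); // 10

handlePowerUpCollected(player, { type: 'speed-boost' });
console.log(player.speed); // 20
```

The condition is still there, but it now lives where a reader would look for it, and it depends on an argument rather than on hidden module state.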

The benefits of the good version of the code are that it's:

  • predictable, easy to track and less likely to have weird bugs that depend on weird state and time
  • easy to understand
  • reusable. You can reuse this function anywhere in the codebase. However, you can't reuse the bad version unless you need the exact same condition.
  • easy to test. The bad version is practically impossible to test (because your test file can't modify the variable shouldDouble, unless you do a lot of work to circumvent that).
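To show how little a test of the good version needs, here is a hand-rolled check. It's only a sketch: testDoubleSpeed is a made-up name, and a real project would more likely use a test runner.

```javascript
function doubleSpeed(gameObject) {
  gameObject.speed = gameObject.speed * 2;
}

// A tiny hand-rolled test: build an input, call the unit, check the result.
// No setup of hidden state is needed.
function testDoubleSpeed() {
  const gameObject = { speed: 3 };
  doubleSpeed(gameObject);
  if (gameObject.speed !== 6) {
    throw new Error('Expected speed 6, got ' + gameObject.speed);
  }
  return 'doubleSpeed: ok';
}

console.log(testDoubleSpeed()); // doubleSpeed: ok
```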

Example: incrementSpeed

This example showcases why you should avoid having Boolean parameters.

Here is the good version of the code:

function incrementSpeedDamaged(gameObject) {
  gameObject.speed = gameObject.speed + 0.5;
}
function incrementSpeed(gameObject) {
  gameObject.speed = gameObject.speed + 1;
}

Here is the bad version of the code:

function incrementSpeed(gameObject, isDamaged) {
  if (isDamaged) {
    gameObject.speed += 0.5;
  } else {
    gameObject.speed += 1;
  }
}

Does the Boolean parameter matter?

Yes, it does. Not a tremendous amount in this example, but it's definitely worse.

One problem with Boolean parameters is that they multiply the number of code paths in the function. In other words, there is an if / else statement in there.

For example:

function example(booleanParameter) {
  if (booleanParameter) {
    doSomething();
  } else {
    doSomethingElse();
  }
}

Every additional Boolean parameter can double the number of possible code paths.

For example, with two Boolean parameters, this is what the code may look like. Pay particular attention to the sendData function:

function sendData(data, isValid, isDataFormatted) {
  if (isValid) {
    if (!isDataFormatted) {
      data = formatData(data);
    }
    fetch('https://myfakeapi.com', {method: 'POST', body: JSON.stringify(data)});
  } else {
    if (!isDataFormatted) {
      data = formatInvalidData(data);
    }
    fetch('https://myfakeapi.com/errors', {method: 'POST', body: JSON.stringify(data)});
  }
}

function formatData(data) {
  return data.split('');
}

function formatInvalidData(data) {
  return 'Error: ' + data;
}

function main() {
  const data = '123'; // get data from somewhere
  const isDataValid = validateData(data);
  const isDataFormatted = false;
  sendData(data, isDataValid, isDataFormatted);
}

The sendData function is quite complicated. It's difficult to understand and read through it. It has nested conditionals, which make code harder to understand and work with.

It's also not reusable, unless you need those exact conditions and arguments elsewhere. In particular, if you need more conditions tomorrow, you'll need to add even more code to sendData to handle them. This means that sendData may grow over time, and become even more complicated.

It's also difficult to test. You need tests covering each possible code path.
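To get a feel for how fast the test matrix grows, here is a small sketch that enumerates every combination of a set of Boolean flags. The helper allFlagCombinations is hypothetical and not part of the code above.

```javascript
// Hypothetical helper: returns one object per combination of true / false
// values for the given flag names. Each new flag doubles the list.
function allFlagCombinations(flagNames) {
  let combinations = [{}];
  for (const name of flagNames) {
    const next = [];
    for (const combination of combinations) {
      next.push({ ...combination, [name]: false });
      next.push({ ...combination, [name]: true });
    }
    combinations = next;
  }
  return combinations;
}

const twoFlags = allFlagCombinations(['isValid', 'isDataFormatted']);
const threeFlags = allFlagCombinations(['isValid', 'isDataFormatted', 'needsSpecialFormatting']);

console.log(twoFlags.length);   // 4
console.log(threeFlags.length); // 8
```

Not every combination necessarily leads to a distinct code path, but each one is a case that a thorough test suite has to at least consider.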

In short, it's difficult to work with and it can become even more complicated in the future.

The better version is to have simple unit functions, that only do one thing, without conditionals. For example:

function sendData(data) {
  fetch('https://myfakeapi.com', {method: 'POST', body: JSON.stringify(data)});
}
function reportDataError(data) {
  fetch('https://myfakeapi.com/errors', {method: 'POST', body: JSON.stringify(data)});
}
function formatData(data) {
  return data.split('');
}
function formatInvalidData(data) {
  return 'Error: ' + data;
}
function main() {
  const data = '123'; // get data from somewhere
  const isDataValid = validateData(data);
  if (isDataValid) {
    const formattedData = formatData(data);
    sendData(formattedData);
  } else {
    const formattedData = formatInvalidData(data);
    reportDataError(formattedData);
  }
}

Notice that the sendData function is now trivially simple.

You may be thinking "but those conditions just moved to the main function, isn't that the same thing?" That's a fair argument. However, this code still has some advantages. In this version:

  • the unit functions are simple and easy to understand
  • the unit functions are reusable all over the codebase. If you need to handle new conditions, you can handle them in a different high-level function like main and still reuse the small unit functions.
  • the unit functions are trivial to test
  • the program in general is easier to modify or extend if you need more functionality

A more important reason is how the good version of the code can grow tomorrow, versus the bad version of the code.

For example, if new conditions arise tomorrow, the good version of the code may end up like this:

// We've kept the unit functions like sendData, but they're omitted for brevity

// More simple functions for new use-cases
function formatDataADifferentWay(data) {}
function validateSpecialData(data) {}

function main1() {
  const data = '123'; // get data from somewhere
  const isDataValid = validateData(data);
  if (isDataValid) {
    const formattedData = formatData(data);
    sendData(formattedData);
  } else {
    const formattedData = formatInvalidData(data);
    reportDataError(formattedData);
  }
}

function main2() {
  const data = '123'; // get data from somewhere, it should always be valid
  const speciallyFormattedData = formatDataADifferentWay(data);
  sendData(speciallyFormattedData);
}

function main3() {
  const data = '123'; // get data from somewhere
  const isDataValid = validateSpecialData(data);
  if (isDataValid) {
    const formattedData = formatData(data);
    sendData(formattedData);
  } else {
    const formattedData = formatInvalidData(data);
    reportDataError(formattedData);
  }
}

This is quite good.

The unit functions we had are still 100% the same. We handle the new conditions in the different main functions which aren't too complicated. For new, specific functionality, we've created the new unit functions validateSpecialData and formatDataADifferentWay. (We've omitted the implementations for brevity.)

However, the bad version of the code wouldn't fare so well. Every new condition would be handled in sendData. As a result, sendData would become much more complicated.

Consider this example where we add a Boolean parameter needsSpecialFormatting. It's a flag which says that we should format the data in a different way:

function sendData(data, isValid, isDataFormatted, needsSpecialFormatting) {
  if (isValid) {
    if (!isDataFormatted) {
      if (needsSpecialFormatting) {
        data = formatDataADifferentWay(data);
      } else {
        data = formatData(data);
      }
    }
    fetch('https://myfakeapi.com', {method: 'POST', body: JSON.stringify(data)});
  } else {
    if (!isDataFormatted) {
      if (needsSpecialFormatting) {
        data = formatDataADifferentWay(data);
      } else {
        data = formatInvalidData(data);
      }
    }
    fetch('https://myfakeapi.com/errors', {method: 'POST', body: JSON.stringify(data)});
  }
}

function main1() {
  const data = '123'; // get data from somewhere
  const isDataValid = validateData(data);
  const isDataFormatted = false;
  sendData(data, isDataValid, isDataFormatted, false);
}

function main2() {
  const data = '123'; // get data from somewhere, it will always be valid
  // sendData applies the special formatting itself, based on the flags
  sendData(data, true, false, true);
}

function main3() {
  const data = '123'; // get data from somewhere
  const isDataValid = validateSpecialData(data);
  if (isDataValid) {
    sendData(data, true, false, false);
  } else {
    sendData(data, false, false, false);
  }
}

As you can see, with one more Boolean parameter, sendData is becoming much more complicated. Things would become even worse as more parameters are added.

On top of that, even the call for sendData(data, true, false, false) is difficult to look at. It's a mental exercise trying to match each Boolean to the parameter it represents. It's possible to improve this by making sendData accept an object instead, but it's still more effort than the simple version.
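As a rough sketch of that object-based alternative, the version below takes the flags as named properties. The branching is the same as before; only the call site changes. The body of formatDataADifferentWay is a placeholder, since its real implementation is omitted in this article.

```javascript
function formatData(data) {
  return data.split('');
}

function formatInvalidData(data) {
  return 'Error: ' + data;
}

// Placeholder implementation, assumed only for this sketch.
function formatDataADifferentWay(data) {
  return data.toUpperCase();
}

// Same logic as the positional version, but the flags arrive as named properties.
function sendData(data, { isValid, isDataFormatted, needsSpecialFormatting }) {
  let payload = data;
  if (!isDataFormatted) {
    if (needsSpecialFormatting) {
      payload = formatDataADifferentWay(data);
    } else {
      payload = isValid ? formatData(data) : formatInvalidData(data);
    }
  }
  const url = isValid ? 'https://myfakeapi.com' : 'https://myfakeapi.com/errors';
  return fetch(url, { method: 'POST', body: JSON.stringify(payload) });
}

function main() {
  // Each flag is labelled at the call site, so there is nothing to count.
  return sendData('123', {
    isValid: true,
    isDataFormatted: false,
    needsSpecialFormatting: false,
  });
}
```

The call is easier to read, but sendData itself is just as complicated as before, which is why splitting it up is still the better fix.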

In addition, what sendData does may be unexpected at first glance by a programmer who's not familiar with the code. As mentioned earlier, a programmer would expect that function to send some data and call it a day, not to do anything else. After all, the function's name is sendData, not send_data_if_valid_otherwise_report_error_and_also_format_the_data_if_needed (written in snake case here to make it easier to read).

Finally, this function is breaking many of the programming principles, because:

  • it does many things, which breaks separation of concerns / single responsibility principle
  • it's not simple, which breaks KISS
  • it has many conditions with logic coupled together, which makes it more error prone to change. This breaks the goal of programming principles themselves, which is that code should be easy to work with.
  • it's not reusable for different conditions unless you add even more logic. This breaks the open-closed principle.

So instead, prefer small unit functions that only do one thing. If you have to pass a Boolean to a function, consider splitting it into two functions instead. One will handle the true case and the other will handle the false case.
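Applied to the incrementSpeed example, the split might look like the sketch below. The accelerate function and the isDamaged property on the game object are assumptions made for this illustration.

```javascript
function incrementSpeedDamaged(gameObject) {
  gameObject.speed = gameObject.speed + 0.5;
}

function incrementSpeed(gameObject) {
  gameObject.speed = gameObject.speed + 1;
}

// Hypothetical higher-level caller: it picks which unit to call.
// Assumes game objects carry an isDamaged property.
function accelerate(gameObject) {
  if (gameObject.isDamaged) {
    incrementSpeedDamaged(gameObject);
  } else {
    incrementSpeed(gameObject);
  }
}

const healthy = { speed: 1, isDamaged: false };
const damaged = { speed: 1, isDamaged: true };

accelerate(healthy);
accelerate(damaged);

console.log(healthy.speed); // 2
console.log(damaged.speed); // 1.5
```

Each unit function can now be called and tested on its own, and the decision lives in one place in the caller.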

Linking back to programming principles

The main thing to keep in mind is that these guidelines are just applications of the core programming principles. That includes KISS, the principle of least astonishment, separation of concerns / single responsibility principle and handling side effects well.

All of those principles point towards functions that tend to be small, only do one thing, are reusable, easy to understand, easy to change and easy to test.

Additionally, someone who understands those principles well would naturally create code units like the ones described in this article.

So the point of this article isn't necessarily to be prescriptive about how to create small units. Instead, think of it as an example for how to apply those principles in this situation.

In other words, it's a specific use-case to help you become more familiar with these principles in general. That way, you can apply them everywhere, without having to learn how to handle an infinite number of individual use-cases like this one.

So, to write even better code, I recommend looking at the programming principles more closely. To do that, you can have a look at clean code and programming principles - the ultimate beginner's guide, which is a crash course on some fundamental programming principles.

Applying these guidelines to other code units

We examined functions at the edge of an application because those can afford to be simple. Other functions may be more complicated.

As shown in the examples, higher-level functions can have conditionals and they can be longer.

As nice as it would be to avoid conditionals altogether, that's just not possible.

Every real program needs to do different stuff under different circumstances. The very best case is to format your conditionals differently and to put them in a more appropriate place, so that they're easy to work with.

Also, it's not possible for all of your functions to truly only do one small thing. The only functions with that luxury tend to be the functions at the very edge of your application. For everything else, it's more likely that they'll do a few things, say, 3 things, at an appropriate level of abstraction, in a way that it can be described as one thing.

For example:

function handleFormSubmit(event) {
  event.preventDefault(); // necessary to handle form submission with JavaScript, rather than HTML
  const data = getDataFromForm();
  const formattedData = formatData(data);
  sendData(formattedData);
}

The handleFormSubmit function does 4 things. It has 4 lines of code after all. However, you can also think of it as doing one thing. "It handles the form submission", that's one thing. Both are correct, it depends on what level of abstraction you consider.

So, since you can't just avoid conditionals and since your functions can't just do one thing, what can you do? All you can do is apply programming principles. A.k.a. do anything you can to ensure that your code is correct and easy to change.

At any given time, consider if your code is:

  • easy to understand
  • easy to reuse
  • easy to change
  • easy to test

Be pragmatic

As always, remember to be pragmatic. In this article, we examined how to write and structure good code units at the edge of your application.

That's the ideal, but the ideal might not always be realistic. If you can write code units like this without much effort, then do it. But if that's not possible, well, don't postpone a critical feature by 1 month because you want to refactor every code unit in your codebase. That wouldn't make sense or be realistic.

Instead, do the best you can and be pragmatic. You probably need to balance good code (which makes future development easier) and releasing features in an appropriate timeframe for your work.

Final notes

That's it for this article.

I hope you found it useful and I hope that the concepts and the reasons for them made sense.

What are your opinions on how code units should be, particularly units at the "edge" of your application? Is there anything you disagree with? Anything that wasn't covered? If there is anything, please leave a comment below.

Otherwise, for more details on how to write clean code and apply programming principles, please check out clean code and programming principles - the ultimate beginner's guide.