Refactoring by Breaking Functions Apart: a TypeScript Experiment

Learn how to refactor a complex TypeScript function by breaking it into a composition of simpler ones.

In this article, I'd like to share some of what I've learned while working in Clojure and try to backport some of the principles and practices it encourages to TypeScript.

Keeping Complexity Under Control

Clojure's author, Rich Hickey, has been giving a lot of wonderful presentations. Design, Composition, and Performance is still among my favorite ones, and in this article, we're going to apply some of the points he underlines in his talk:

  • Design means to break things apart in such a way that they can be put back together.
  • Whenever there is an architectural problem, it's likely because we haven't broken things apart enough.

These sound very generic and reasonable. Still, while in Clojure such practices are enforced, at least to some extent, it's very much possible to drift away from what good software should look like when using less opinionated languages.

"Programming languages don't differ much in what they make possible, but in the kind of mistakes they make impossible." (Mario Fusco)

Sometimes derailing from the "good path" is necessary. More often, though, it's us losing the right track.

Complexity accumulates over time. If it is not kept under control, a project can end up with a time per feature measured in weeks rather than days and with people who are unhappy about maintaining it, until it reaches the breaking point where the cost of rewriting the entire system is less than or equal to the cost of keeping the current one.

While this article is not going to be a solution to the problem, it will hopefully serve as a starting point to reason about how your current codebase is organized. If you find this interesting and you want to learn more, I highly recommend watching the talks by J. B. Rainsberger, where he does a great job analyzing the big-picture problem of managing applications' complexity.

Analyzing an Entangled Function

To keep things practical, we're going to explore a function that is so entangled that neither testing it nor using it is simple or enjoyable. Once we have analyzed it and noted the pain points, we will make it better through an iterative refactoring.

Suppose we are writing a text processor whose job is to give small hints about commas and conjunctions, with the following rules:

  1. If a sentence has a comma (,), there must be a conjunction after it.
  2. If a sentence has a comma (,) but the following conjunction is and, we will hint the user to remove the comma.

For rule 1, we're going to rely on https://dictionaryapi.dev, which offers a neat HTTP API that we can use to query for terms and get the part of speech (article, conjunction, noun) from the response.

You can find the source code of this project on this GitHub repository. Feel free to clone down the repository and play with the code.

Note: The sample project requires Node.js 12.x.

Consider the following code from the src/index.ts file of the sample application:

//src/index.ts
import axios from 'axios';
import { promises as fs } from 'fs';

type ApiResult = Array<{ meanings: Array<{ partOfSpeech: string }> }>;

async function isConjunctionFn(term: string) {
  const result = await axios.get<ApiResult>(
    `https://api.dictionaryapi.dev/api/v2/entries/en/${encodeURIComponent(term)}`
  );

  const agg = result.data.flatMap(term => term.meanings.map(m => m.partOfSpeech));

  return agg.some(m => m === 'conjunction');
}

export async function processWord(
  word: string,
  isConjunction: (term: string) => Promise<boolean> = isConjunctionFn
) {
  if (word === 'and') {
    console.warn(`Consider removing the comma before '${word}'`);
    return true;
  } else {
    const isConjunctionTerm = await isConjunction(word);

    if (!isConjunctionTerm) {
      console.warn(`Consider adding a conjunction before '${word}'`);
    }
  }
}

async function program() {
  const data = await fs.readFile('./document.txt', 'utf-8');
  const words = data.split(' ');
  let acc = '';

  words.forEach(async word => {
    if (acc.endsWith(',')) processWord(word);

    acc = acc.concat(' ', word);
  }, '');
}

if (require.main === module) program();

This file has two main functions. The first is program, whose job is to read the content of the document.txt file from the disk, split its content using a space as a separator, and then iterate on all the words to call the processWord function.

Meanwhile, the processWord function checks whether the current word is and. If so, it emits a warning directly. Otherwise, it requests the data from the API by using the isConjunction function.

The Dictionary API returns an array of possible meanings:

{
  "word": "for",
  "meanings": [
    {
      "partOfSpeech": "preposition",
      "definitions": [
        {
          "definition": "In support of or in favor of (a person or policy)",
          "example": "they voted for independence in a referendum",
          "synonyms": ["on the side of", "pro"]
        }
      ]
    },
    {
      "partOfSpeech": "conjunction",
      "definitions": [
        {
          "definition": "Because; since.",
          "example": "he felt guilty, for he knew that he bore a share of responsibility for Fanny's death"
        }
      ]
    }
  ]
}

We're going to map over them and check whether at least one has a partOfSpeech value equal to conjunction. Depending on the result, and on whether a comma precedes the word we're currently evaluating, we'll emit a warning if necessary.

After installing its dependencies, let's run the program and see what the output is:
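To make the check concrete, here is a minimal sketch that runs the same idea against a hard-coded payload instead of a live response. The sampleResponse value and the helper names partsOfSpeech and hasConjunction are made up for illustration; only the ApiResult shape comes from the code above.

```typescript
// Subset of the response shape used in the article.
type ApiResult = Array<{ meanings: Array<{ partOfSpeech: string }> }>;

// Hypothetical, hard-coded stand-in for the response to a lookup of "for".
const sampleResponse: ApiResult = [
  { meanings: [{ partOfSpeech: 'preposition' }, { partOfSpeech: 'conjunction' }] },
];

// Collect every part of speech across all returned entries.
const partsOfSpeech = (result: ApiResult): string[] =>
  result.reduce<string[]>(
    (all, entry) => all.concat(entry.meanings.map(m => m.partOfSpeech)),
    []
  );

// True when at least one meaning is tagged as a conjunction.
const hasConjunction = (result: ApiResult): boolean =>
  partsOfSpeech(result).some(p => p === 'conjunction');

console.log(partsOfSpeech(sampleResponse)); // [ 'preposition', 'conjunction' ]
console.log(hasConjunction(sampleResponse)); // true
```

Since nothing here touches the network, the check can be tried out on any payload we care to write by hand.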

$ npm start

Consider removing the comma before 'and'
Consider adding a conjunction before 'over'

Let's try to write some tests for the program using Jest:

// src/__tests__/index.test.ts
import { processWord } from '../index';

describe('#processWord', () => {
  // processWord is async, so we assert on the resolved value rather than on the promise itself
  it('returns true when fed with and', () => {
    return expect(processWord('and', jest.fn().mockResolvedValue(true))).resolves.toBeTruthy();
  });

  it('returns a falsy value when fed with a word that is not a conjunction', () => {
    return expect(processWord('hello', jest.fn().mockResolvedValue(false))).resolves.toBeFalsy();
  });

  it('does not call the API when fed with and', async () => {
    const conjSpy = jest.fn().mockResolvedValue(false);

    const result = await processWord('and', conjSpy);

    expect(conjSpy).not.toHaveBeenCalled();
    expect(result).toBeTruthy();
  });
});

You can run those tests through the npm test command. So, the program works, and we also have successful tests.

There are some issues with such a code organization, though:

  1. The function processWord claims to take a word as input and do some processing with it. However, it also incorporates a fallback mechanism that makes an API call when necessary.
  2. The same function requires an additional argument: the function used to fetch the data when the processed word is not and.

Additionally, because things are so entangled, there is an upper limit on how well we can test: we are forced to use mocks and go with integration tests, which give less design feedback and let us avoid the pressure of reviewing the solution's design.

I've seen this weird pattern of passing functions around as arguments for some time now, justified by the claim that it is just "dependency injection" and that it improves the system's testability by letting us pass mock functions when testing the code, as we just did in the snippet above.

I don't find this a valid argument. To me, it is more of a duct-tape way to patch something that has been put together incorrectly. That function's real name should be processWordButCallTheAPIIfNecessary.

Whenever there is a problem, it's likely because we haven't broken things apart enough.

With this principle in mind, let's explore some alternatives to "reassemble" this program in a different way.

Refactoring by Breaking Apart

The first thing we're going to take care of is removing the "injected" function argument from the processWord function.

A first idea would be to pass the term definition downloaded from the Web API ahead of time, regardless of the condition. Something along these lines:

// src/index.ts

// ...existing code...

export async function processWord(
  word: string,
  isConjunction: boolean    //👈 changed code
) {
  if (word === 'and') {
    console.warn(`Consider removing the comma before '${word}'`);
    return true;
  } else {
    if (!isConjunction) {    //👈 changed code
      console.warn(`Consider adding a conjunction before '${word}'`);
    }
  }
}

async function program() {
  const data = await fs.readFile('./document.txt', 'utf-8');
  const words = data.split(' ');
  let acc = '';

  words.forEach(async word => {
    //👇 changed code
    const isConjunction = await isConjunctionFn(word);
    if (acc.endsWith(',')) processWord(word, isConjunction);
    //👆 changed code

    acc = acc.concat(' ', word);
  }, '');
}

// ...existing code...

However, this approach would be a waste of resources because the API result is only needed when the current candidate word isn't and (which, by the way, is probably one of the most common words we can find in any text).

Sometimes sacrificing performance or efficiency for readability is a good call. However, for this use case, we're going to assume that fetching the term definition via the API is a very expensive operation that we really want to avoid.

In practice, the Dictionary API we're using has rate limits in place, so calling it unconditionally would not be a viable alternative anyway.
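A rough way to see the difference is to count how many lookups each strategy would trigger for a given text. The sketch below is only an illustration: countLookups and the sample sentence are hypothetical, and no network request is made.

```typescript
// Counts how many dictionary lookups each strategy would need for a text.
// Purely illustrative: nothing here touches the network.
function countLookups(text: string): { unconditional: number; conditional: number } {
  const words = text.split(' ');

  // Strategy from the snippet above: one lookup per word.
  const unconditional = words.length;

  // Original strategy: only look up words that follow a comma and are not "and".
  const conditional = words.filter(
    (word, index) => (words[index - 1] || '').endsWith(',') && word !== 'and'
  ).length;

  return { unconditional, conditional };
}

console.log(countLookups('The code is complex, and slow, because reasons'));
// { unconditional: 8, conditional: 1 }
```

Even on a single sentence, the unconditional version makes several times more requests than the original one, which is exactly what a rate-limited API will not tolerate for long.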

Let's break this function apart and reduce its scope to only work with a specific condition. If the word under examination is not and, we're going to make this fail by returning false, and the caller will have to sort this out. We'll also rename the function since now it's only doing one thing we can clearly identify.

Let's start by extracting the logic that checks if a word is and into a new function:

export function isAnd(word: string): boolean {
  return word.toLowerCase() === 'and';
}

We'll handle the situation where the processed word is not and somewhere else.

Additionally, we're also going to break isConjunctionFn apart by separating the logic from the code that performs the side effect (the API call). So, the current definition of isConjunctionFn can be broken into two functions as shown below:

// 😱 BEFORE ----------
export async function isConjunctionFn(term: string) {
  const result = await axios.get<ApiResult>(
    `https://api.dictionaryapi.dev/api/v2/entries/en/${encodeURIComponent(term)}`
  );
  const agg = result.data.map(term => term.meanings.map(m => m.partOfSpeech)).flat();

  return agg.some(m => m === 'conjunction');
}

// 😀 AFTER ----------
// Side effect only: fetch the raw data for a term
export const fetchDictionaryTerm = (term: string) => axios.get<ApiResult>(
  `https://api.dictionaryapi.dev/api/v2/entries/en/${encodeURIComponent(term)}`
).then(r => r.data);

// Logic only: decide whether the fetched data describes a conjunction
export const isConjunctionFn = (result: ApiResult) => {
  const agg = result.map(term => term.meanings.map(m => m.partOfSpeech)).flat();
  return agg.some(m => m === 'conjunction');
};

Let's now focus on the forEach part of the program function.

You can see that it accumulates state across iterations. We can get rid of that by using reduce, at the cost of dealing with promises since the flow is async. That is a little bit unfortunate, but so is life. We'll live with that for now.

So, you can consider the following code as the equivalent functionality of the loop in the program function:

const result = words.reduce<Promise<string>>(async (prev, word) => {
  const acc = await prev;

  if (acc.endsWith(',')) {
    if (isAnd(word)) {
      // same rule as before: a comma followed by "and" should go
      console.warn('Consider removing the comma');
    } else if (!isConjunctionFn(await fetchDictionaryTerm(word))) {
      // the API is only called when the word is not "and"
      console.warn('Consider adding a conjunction before ' + word);
    }
  }

  return acc.concat(' ', word);
}, Promise.resolve(''));
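If threading a promise through reduce is unfamiliar, the following self-contained sketch shows the mechanics in isolation. The fakeLookup function is a hypothetical stand-in for the API call, and the word list is made up; the point is only that each step awaits the previous accumulator, so the asynchronous work runs one word at a time, in order.

```typescript
// Hypothetical stand-in for a network call: resolves asynchronously.
const fakeLookup = (word: string): Promise<string> => Promise.resolve(word.toUpperCase());

// The accumulator is a promise, so every step first awaits the previous one.
function joinSequentially(words: string[]): Promise<string> {
  return words.reduce<Promise<string>>(async (prev, word) => {
    const acc = await prev; // wait for everything accumulated so far
    const looked = await fakeLookup(word); // then do this step's async work
    return acc === '' ? looked : acc + ' ' + looked;
  }, Promise.resolve(''));
}

joinSequentially(['break', 'things', 'apart']).then(sentence => {
  console.log(sentence); // BREAK THINGS APART
});
```

The initial value has to be a promise as well, which is why the reduce call starts from Promise.resolve('').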

We're not done yet, but I think you're probably understanding where this is going.

This function is still entangling two things together: computing the suggestion AND the side effect of printing the data out. We can break up this function so that it will exclusively return suggestions instead:

// 😱 BEFORE ----------
words.reduce<Promise<string>>(async (prev, word) => {
  const acc = await prev;

  if (acc.endsWith(',')) {
    if (isAnd(word)) {
      console.warn('Consider removing the comma');
    } else if (!isConjunctionFn(await fetchDictionaryTerm(word))) {
      console.warn('Consider adding a conjunction before ' + word);
    }
  }

  return acc.concat(' ', word);
}, Promise.resolve(''));

// 😀 AFTER ----------
const suggestions = await words.reduce<Promise<Suggestion[]>>(async (prev, word, index, array) => {
  const acc = await prev;
  // words are split on spaces, so the comma is attached to the previous word
  const prevWord = array[index - 1];

  if (prevWord && prevWord.endsWith(',')) {
    if (isAnd(word)) {
      return acc.concat([{ type: 'REMOVE', target: 'comma', parent: prevWord }]);
    } else if (!isConjunctionFn(await fetchDictionaryTerm(word))) {
      return acc.concat([{ type: 'REPLACE', target: word, parent: prevWord }]);
    }
  }

  return acc;
}, Promise.resolve([]));

And then we can handle the printing on the screen separately, somewhere else:

suggestions.forEach(s =>
  console.warn(`Consider ${s.type === 'REMOVE' ? 'removing the comma in' : 'adding a conjunction after'} '${s.parent}'`)
);
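The same separation can be pushed one step further by keeping the message formatting pure and leaving only the console.warn call as a side effect. This is an optional sketch, not part of the sample repository: formatSuggestion is a hypothetical helper, and the Suggestion type is assumed to have the shape of the objects built in the snippets above.

```typescript
// Assumed shape, mirroring the objects built in the reducer above.
type Suggestion = { type: 'REMOVE' | 'REPLACE'; target: string; parent: string };

// Pure: turns a suggestion into the text to show, without printing anything.
const formatSuggestion = (s: Suggestion): string =>
  `Consider ${s.type === 'REMOVE' ? 'removing the comma in' : 'adding a conjunction after'} '${s.parent}'`;

// Made-up suggestions, just to exercise the formatter.
const sample: Suggestion[] = [
  { type: 'REMOVE', target: 'and', parent: 'complex,' },
  { type: 'REPLACE', target: 'over', parent: 'slow,' },
];

// The side effect stays at the edge.
sample.map(formatSuggestion).forEach(line => console.warn(line));
// Consider removing the comma in 'complex,'
// Consider adding a conjunction after 'slow,'
```

With this split, the wording of the hints can be checked with plain string assertions, without spying on the console.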

The next problem we have to tame is that our reducer function is not strictly pure: in some conditions, we are hitting the network to fetch the required data. That is yet another example of complecting, that is, entangling, two things together: the decision process and the data retrieval procedure.

We'll break them apart, returning an intent to fetch in case it's required:

const reducer = (prev: Suggestion[], word: string, index: number, array: string[]) => {
  const acc = prev;
  const prevWord = array[index - 1];

  if (prevWord && prevWord.endsWith(',')) {
    if (isAnd(word)) {
      return acc.concat([{ type: 'REMOVE', parent: prevWord, target: word }]);
    } else {
      return acc.concat([{ type: 'FETCH', parent: prevWord, target: word }]);
    }
  }

  return acc;
};

With this change, now our reducer is pure, and, magically, we do not have to deal with Promises anymore. The async keyword has disappeared from the function. We will handle the data retrieval piece somewhere else.
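Because the reducer no longer touches the network, it can be run directly on a sample sentence. The snippet below repeats the reducer so that it is self-contained; the Suggestion type is an assumption about the shape of the returned objects, isAnd is the case-insensitive version extracted earlier, and the sentence is made up.

```typescript
// Assumed shape of a suggestion, based on the objects the reducer returns.
type Suggestion = { type: 'REPLACE' | 'REMOVE' | 'FETCH'; parent: string; target: string };

const isAnd = (word: string): boolean => word.toLowerCase() === 'and';

// Same reducer as above, repeated so the example runs on its own.
const reducer = (prev: Suggestion[], word: string, index: number, array: string[]): Suggestion[] => {
  const prevWord = array[index - 1];

  if (prevWord && prevWord.endsWith(',')) {
    if (isAnd(word)) {
      return prev.concat([{ type: 'REMOVE', parent: prevWord, target: word }]);
    } else {
      return prev.concat([{ type: 'FETCH', parent: prevWord, target: word }]);
    }
  }

  return prev;
};

const words = 'The code is complex, and slow, because reasons'.split(' ');
const suggestions = words.reduce<Suggestion[]>(reducer, []);

console.log(suggestions);
// [
//   { type: 'REMOVE', parent: 'complex,', target: 'and' },
//   { type: 'FETCH', parent: 'slow,', target: 'because' }
// ]
```

The FETCH entry is only a description of work to be done: nothing has been requested yet, and whoever consumes the list decides when and how to do it.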

It is time to review all the changes we've done so far:

export const fetchDictionaryTerm = (term: string) =>
  axios
    .get<ApiResult>(`https://api.dictionaryapi.dev/api/v2/entries/en/${encodeURIComponent(term)}`)
    .then(r => r.data);

export const isConjunctionFn = (result: ApiResult) => {
  const agg = result.map(term => term.meanings.map(m => m.partOfSpeech)).flat();

  return agg.some(m => m === 'conjunction');
};

export function isAnd(word: string): boolean {
  return word.toLowerCase() === 'and';
}

export const reducer = (prev: Suggestion[], word: string, index: number, array: string[]) => {
  const acc = prev;
  const prevWord = array[index - 1];

  if (prevWord && prevWord.endsWith(',')) {
    if (isAnd(word)) {
      return acc.concat([{ type: 'REMOVE', parent: prevWord, target: word }]);
    } else {
      return acc.concat([{ type: 'FETCH', parent: prevWord, target: word }]);
    }
  }

  return acc;
};

With these changes:

  1. We have now successfully broken things apart, with all the pieces doing ONE thing.
  2. We have pushed the side effects away into a single function.

Putting Back Together

We have our functional core. It's now time to put all the pieces back together.

As you may have noticed, throughout the article I've repeatedly said that we'll handle something somewhere else. That somewhere else is the imperative shell, which is the missing piece.

We'll move all the functions above into their own file. We'll call it logic.ts, and its content will be as follows:

// src/logic.ts
export type ApiResult = Array<{ meanings: Array<{ partOfSpeech: string }> }>;

export type Suggestion = {
  type: 'REPLACE' | 'REMOVE' | 'FETCH';
  parent: string;
  target: string;
};

export const isConjunctionFn = (result: ApiResult) => {
  const agg = result.flatMap(term => term.meanings.map(m => m.partOfSpeech));

  return agg.some(m => m === 'conjunction');
};

export const isAnd = (word: string) => word.toLowerCase() === 'and';

export const reducer = (prev: Suggestion[], word: string, index: number, array: string[]) => {
  const acc = prev;
  const prevWord = array[index - 1];

  if (prevWord && prevWord.endsWith(',')) {
    if (isAnd(word)) {
      return acc.concat([{ type: 'REMOVE', parent: prevWord, target: word }]);
    } else {
      return acc.concat([{ type: 'FETCH', parent: prevWord, target: word }]);
    }
  }

  return acc;
};

export function evalTerm(suggestion: Suggestion, term: ApiResult): Suggestion {
  if (isConjunctionFn(term)) return { type: 'REPLACE', target: suggestion.target, parent: suggestion.parent };

  return suggestion;
}

Then we glue the program in the index.ts file:

// src/index.ts

import axios from 'axios';
import { promises as fs } from 'fs';
import { Suggestion, evalTerm, reducer, ApiResult } from './logic';

export const fetchDictionaryTerm = (term: string) =>
  axios
    .get<ApiResult>(`https://api.dictionaryapi.dev/api/v2/entries/en/${encodeURIComponent(term)}`)
    .then(r => r.data);

async function program() {
  const data = await fs.readFile('./document.txt', 'utf-8');
  const words = data.split(' ');

  async function evalRemoteTerm(suggestion: Suggestion): Promise<Suggestion> {
    const termResult = await fetchDictionaryTerm(suggestion.target);

    return evalTerm(suggestion, termResult);
  }

  const suggestions = await Promise.all<Suggestion>(
    words
      .reduce<Suggestion[]>(reducer, [])
      .map(suggestion => (suggestion.type === 'FETCH' ? evalRemoteTerm(suggestion) : suggestion))
  );

  suggestions.forEach(s =>
    console.warn(`Consider ${s.type === 'REMOVE' ? 'removing the comma in' : 'adding a conjunction after'} '${s.parent}'`)
  );
}

if (require.main === module) program();

Testing the logic of our program is now a completely different game, as the following code shows:

// src/__tests__/index.test.ts

import { isAnd, evalTerm, reducer } from '../logic';

describe('#isAnd', () => {
  it.each(['and', 'AND'])('returns true when fed with any AND casing', w => {
    expect(isAnd(w)).toBeTruthy();
  });
});

describe('#evalTerm', () => {
  describe('when the current term is a conjunction', () => {
    const r = evalTerm({ type: 'FETCH', target: 'and', parent: 'plane' }, [
      { meanings: [{ partOfSpeech: 'conjunction' }] },
    ]);
    it('should return a replace suggestion', () => {
      expect(r).toHaveProperty('type', 'REPLACE');
    });
  });

  describe('when the current term is not a conjunction', () => {
    const r = evalTerm({ type: 'FETCH', target: 'hello', parent: 'plane' }, [{ meanings: [{ partOfSpeech: 'noun' }] }]);
    it('should not touch the current suggestion', () => {
      expect(r).toHaveProperty('type', 'FETCH');
    });
  });
});

describe('#reducer', () => {
  describe('when the current part does not end with comma', () => {
    const result = reducer([], 'complex', 1, ['is', 'complex', 'to']);

    it('should not suggest anything', () => {
      expect(result).toHaveLength(0);
    });
  });

  describe('when the current part ends with comma', () => {
    describe('when the current word is AND', () => {
      const result = reducer([], 'and', 2, ['is', 'complex,', 'and']);

      it('should suggest a removal', () => {
        expect(result).toHaveLength(1);
        expect(result[0]).toHaveProperty('type', 'REMOVE');
      });
    });
    
    describe('when the current word is not AND', () => {
      const result = reducer([], 'because', 2, ['is', 'complex,', 'because']);
    
      it('should suggest a fetch', () => {
        expect(result).toHaveLength(1);
        expect(result[0]).toHaveProperty('type', 'FETCH');
      });
    });
  });
});

We have a lot more granularity in testing our system, so we can quickly get feedback about the design of the solution we're building. Last but not least, running small tests, ideally unit tests, is always going to be faster than running an entire set of integration/e2e tests.

Note: The definitions of unit, integration, and e2e tests shift so much that everybody ends up creating their own. For the purpose of this article, anything that has a possible side effect is not classified as a unit test.

Aside: Securing Node.js Applications with Auth0

Securing Node.js applications with Auth0 is easy and brings a lot of great features to the table. With Auth0, we only have to write a few lines of code to get a solid identity management solution, single sign-on, support for social identity providers (like Facebook, GitHub, Twitter, etc.), and support for enterprise identity providers (like Active Directory, LDAP, SAML, custom, etc.).

In the following sections, we are going to learn how to use Auth0 to secure Node.js APIs written with Express.

Creating the Express API

Let's start by defining our Node.js API. With Express and Node.js, we can do this in two simple steps. The first one is to use NPM to install three dependencies: npm i express body-parser cors.

Note: If we are starting from scratch, we will have to initialize an NPM project first: npm init -y. This will make NPM create a new project in the current directory. As such, before running this command, we have to create a new directory for our new project and move into it.

The second one is to create a Node.js script with the following code (we can call it index.js):

// importing dependencies
const express = require('express');
const bodyParser = require('body-parser');
const cors = require('cors');

// configuring Express
const app = express();
app.use(bodyParser.json());
app.use(cors());

// defining contacts array
const contacts = [
  { name: 'Bruno Krebs', phone: '+555133334444' },
  { name: 'John Doe', phone: '+191843243223' },
];

// defining endpoints to manipulate the array of contacts
app.get('/contacts', (req, res) => res.send(contacts));
app.post('/contacts', (req, res) => {
  contacts.push(req.body);
  res.send();
});

// starting Express
app.listen(3000, () => console.log('Example app listening on port 3000!'));

The code above creates the Express application and adds two middleware to it: body-parser to parse JSON requests, and cors to signal that the app accepts requests from any origin. The app also registers two endpoints on Express to deal with POST and GET requests. Both endpoints use the contacts array as some sort of in-memory database.

Now, we can run and test our application by issuing node index in the project root and then by submitting requests to it. For example, with cURL, we can send a GET request by issuing curl localhost:3000/contacts. This command will output the items in the contacts array.

Registering the API at Auth0

After creating our application, we can focus on securing it. Let's start by registering an API on Auth0 to represent our app. To do this, let's head to the API section of our management dashboard (we can create a free account if needed) and click on "Create API". On the dialog that appears, we can name our API "Contacts API" (the name isn't really important) and identify it as https://contacts.blog-samples.com/ (we will use this value later).

Securing Express with Auth0

Now that we have registered the API in our Auth0 account, let's secure the Express API with Auth0. Let's start by installing two dependencies with NPM: npm i express-jwt jwks-rsa. Then, let's create a file called auth0.js and use these dependencies:

const jwt = require('express-jwt');
const jwksRsa = require('jwks-rsa');

module.exports = jwt({
  // Fetch the signing key based on the KID in the header and
  // the signing keys provided by the JWKS endpoint.
  secret: jwksRsa.expressJwtSecret({
    cache: true,
    rateLimit: true,
    jwksUri: `https://${process.env.AUTH0_DOMAIN}/.well-known/jwks.json`,
  }),

  // Validate the audience and the issuer.
  audience: process.env.AUTH0_AUDIENCE,
  issuer: `https://${process.env.AUTH0_DOMAIN}/`,
  algorithms: ['RS256'],
});

The goal of this script is to export an Express middleware that guarantees that requests have an access_token issued by a trustworthy party, in this case Auth0. Note that this script expects to find two environment variables:

  • AUTH0_AUDIENCE: the identifier of our API (https://contacts.blog-samples.com/)
  • AUTH0_DOMAIN: our domain at Auth0 (in this case blog-samples.auth0.com)

We will set these variables soon, but it is important to understand that the domain variable defines how the middleware finds the signing keys.

After creating this middleware, we can update our index.js file to import and use it:

// ... other require statements ...
const auth0 = require('./auth0');

// ... app definition and contacts array ...

// redefining both endpoints
// auth0 is already a middleware function, so we pass it as-is without calling it
app.get('/contacts', auth0, (req, res) => res.send(contacts));
app.post('/contacts', auth0, (req, res) => {
  contacts.push(req.body);
  res.send();
});

// ... app.listen ...

In this case, we have replaced the previous definition of our endpoints to use the new middleware, which requires requests to be sent with valid access tokens.

Running the application now is slightly different, as we need to set the environment variables:

export AUTH0_DOMAIN=blog-samples.auth0.com
export AUTH0_AUDIENCE="https://contacts.blog-samples.com/"
node index

After running the API, we can test it to see if it is properly secured. So, let's open a terminal and issue the following command:

curl localhost:3000/contacts

If we set up everything correctly, we will get a response from the server saying that "no authorization token was found".

Now, to be able to interact with our endpoints again, we will have to obtain an access token from Auth0. There are multiple ways to do this and the strategy that we will use depends on the type of the client application we are developing. For example, if we are developing a Single Page Application (SPA), we will use what is called the Implicit Grant. If we are developing a mobile application, we will use the Authorization Code Grant Flow with PKCE. There are other flows available at Auth0. However, for a simple test like this one, we can use our Auth0 dashboard to get one.

Therefore, we can head back to the APIs section in our Auth0 dashboard, click on the API we created before, and then click on the Test section of this API. There, we will find a button called Copy Token. Let's click on this button to copy an access token to our clipboard.

After copying this token, we can open a terminal and issue the following commands:

# create a variable with our token
ACCESS_TOKEN=<OUR_ACCESS_TOKEN>

# use this variable to fetch contacts
curl -H 'Authorization: Bearer '$ACCESS_TOKEN http://localhost:3000/contacts/

Note: We will have to replace <OUR_ACCESS_TOKEN> with the token we copied from our dashboard.

As we are now using our access token on the requests we are sending to our API, we will manage to get the list of contacts again.

That's how we secure our Node.js backend API. Easy, right?

Conclusion

This is just the beginning. Believe it or not, there's still more we could do here, but we're going to stop for now.

I am confident that, at the beginning of the article, you wouldn't have thought there'd be so much refactoring to do for a function that's not doing that much. Imagine how much can be done on bigger systems.

My hope is that this little example will give you some tools that you can use to drive a conversation in your organization about moving in the same direction.

So go, and break your functions apart. Then put them back together.

You can download the final refactored version of the initial project from the refactored branch of the GitHub repository.