When Node.js was first released, it was a revolution. It allowed developers to run JavaScript, the primary programming language of the browser, on the server-side. Over time it grew in popularity to become a go-to tool for building Web applications and APIs.

Node.js consists of a small and stable core runtime and a set of built-in modules providing basic building blocks such as access to the filesystem, TCP/IP networking, HTTP protocol, cryptographic algorithms, parsing command line parameters, and many others. The built-in modules are robust, well tested, and performant.

Unfortunately, they do not cover all the needs of Web application developers. Sometimes we need to use programs outside of Node.js to implement the functionality we are working on.

Extending Node.js Modules With CLIs

One of the built-in Node.js modules is child_process. It allows JavaScript code to run other programs and communicate with them using standard input/output (I/O) mechanisms provided by the underlying operating system. Such programs can often be started interactively by the users using the command-line interface (CLI for short). This is why we sometimes refer to such programs as CLIs.

Modern operating systems like Linux have many utility programs that implement functionality not included in the set of modules built into Node.js. Using the child_process module to start an external program to implement specific functionality is a simple and effective way of extending the Node.js standard library.
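As a minimal sketch of this mechanism, the following example starts a child process, writes text to its standard input, and reads the result from its standard output. The helper name runWithInput and the small upper-casing script are made up for this illustration. The child program is Node.js itself (process.execPath), so the example does not assume any particular operating system utility, and it uses the synchronous spawnSync variant only to keep the sketch short.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: run a program, send `input` to its standard input,
// and return whatever it wrote to its standard output.
function runWithInput(program, args, input) {
  const result = spawnSync(program, args, { input, encoding: 'utf8' });

  // `error` is set when the program could not be started at all.
  if (result.error) {
    throw result.error;
  }
  // A non-zero exit status signals that the program reported a failure.
  if (result.status !== 0) {
    throw new Error(`${program} exited with status ${result.status}: ${result.stderr}`);
  }
  return result.stdout;
}

// Illustrative child program: read all of stdin, print it in upper case.
const upperCaseScript = `
  let data = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => { data += chunk; });
  process.stdin.on('end', () => { process.stdout.write(data.toUpperCase()); });
`;

console.log(runWithInput(process.execPath, ['-e', upperCaseScript], 'hello from the parent'));
```

In a server, the asynchronous functions from the same module (such as exec and execFile, used later in this article) are usually a better fit, because they do not block the event loop while the external program runs.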

Retrieving git history

Let's take a look at how we might implement a simple Node.js program that prints the history of changes of a file from git. We will use the Express framework to handle HTTP requests and responses for us. Node.js has no built-in support for running git commands, so we will have to run the git command ourselves to retrieve the data we need. To do that, we need to use the exec function from the child_process module:

const express = require('express');
const exec = require('child_process').exec;

const app = express();

app.get('/history', (req, res) => {
  // Read file name from the URL query string.
  const file = req.query.file;

  // Prepare command to run
  const command = `git log --oneline ${file}`;

  // Execute the external program and send the response back
  exec(command, (err, output) => {
    // Respond with HTTP 500 if there was an error
    if (err) {
      res.status(500).send(err);
      return;
    }

    // If the program ran successfully, send the output in
    // the HTTP response
    res.send(output);
  });
});

// Listen on the port used in the examples that follow
app.listen(3000);

We can now send an HTTP request to the program to read the file history using the curl program, another popular CLI:

$ curl http://localhost:3000/history?file=app.js

The program will print out an abbreviated history of changes:

f38482b Call git log command
5444afb Initial commit

Looks like our short program is running fine. We needed just a few lines of code to implement a pretty complex operation.

Invoking Commands

Our program is simple and flexible. We can pass a path to any file on the server running our program, and if this file is under version control using git, we will get its history. Can we always trust the users to provide harmless input?

Untrusted data in parameters

Our applications, especially the Internet-facing ones, are exposed to users with a variety of motivations and different levels of technical skills. Some users may try to abuse the functionality of our application for their own purposes. Some will do it for fun and learning, some for fame, and some will do it for money.

It is a recommended programming practice to treat all input coming from users as untrusted unless proven otherwise. In Web applications, attackers may use all available components of the HTTP request to try to force our application to do something it was not designed to do. Possible attack vectors are the HTTP request body, the headers (including cookies), and query string parameters.

The effects of successfully abusing such a modified input parameter may range from disclosing sensitive information, through denial of service, all the way up to complete server takeover. If you suspect that our application may allow a user with malicious intent to abuse the file parameter, you are absolutely right.

Invoking malicious commands

Our program prepares the full command to be executed using a JavaScript template literal:

const command = `git log --oneline ${file}`;

If the provided file parameter value is app.js, the executed command is:

$ git log --oneline app.js

The attacker might provide a value that changes the structure of the executed command. Let's see what happens if the value of the file parameter is app.js;ls:

$ git log --oneline app.js;ls

Our program executes the git log command and then executes the ls command. The output sent to the user in the HTTP response is the combined output of both commands:

f38482b Call git log command
5444afb Initial commit
app.js
bin
node_modules
package-lock.json
package.json
public
routes
views

This type of security vulnerability is called command injection. Can we cause even more damage using such an attack technique? Let's try to obtain the source code of our application:

$ curl http://localhost:3000/history\?file\=app.js\;cat%20app.js

The result contains not only the history of the app.js file but also its entire content as printed to the standard output by the cat app.js command.

We can use the same approach to exfiltrate the content of any file on the server that the application has access to, including configuration files. Attackers can also read the values of environment variables via the env command.

This will be very useful for many types of malicious actors, but there is more they can do.

A Realistic Attack

Technically sophisticated attackers seek to gain control not only over the application but over the entire server. This allows them to gain persistent access to the compromised machine.

Minimalist shell

Many server operating systems have a utility program called nc (or netcat). If the attacker has the ability to run this program using a command injection vulnerability, they can execute arbitrary commands on the compromised server. The simplest way to do it is to force the vulnerable application to run the following command:

$ nc -l 6667 | /bin/bash

This command listens for incoming connections on port 6667 (chosen by the attacker) and passes all incoming data directly to the bash shell for execution. Let's see if this is possible with our vulnerable git history program:

$ curl http://localhost:3000/history\?file\=app.js\;nc%20-l%206667%20\|/bin/bash

Now the attacker can send arbitrary commands to the compromised server:

$ echo 'killall node' | nc localhost 6667

In this particular case, the attacker also used the nc program, this time to send the killall node command to the bash shell set up by the attacker. The effect is a denial of service attack that terminates all the Node.js processes on the server.

Privilege escalation and lateral movement

Having the ability to run arbitrary commands on the server is a really attractive target for the attacker. In a typical attack scenario, compromising a server in this way is just the first step attackers take. The next step is to install malicious software that persists on the attacked machine and allows attackers to communicate with it over a longer period of time, ideally even after the compromised Node.js application has been restarted.

Ideally, our Node.js applications run with a minimal set of privileges. Even so, command injection attacks let attackers perform reconnaissance of the infrastructure, steal administrative credentials, and look for other vulnerabilities and misconfigurations that allow them to escalate their privileges and spread further through the compromised network.

Having access to one compromised server allows the attackers to move to other hosts on the network, a process known as lateral movement. This allows the attacker to compromise more hosts on the network, gather more credentials, and look for other interesting targets and data to exfiltrate from our infrastructure.

If undetected, such attacks can go on for weeks or even months, leading to serious data breaches. How do we harden our Node.js applications against command injection vulnerabilities to prevent malicious actors from exploiting them?

Preventing Command Injection

There are several techniques that can prevent, or at least greatly reduce, the chance of command injection vulnerabilities sneaking into our code.

Do not execute arbitrary commands

The exec function from the child_process module that we used passes the command it receives in the first parameter directly to the shell for execution. This is very flexible but leads to the vulnerabilities we have just seen.

A better option is to use the execFile function from the same module that starts a specific program and takes an array of arguments:

const execFile = require('child_process').execFile;

app.get('/history', (req, res) => {
  // Read file name from the URL query string.
  const file = req.query.file;

  // Prepare command to run
  const command = `/usr/local/bin/git`;
  const args = ["log", "--oneline", file];

  // Execute the external program and send the response back
  execFile(command, args, (err, output) => {
    // Respond with HTTP 500 if there was an error
    if (err) {
      res.status(500).send(err);
      return;
    }

    // If the program ran successfully, send the output in
    // the HTTP response
    res.send(output);
  });
});

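This way, our application is resilient to the command injection attack we used earlier:

$ curl http://localhost:3000/history\?file\=app.js\;ls

The whole value app.js;ls is now passed to git as a single argument instead of being interpreted by a shell, so git exits with an error and the ls command never runs: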

{"killed":false,"code":128,"signal":null,"cmd":"/usr/local/bin/git log --oneline app.js;ls"}
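The difference between the two functions can be seen with a harmless command. The sketch below is only an illustration: it assumes a POSIX system where sh and echo are available, uses the synchronous variants (execSync and execFileSync) to keep the code short, and the untrusted value is a made-up stand-in for data taken from an HTTP request.

```javascript
const { execSync, execFileSync } = require('child_process');

// Stand-in for a value taken from an HTTP request.
const untrusted = 'hello; echo injected';

// exec-style: the whole string is handed to the shell, which treats `;`
// as a command separator and runs two commands.
const viaShell = execSync(`echo ${untrusted}`, { encoding: 'utf8' });

// execFile-style: no shell is involved, so the value reaches the program
// as one argument, metacharacters included.
const viaExecFile = execFileSync('echo', [untrusted], { encoding: 'utf8' });

console.log(JSON.stringify(viaShell));    // "hello\ninjected\n"
console.log(JSON.stringify(viaExecFile)); // "hello; echo injected\n"
```

Note that execFile only removes the shell from the picture. The program being started still interprets its own arguments, so a value that begins with a dash could still be read as an option.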

What can we do in cases where we need the flexibility afforded by the exec function? Are those cases a completely lost cause?

Input validation or output sanitization?

Ideally both!

The root cause of command injection attacks is that when we manually assemble the command to be executed, some special characters (metacharacters) might change the structure of the command, and what we think of as data is going to be executed by the shell. The most commonly used metacharacters are:

& ; ` ' \ " | * ? ~ < > ^ ( ) [ ] { } $ \n \r

The best way to ensure such characters are not abused by attackers is to perform strict validation of untrusted input data. Input validation can verify things such as origin, size, or lexical structure of the data. Make sure you narrow down the set of acceptable values as much as possible.
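As a hedged illustration of narrowing down acceptable values, the sketch below validates the file parameter from our example before it is used. The allowlist pattern, the length limit, and the function names isValidFileParam and buildGitLogArgs are assumptions made for this example, not rules imposed by git or Express; a real application should adapt them to the file names it actually needs to accept. The sketch also places the conventional -- marker before the path, so git treats the value as a path even if it resembles an option.

```javascript
// Assumed limits for this example; adjust them to your own needs.
const MAX_LENGTH = 255;
// Letters, digits, underscore, dot, dash, and slash only. The first
// character must be a letter, digit, or underscore, which rules out
// values starting with "-" (options), "/" (absolute paths), or ".".
const SAFE_FILE_NAME = /^[A-Za-z0-9_][A-Za-z0-9._\/-]*$/;

function isValidFileParam(value) {
  // Query parsers may produce arrays or objects for repeated or
  // bracketed parameters, so check the type first.
  if (typeof value !== 'string') return false;
  if (value.length === 0 || value.length > MAX_LENGTH) return false;
  if (!SAFE_FILE_NAME.test(value)) return false;
  // Reject attempts to climb out of the repository directory.
  if (value.split('/').includes('..')) return false;
  return true;
}

// Build the argument array for execFile. Everything after "--" is
// treated by git as a path, never as an option or a revision.
function buildGitLogArgs(file) {
  if (!isValidFileParam(file)) {
    throw new Error('Invalid file parameter');
  }
  return ['log', '--oneline', '--', file];
}

console.log(buildGitLogArgs('routes/index.js'));
console.log(isValidFileParam('app.js;ls')); // false
```

In the request handler, a failed validation would typically result in an HTTP 400 response, so the invalid value never reaches execFile at all.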

In cases where input data may contain metacharacters, preventing command injection attacks requires proper escaping of those characters before passing them to the shell for execution. Doing so yourself is extremely difficult, and it is best to use a trusted and well-tested library. One such library is shell-quote, which can be installed using the npm package manager and used to format the entire command:

const quote = require('shell-quote').quote;

// ...

// Prepare command to run
const command = quote(["git", "log", "--oneline", file]);

// Execute the external program and send the response back
exec(command, (err, output) => {
  // ... handle the error or send the output, as before
});

It looks like a simple fix, but shell metacharacter encoding is a difficult problem, and latent bugs can be found even in established libraries.
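To make the idea of escaping concrete, here is a deliberately minimal sketch of the classic single-quote technique for POSIX shells. The function name quoteForPosixShell is made up for this illustration, and the sketch is not a substitute for a maintained library: it assumes a POSIX-compatible shell such as sh or bash and does not deal with other shells or with NUL bytes.

```javascript
// Inside single quotes a POSIX shell treats every character literally,
// except the single quote itself. So for each single quote in the value:
// close the quoted string, emit an escaped quote (\'), and reopen it.
function quoteForPosixShell(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

console.log(quoteForPosixShell('app.js;ls'));   // 'app.js;ls'
console.log(quoteForPosixShell("it's a file")); // 'it'\''s a file'
```

With this quoting, the value app.js;ls reaches the program as a single argument, because the semicolon sits inside single quotes and loses its special meaning.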

Summary

If your Node.js application is extending the functionality provided by the built-in set of modules by calling external programs, you are at risk of introducing command injection vulnerabilities. Such vulnerabilities allow attackers to abuse our applications and move laterally throughout our infrastructure.

If this sounds scary, you are right! This is a serious risk.

The good news is that we have several defensive coding techniques to help us build libraries and applications that are immune to command injection attacks. Using execFile helps prevent arbitrary shell commands from being executed and is the recommended defense. Input validation and proper encoding of shell metacharacters are also valuable protective measures.
