When Node.js was first released, it was a revolution. It allowed developers to run JavaScript, the primary programming language of the browser, on the server-side. Over time it grew in popularity to become a go-to tool for building Web applications and APIs.
Node.js consists of a small and stable core runtime and a set of built-in modules providing basic building blocks such as access to the filesystem, TCP/IP networking, HTTP protocol, cryptographic algorithms, parsing command line parameters, and many others. The built-in modules are robust, well tested, and performant.
Unfortunately, they do not cover all the needs of Web application developers. Sometimes we need to use programs outside of Node.js to implement the functionality we are working on.
Extending Node.js Modules With CLIs
One of the built-in Node.js modules is child_process. It allows JavaScript code to run other programs and communicate with them using standard input/output (I/O) mechanisms provided by the underlying operating system. Such programs can often be started interactively by users through the command-line interface (CLI for short). This is why we sometimes refer to such programs as CLIs.
Modern operating systems like Linux have many utility programs that implement functionality not included in the set of modules built into Node.js. Using the child_process module to start an external program to implement specific functionality is a simple and effective way of extending the Node.js standard library.
Retrieving git history
Let's take a look at how we might implement a simple Node.js program that prints the history of changes of a file from git. We will use the Express framework to handle HTTP requests and responses for us. Node.js has no built-in support for running git commands, so we will have to run the git command ourselves to retrieve the data we need. To do that, we need to use the exec function from the child_process module:
    const express = require('express');
    const exec = require('child_process').exec;

    const app = express();

    app.get('/history', (req, res) => {
      // Read file name from the URL query string.
      const file = req.query.file;

      // Prepare command to run
      const command = `git log --oneline ${file}`;

      // Execute the external program and send the response back
      exec(command, (err, output) => {
        // Respond with HTTP 500 if there was an error
        if (err) {
          res.status(500).send(err);
          return;
        }

        // If the program ran successfully, send the output in
        // the HTTP response
        res.send(output);
      });
    });

    // Listen on the port used by the requests in this article
    app.listen(3000);
We can now send an HTTP request to the program to read the file history using the curl program, another popular CLI:
    $ curl http://localhost:3000/history?file=app.js
The program will print out an abbreviated history of changes:
    f38482b Call git log command
    5444afb Initial commit
Looks like our short program is running fine. We needed just a few lines of code to implement a pretty complex operation.
Invoking Commands
Our program is simple and flexible. We can pass a path to any file on the server running our program, and if this file is under version control using git, we will get its history. Can we always trust the users to provide harmless input?
Untrusted data in parameters
Our applications, especially the Internet-facing ones, are exposed to users with a variety of motivations and different levels of technical skills. Some users may try to abuse the functionality of our application for their own purposes. Some will do it for fun and learning, some for fame, and some will do it for money.
It is a recommended programming practice to treat all input coming from users as untrusted unless proven otherwise. In Web applications, attackers may use all available components of the HTTP request to try to force our application to do something it was not designed to do. Possible attack vectors are the HTTP request body, the headers (including cookies), and query string parameters.
The effects of a successful abuse of such a modified input parameter may range from disclosing sensitive information, through denial of service, all the way up to complete server takeover. If you suspect that our application may allow a user with malicious intent to abuse the file parameter, you are absolutely right.
Invoking malicious commands
Our program prepares the full command to be executed using a JavaScript template literal:
    const command = `git log --oneline ${file}`;
If the provided file parameter value is app.js, the executed command is:
    $ git log --oneline app.js
The attacker might provide a value that changes the structure of the executed command. Let's see what happens if the value of the file parameter is app.js; ls:
    $ git log --oneline app.js; ls
Our program executes the git log command and then executes the ls command. The output sent to the user in the HTTP response is the combined output of both commands:
    f38482b Call git log command
    5444afb Initial commit
    app.js
    bin
    node_modules
    package-lock.json
    package.json
    public
    routes
    views
This type of security vulnerability is called command injection. Can we cause even more damage using such an attack technique? Let's try to obtain the source code of our application:
$ curl http://localhost:3000/history\?file\=app.js\;cat%20app.js
The result contains not only the history of the app.js file but also its entire content as printed to the standard output by the cat app.js command.
We can use the same approach to exfiltrate the content of any file on the server that the application has access to, including configuration files. Attackers can also read the values of environment variables via the env command.
This will be very useful for many types of malicious actors, but there is more they can do.
A Realistic Attack
Technically sophisticated attackers seek to gain control not only over the application but over the entire server. This allows them to gain persistent access to the compromised machine.
Minimalist shell
Many server operating systems have a utility program called nc (or netcat). If the attacker has the ability to run this program using a command injection vulnerability, they can execute arbitrary commands on the compromised server. The simplest way to do it is to force the vulnerable application to run the following command:
    $ nc -l 6667 | /bin/bash
This command starts to listen for incoming connections on port 6667 (chosen by the attacker) and passes all the incoming data directly to the bash shell for execution. Let's see if this is possible to do with our vulnerable git history program:
    $ curl 'http://localhost:3000/history?file=app.js;nc%20-l%206667%20%7C/bin/bash'
Now the attacker can send arbitrary commands to the compromised server:
    $ echo 'killall node' | nc localhost 6667
In this particular case, the attacker also used the nc program to send the killall node command to the bash shell they set up. The effect is a denial of service attack that terminates all the Node.js processes on the server.
Privilege escalation and lateral movement
Having the ability to run arbitrary commands on the server is a really attractive target for the attacker. In a typical attack scenario, compromising a server in this way is just the first step attackers take. The next step is to install malicious software that persists on the attacked machine and allows attackers to communicate with it over a longer period of time, ideally even after the compromised Node.js application has been restarted.
Ideally, our Node.js applications run with a minimal set of privileges. Even so, command injection attacks allow attackers to perform reconnaissance of the infrastructure, steal administrative credentials, or look for other vulnerabilities and misconfigurations that let them escalate their privileges and spread further through the compromised network.
Having access to one compromised server allows the attackers to move to other hosts on the network, a process known as lateral movement. This allows the attacker to compromise more hosts on the network, gather more credentials, and look for other interesting targets and data to exfiltrate from our infrastructure.
If undetected, such attacks can go on for weeks or even months, leading to serious data breaches. How do we harden our Node.js applications against command injection vulnerabilities to prevent malicious actors from exploiting them?
Preventing Command Injection
There are several techniques that can prevent or at least greatly minimize the chance of command injection vulnerabilities sneaking into our code.
Do not execute arbitrary commands
The exec function from the child_process module that we used passes the command it receives in the first parameter directly to the shell to execute. This is very flexible but leads to the vulnerabilities we have just seen.
A better option is to use the execFile function from the same module, which starts a specific program and takes an array of arguments:
    const execFile = require('child_process').execFile;

    app.get('/history', (req, res) => {
      // Read file name from the URL query string.
      const file = req.query.file;

      // Prepare command to run
      const command = `/usr/local/bin/git`;
      const args = ["log", "--oneline", file];

      // Execute the external program and send the response back
      execFile(command, args, (err, output) => {
        // Respond with HTTP 500 if there was an error
        if (err) {
          res.status(500).send(err);
          return;
        }

        // If the program ran successfully, send the output in
        // the HTTP response
        res.send(output);
      });
    });
This way, our application is resilient to command injection attacks:
$ curl http://localhost:3000/history\?file\=app.js\;ls
This request raises the expected error on the server side:
{"killed":false,"code":128,"signal":null,"cmd":"/usr/local/bin/git log --oneline app.js;ls"}
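To see why the injected command no longer runs, it can help to try execFile in isolation. The following sketch is an illustration rather than part of the application: it assumes an echo executable is available on the PATH (as on most Linux and macOS systems) and uses a hypothetical helper, runWithoutShell, built on the synchronous execFileSync variant.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: starts the echo program directly, without a shell,
// and passes the value as a single argument.
function runWithoutShell(value) {
  return execFileSync('echo', [value], { encoding: 'utf8' });
}

// No shell is involved, so the semicolon has no special meaning:
// echo receives the whole string as one argument and prints it as-is.
console.log(JSON.stringify(runWithoutShell('hello; echo injected')));
```

Here the value is printed verbatim instead of being split into two commands. Note that this protection relies on the shell option of execFile keeping its default value of false. The value still reaches the target program as an argument, so it remains worth validating. For example, a value starting with a hyphen could be read by git as an option.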
What to do in cases where we need the flexibility afforded by the exec function? Are those cases a completely lost cause?
Input validation or output sanitization?
Ideally both!
The root cause of command injection attacks is that when we manually assemble the command to be executed, some special characters (metacharacters) might change the structure of the command, and what we think of as data is going to be executed by the shell. The most commonly used metacharacters are:
& ; ` ' \ " | * ? ~ < > ^ ( ) [ ] { } $ \n \r
The best way to ensure such characters are not abused by attackers is to perform strict validation of untrusted input data. Input validation can verify things such as origin, size, or lexical structure of the data. Make sure you narrow down the set of acceptable values as much as possible.
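As an illustration of strict validation, here is a minimal sketch for the file parameter of our git history example. The policy is an assumption made for this sketch, not a rule from git or Node.js: a relative path whose segments contain only letters, digits, dots, underscores, and hyphens, and each segment must start with a letter, digit, or underscore. The names SAFE_PATH, isSafeFileName, and buildGitLogArgs are hypothetical.

```javascript
// Assumed allowlist: segments of letters, digits, dots, underscores and
// hyphens separated by "/". Requiring each segment to start with a letter,
// digit or underscore also rejects "..", hidden files, absolute paths and
// values that begin with "-".
const SAFE_PATH = /^[A-Za-z0-9_][A-Za-z0-9._-]*(\/[A-Za-z0-9_][A-Za-z0-9._-]*)*$/;

// Returns true only for strings of reasonable length that match the allowlist.
// The type check matters because a query parameter can also arrive as an
// array or an object.
function isSafeFileName(value) {
  return typeof value === 'string' &&
    value.length <= 255 &&
    SAFE_PATH.test(value);
}

// Builds the argument array for execFile. The "--" argument tells git that
// everything after it is a path and not an option or a revision.
function buildGitLogArgs(file) {
  if (!isSafeFileName(file)) {
    throw new Error('Invalid file name');
  }
  return ['log', '--oneline', '--', file];
}

console.log(isSafeFileName('app.js'));          // true
console.log(isSafeFileName('routes/index.js')); // true
console.log(isSafeFileName('app.js;ls'));       // false
console.log(isSafeFileName('--output=x'));      // false
console.log(isSafeFileName('../secret'));       // false
```

In the request handler, a value that fails validation would typically be rejected with an HTTP 400 response before any external program is started. The allowlist is deliberately narrow, and an application that needs other file names would have to widen it with care.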
In cases where input data may contain metacharacters, preventing command injection attacks requires proper escaping of those characters before passing them to the shell for execution. Doing so yourself is extremely difficult, and it is best to use a trusted and well-tested library. One such library is shell-quote, which can be installed using the npm package manager and used to format the entire command:
    const quote = require('shell-quote').quote;

    // ...

    // Prepare command to run
    const command = quote(["git", "log", "--oneline", file]);

    // Execute the external program and send the response back
    exec(command, (err, output) => {
It looks like a simple fix, but shell metacharacter encoding is a difficult problem, and latent bugs can be found even in established libraries.
Summary
If your Node.js application is extending the functionality provided by the built-in set of modules by calling external programs, you are at risk of introducing command injection vulnerabilities. Such vulnerabilities allow attackers to abuse our applications and move laterally throughout our infrastructure.
If this sounds scary, you are right! This is a serious risk.
The good news is that we have several defensive coding techniques to help us build libraries and applications that are immune to command injection attacks. Using execFile helps prevent arbitrary shell commands from being executed and is the recommended defense. Input validation and proper encoding of shell metacharacters are also valuable protective measures.
About Auth0
Auth0 by Okta takes a modern approach to customer identity and enables organizations to provide secure access to any application, for any user. Auth0 is a highly customizable platform that is as simple as development teams want, and as flexible as they need. Safeguarding billions of login transactions each month, Auth0 delivers convenience, privacy, and security so customers can focus on innovation. For more information, visit https://auth0.com.
About the author
Marcin Hoppe
Senior Manager, Product Security