A First Look at Google Native Client’s Inner Sandbox

Google has recently announced a new project that lets web applications run native x86 machine code. Yep, seriously.

Doing so securely is quite a challenge. x86 machine code has proved rather malware-friendly because it is very permissive, and its mixing of code and data seriously complicates static analysis. Since Google’s objective is to speed up web apps, they’re not going to use emulation to ensure that the code does not perform unauthorised calls. Instead, they chose to enforce some constraints on the code with a static verification algorithm. It makes sense, but doing so correctly is tricky.

The Inner Sandbox

The key component is called the inner sandbox, where code can’t execute sensitive operations such as system calls. To ensure statically that the code will stay in the inner sandbox, the following constraints are checked:

  • C1. Once loaded into the memory, the binary is not writable,
    enforced by OS-level protection mechanisms during execution.
  • C2. The binary is statically linked at a start address of zero,
    with the first byte of text at 64K.
  • C3. All indirect control transfers use a nacljmp pseudoinstruction.
  • C4. The binary is padded up to the nearest page with at least
    one hlt instruction (0xf4).
  • C5. The binary contains no instructions or pseudo-instructions
    overlapping a 32-byte boundary.
  • C6. All valid instruction addresses are reachable by a fallthrough
    disassembly that starts at the load (base) address.
  • C7. All direct control transfers target valid instructions.

C1 means no self-modifying code, and relies on the segment permissions (as you can see in the screenshot, the .text section is not writable).
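To make C5 and C7 a bit more concrete, here is a minimal Python sketch of the two checks. It is not NaCl's validator: the function name `check_layout` and its input format (a list of `(offset, length)` pairs produced by some fallthrough disassembly, plus a list of direct jump targets) are assumptions made for illustration only.

```python
BUNDLE = 32  # size of the aligned blocks that constraint C5 talks about


def check_layout(instructions, direct_targets):
    """Return a list of violations of C5 and C7.

    instructions   -- (offset, length) pairs from a fallthrough disassembly
                      (hypothetical input format, not NaCl's own)
    direct_targets -- offsets targeted by direct jumps and calls
    """
    errors = []
    valid_starts = set()
    for offset, length in instructions:
        valid_starts.add(offset)
        # C5: the first and last byte must sit in the same 32-byte block.
        if offset // BUNDLE != (offset + length - 1) // BUNDLE:
            errors.append(f"C5: instruction at {offset:#x} crosses a 32-byte boundary")
    for target in direct_targets:
        # C7: a direct transfer may only land on an instruction start.
        if target not in valid_starts:
            errors.append(f"C7: direct target {target:#x} is not an instruction start")
    return errors
```

For instance, a 5-byte instruction at offset 30 would be rejected by the C5 check, because its last byte falls in the next 32-byte block.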

To sum things up, the purpose of these constraints is to ensure the following properties:

  • everything in the code segment can be reliably disassembled (so that illegal instructions can be detected statically)
  • if you jump somewhere, it must be in something that has already been disassembled

These properties, if they are enforced correctly, ensure that you can’t execute arbitrary code. The problem is that enforcing them correctly is quite hard. For example, a typical static analysis question is: does a function return to its caller or somewhere else? The return address lives on the stack, and it is hard to prove statically that it has not been modified at run time. To sidestep this problem, NaCl does not allow the return instruction, but rather uses…

The nacljmp pseudo-instruction

It consists of a mask and a jump:

and %eax, 0xffffffe0
jmp *(%eax)

It puzzled me for some time. The purpose of this construct is to allow jumping to a location that is unknown statically, while ensuring that the target address is a multiple of 32 (0 mod 32). Constraint C5 ensures that every 0 mod 32 address is a valid target (i.e. it is not in the middle of an instruction), and therefore it has already been disassembled. So we don’t know statically where we’re jumping, but we know there’s no illegal instruction there.
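The effect of the mask is easy to check by hand. The short Python sketch below just reproduces the arithmetic of the `and` with 0xffffffe0; the helper name `nacl_mask` and the sample addresses are made up for the illustration.

```python
MASK = 0xFFFFFFE0  # clears the five low bits of a 32-bit address


def nacl_mask(address):
    """Round an address down to the previous multiple of 32."""
    return address & MASK


# Whatever value ends up in the register, the masked target is 32-byte aligned.
for addr in (0x10000, 0x10001, 0x1001F, 0x10020, 0x1003F):
    print(hex(addr), "->", hex(nacl_mask(addr)))
```

Clearing the five low bits is the same as rounding down to a multiple of 32, which is why the validator only needs to know that those aligned addresses are instruction starts.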

This indirect jumping stuff sounds juicy enough, and it didn’t take long before somebody found an exploit for it.

“The validator was checking for calls through
registers, as in this two-byte call:

 83 e2 e0                and    $0xffffffe0,%edx
 ff d2                   call   %edx

but erroneously permitting similar calls through memory using this
addressing mode:

 83 e2 e0                and    $0xffffffe0,%edx
 ff 12                   call   *(%edx)”

Ironically enough, the example given in the research paper used memory addressing, even though the paper also stated “we disallow memory addressing modes on indirect jmp and call instructions”.
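The two byte sequences in the quote differ only in the ModRM byte of the call: 0xd2 has its two top bits set to 11 (register operand), while 0x12 has them set to 00 (memory operand at the address held in the register). Here is a rough Python sketch of a check that tells them apart. It only handles the exact 5-byte pattern quoted above, the function names are hypothetical, and it is in no way the actual NaCl validator code.

```python
def split_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def is_safe_indirect_transfer(code):
    """Accept only `and $0xffffffe0, %reg` followed by `call/jmp %reg`.

    `code` is the 5-byte sequence to check (illustrative pattern only).
    """
    if len(code) != 5:
        return False
    and_op, and_modrm, imm8, xfer_op, xfer_modrm = code
    a_mod, a_reg, a_rm = split_modrm(and_modrm)
    x_mod, x_reg, x_rm = split_modrm(xfer_modrm)
    # 83 /4 ib is AND r/m32, imm8; the imm8 0xe0 sign-extends to 0xffffffe0.
    mask_ok = and_op == 0x83 and a_mod == 3 and a_reg == 4 and imm8 == 0xE0
    # ff /2 is an indirect call, ff /4 an indirect jmp.
    xfer_ok = xfer_op == 0xFF and x_reg in (2, 4)
    # mod == 3 means a register operand; anything else goes through memory.
    register_direct = x_mod == 3
    return mask_ok and xfer_ok and register_direct and a_rm == x_rm


print(is_safe_indirect_transfer(bytes.fromhex("83e2e0ffd2")))  # call %edx
print(is_safe_indirect_transfer(bytes.fromhex("83e2e0ff12")))  # call *(%edx)
```

With the memory form, the mask aligns the pointer rather than the address that is finally jumped to, so the value loaded from memory can be anything.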

Google’s stance on this project is worth noting: they are releasing a research paper and the source code of a very early, experimental project to share with the security and research communities, and the developers can also be reached on the discussion group. They encourage people to break the sandbox and credit the author of the first exploit. Ain’t it cool?
