Google has recently announced Native Client (NaCl), a new project to allow web applications to be programmed in x86. Yep, seriously.
Doing so securely is quite a challenge. x86 machine code has proved rather malware-friendly: it is very permissive, and the mixing of code and data seriously complicates static analysis. Since Google's objective is to speed up web apps, they are not going to use emulation to ensure that the code performs no unauthorised calls; instead, they chose to enforce some constraints on the code with a static verification algorithm. It makes sense, but doing so correctly is tricky.
The Inner Sandbox
The key component is called the inner sandbox, where code can’t execute sensitive operations such as system calls. To ensure statically that the code will stay in the inner sandbox, the following constraints are checked:
- C1. Once loaded into the memory, the binary is not writable, enforced by OS-level protection mechanisms during execution.
- C2. The binary is statically linked at a start address of zero, with the first byte of text at 64K.
- C3. All indirect control transfers use a nacljmp pseudo-instruction.
- C4. The binary is padded up to the nearest page with at least one hlt instruction (0xf4).
- C5. The binary contains no instructions or pseudo-instructions overlapping a 32-byte boundary.
- C6. All valid instruction addresses are reachable by a fall-through disassembly that starts at the load (base) address.
- C7. All direct control transfers target valid instructions.
C1 means no self-modifying code, and relies on the segment permissions (as you can see in the screenshot, the .text section is not writable).
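As an aside, here is a minimal sketch, assuming a POSIX system (this is my own illustration, not NaCl's actual loader), of the kind of OS-level protection C1 relies on: the code region ends up readable and executable, but not writable.

/* Sketch: map a code region, then drop write permission (C1). */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 4096;
    /* Stage the untrusted code in a writable, non-executable mapping... */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    memset(p, 0xf4, len);              /* fill with hlt, as in C4 */
    /* ...then drop write permission for good before anything runs. */
    if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }
    printf("code mapped r-x at %p\n", (void *)p);
    return 0;
}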
To sum things up, the purpose of these constraints is to ensure the following two properties (a sketch of a verifier enforcing them follows the list):
- everything in the code segment can be reliably disassembled, so that illegal instructions can be detected statically;
- if you jump somewhere, it must be to something that has already been disassembled.
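Here is a minimal sketch of such a one-pass verifier covering C5 and C6 (my own reconstruction, not Google's validator; insn_length() is a placeholder for a real x86 decoder, so this stub only accepts the two one-byte opcodes mentioned in this post).

/* One-pass fall-through verification implied by C5-C7 (sketch). */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BUNDLE 32  /* the 32-byte alignment unit of constraint C5 */

/* Placeholder decoder: a real validator embeds a full x86 decoder. */
static size_t insn_length(const uint8_t *code, size_t off, size_t len)
{
    if (off >= len)
        return 0;
    switch (code[off]) {
    case 0xf4: return 1;   /* hlt, the padding instruction of C4 */
    case 0x90: return 1;   /* nop */
    default:   return 0;   /* unknown here = illegal instruction */
    }
}

static bool validate(const uint8_t *code, size_t len)
{
    bool start[len];                /* start[i]: an instruction begins at i */
    memset(start, 0, sizeof start);
    size_t off = 0;
    while (off < len) {             /* C6: fall-through from the base address */
        size_t n = insn_length(code, off, len);
        if (n == 0)
            return false;           /* undecodable: reject the whole binary */
        if (off / BUNDLE != (off + n - 1) / BUNDLE)
            return false;           /* C5: crosses a 32-byte boundary */
        start[off] = true;
        off += n;                   /* the next instruction begins right after */
    }
    /* C7 would additionally check that every direct jump or call
     * targets an offset t with start[t] == true; the stub decoder
     * above knows no branch instructions, so there is nothing to do. */
    return true;
}

int main(void)
{
    uint8_t text[64];
    memset(text, 0xf4, sizeof text);   /* hlt padding, as in C4 */
    printf("valid: %d\n", validate(text, sizeof text));
    return 0;
}

The key point is that the pass is linear: there is exactly one decoding of the text section, so the validator and the CPU cannot disagree on where instructions start.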
These properties, if they are enforced correctly, ensure that you can't execute arbitrary code. Now the problem is that enforcing them correctly is quite hard. For example, a typical static analysis problem is: does a function return to its caller or somewhere else? It is hard to ensure that the return address has not been modified dynamically on the stack. To sidestep this problem, NaCl does not allow the return instruction, but rather uses…
The nacljmp pseudo-instruction
It consists of a mask and a jump:
and %eax, 0xffffffe0
jmp *(%eax)
It puzzled me for some time. The purpose of this construct is to allow jumping to a location unknown at verification time, while ensuring that the target is a 0 mod 32 address. Constraint C5 guarantees that every 0 mod 32 address is a valid target (i.e. it is not in the middle of an instruction), and has therefore already been disassembled. So we don't know statically where we're jumping, but we know there's no illegal instruction there.
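A quick demonstration of the arithmetic (my own example, not from the paper): masking with 0xffffffe0 clears the low 5 bits, which rounds any address down to a multiple of 32.

/* Why the nacljmp mask guarantees a 0 mod 32 target. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t targets[] = { 0x10007, 0x10020, 0x1003f };
    for (int i = 0; i < 3; i++) {
        uint32_t masked = targets[i] & 0xffffffe0u;  /* the "and" of nacljmp */
        printf("0x%08x -> 0x%08x (mod 32 = %u)\n",
               (unsigned)targets[i], (unsigned)masked, (unsigned)(masked % 32));
    }
    return 0;
}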
This indirect jumping stuff sounds juicy enough, and it didn't take long before somebody found an exploit for it.
“The validator was checking for calls through registers, as in this two-byte call:

83 e2 e0    and $0xffffffe0,%edx
ff d2       call %edx

but erroneously permitting similar calls through memory using this addressing mode:

83 e2 e0    and $0xffffffe0,%edx
ff 12       call *(%edx)”
Ironically enough, the example given in the research paper used memory addressing, while the paper also states: “we disallow memory addressing modes on indirect jmp and call instructions”.
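For reference, the difference between the two encodings sits entirely in the ModRM byte. Here is a hedged sketch (not the actual NaCl validator code) of the check this bug boils down to: for opcode 0xff /2 (call), the "mod" field distinguishes the register form (mod == 3) from the memory forms (mod != 3), which must be rejected.

/* Distinguishing call %edx (ff d2) from call *(%edx) (ff 12). */
#include <stdint.h>
#include <stdio.h>

static int is_register_indirect_call(uint8_t opcode, uint8_t modrm)
{
    uint8_t mod = modrm >> 6;         /* top two bits: addressing mode */
    uint8_t reg = (modrm >> 3) & 7;   /* opcode extension: /2 means call */
    return opcode == 0xff && reg == 2 && mod == 3;
}

int main(void)
{
    printf("ff d2 allowed? %d\n", is_register_indirect_call(0xff, 0xd2)); /* 1 */
    printf("ff 12 allowed? %d\n", is_register_indirect_call(0xff, 0x12)); /* 0 */
    return 0;
}

The register form is safe only because the preceding and pins the value of %edx; with the memory form, the masked register merely points at an attacker-controlled address, so the masking guarantees nothing about the actual call target.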
Google's position on this project is worth noting: they are releasing a research paper and the source code of a very early, experimental project to share with the security and research communities, and the developers can also be reached on the discussion group. They encourage people to break the sandbox and credit the author of the first exploit. Ain't it cool?