Not much. Except the purpose, of course, but fundamentally they are both dynamic code generators. That means that if you run an automatic unpacker on a just-in-time compiler along with a source program, the output of the automatic unpacker should be the entry point of the generated native code in memory. Let’s check that: we use PIN to log memory writes and instruction pointers, and then a Python script to find the intersection (an address that is first written to and later executed is, by definition, dynamically generated code).
For the JIT we’ll use LLVM and a simple fibonacci program (intentionally naive):
#include <stdio.h>
#include <stdlib.h>

int fibonacci(int curr)
{
    if (curr < 2)
        return curr;
    else
        return fibonacci(curr - 1) + fibonacci(curr - 2);
}

int main(int argc, char** argv)
{
    printf("fibonacci(%d) = %d\n", 20, fibonacci(20));
}
Let’s compile this program:
$ llvm-gcc -O3 -emit-llvm fib.c -c -o fib.bc
And run it, with JIT (by default) and without (forced interpretation):
$ time lli fib.bc
fibonacci(20) = 6765

real    0m0.059s
user    0m0.044s
sys     0m0.004s

$ time lli -force-interpreter fib.bc
fibonacci(20) = 6765

real    0m0.192s
user    0m0.180s
sys     0m0.000s
The time difference between interpretation and JIT compilation becomes a lot more spectacular around fibonacci(40), but we don’t want to end up with huge traces. Now let’s write the pintool that logs the memory writes and the instruction pointers (it is fairly simple, as it is based on two examples in PIN’s manual):
#include <stdio.h>
#include "pin.H"

// trace files for the instruction pointers and the memory writes
FILE *itrace;
FILE *pinatrace;

// This function is called before every instruction is executed
// and prints the IP
VOID printip(VOID *ip)
{
    fprintf(itrace, "%p\n", ip);
}

// Print a memory write record
VOID RecordMemWrite(VOID *ip, VOID *addr)
{
    fprintf(pinatrace, "%p: W %p\n", ip, addr);
}

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v)
{
    // instruments stores using a predicated call, i.e.
    // the call happens iff the store will be actually executed
    if (INS_IsMemoryWrite(ins)) {
        INS_InsertPredicatedCall(
            ins, IPOINT_BEFORE, (AFUNPTR)RecordMemWrite,
            IARG_INST_PTR, IARG_MEMORYWRITE_EA,
            IARG_END);
    } else {
        // Insert a call to printip before every instruction, and pass it the IP
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip,
                       IARG_INST_PTR, IARG_END);
    }
}

// the boilerplate below follows the itrace and pinatrace
// examples from PIN's manual
VOID Fini(INT32 code, VOID *v)
{
    fclose(itrace);
    fclose(pinatrace);
}

int main(int argc, char *argv[])
{
    itrace = fopen("itrace.out", "w");
    pinatrace = fopen("pinatrace.out", "w");
    PIN_Init(argc, argv);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram(); // never returns
    return 0;
}
Now let’s trace LLVM with this pintool:
$ time pin -t mixtrace.so -- lli fib.bc
fibonacci(20) = 6765

real    1m4.304s
user    0m41.931s
sys     0m3.816s
We’ll use the following Python script to find the intersection between the written addresses and the instruction pointers:
import sys

def main():
    args = sys.argv[1:]
    pinatrace = args[0]
    itrace = args[1]
    # the parsing functions are omitted, but they return sets
    # (unordered collections of unique elements)
    writes = parse(pinatrace)
    eips = iparse(itrace)
    inter = writes & eips
    for hit in inter:
        print "dynamic code at 0x%X" % hit
    if len(inter) == 0:
        print "no hits found, the binary does not contain dynamic code"

if __name__ == "__main__":
    main()
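For the record, here is a minimal sketch of what the omitted parsing functions could look like. It assumes the trace formats produced by the pintool above: one instruction pointer per line ("%p") in itrace.out, and one "%p: W %p" record per line in pinatrace.out; on the pinatrace side we keep the written addresses, i.e. the part after the W.

def parse(path):
    # collect the set of written addresses from an "ip: W addr" trace
    print "parsing %s" % path
    writes = set()
    count = 0
    for line in open(path):
        parts = line.split()
        if len(parts) == 3 and parts[1] == "W":
            writes.add(int(parts[2], 16))
            count += 1
    print "done, parsed %d memory writes." % count
    return writes

def iparse(path):
    # collect the set of executed instruction pointers, one per line
    print "parsing %s" % path
    eips = set()
    count = 0
    for line in open(path):
        line = line.strip()
        if line:
            eips.add(int(line, 16))
            count += 1
    print "done, parsed %d instruction pointers" % count
    return eips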
And finally, the result:
$ tracesurfer.py pinatrace.out itrace.out
parsing pinatrace.out
done, parsed 116965 memory writes.
parsing itrace.out
done, parsed 193618 instruction pointers
dynamic code at 0x7F07B3555202
...
dynamic code at 0x7F07B35551F9
And for the sake of completeness, let’s see what we obtain without the JIT:
$ time pin -t mixtrace.so -- lli -force-interpreter fib.bc
fibonacci(20) = 6765

real    4m10.077s
user    1m42.538s
sys     0m18.533s

# in case you wonder, the traces are *big*
$ ls -l *.out
-rw-r--r-- 1 reynaudd reynaudd 3174401813 2009-02-16 18:07 itrace.out
-rw-r--r-- 1 reynaudd reynaudd 4055787304 2009-02-16 18:07 pinatrace.out

$ tracesurfer.py pinatrace.out itrace.out
parsing pinatrace.out
done, parsed 27582 memory writes.
parsing itrace.out
done, parsed 73332 instruction pointers
no hits found, the binary does not contain dynamic code
And the difference is even smaller for packers that first translate the program into a (randomly generated) intermediate instruction set and then JIT it at runtime (AFAIK Themida does something like this).
Mhhh, if they JIT it, that kind of defeats the purpose of translating to the intermediate representation. In my opinion, reversing the interpreted code would be far more difficult (and it would prevent automatic unpacking).
The JIT-ed code probably depends on a lot of “library” functions. Also, their JIT probably doesn’t do optimizations, so it won’t really regenerate the original machine code in any case.
But this is just speculation; I never actually took a look at it.
Ok, so that would be a sophisticated way to perform obfuscation. Keep me informed if you do the actual analysis; there are certainly interesting things to dig into.