How old is Dynamic Binary Instrumentation?

So I was recently reading TAOCP (note for later: find a way to not be smug about it), and stumbled on this gem: Trace Routines. When a machine is being simulated on itself (…), we have the special case of a simulator called a trace or monitor routine. Such programs are occasionally used  to help in debugging, since they print out a step-by-step account of how the simulated program behaves.

This is about as exciting as it gets for somebody versed in tracing and dynamic binary instrumentation, especially since this volume of TAOCP was published in 1968. The algorithm that follows looks exactly like what you would find in today’s dynamic binary rewriters (saving and restoring application registers, decoding and executing the instruction while retaining control on jumps). There is also a mention regarding what we now call transparency:

[the program being traced] must not store anything into the locations used by the trace program

The bit that achieved to kill me was exercise 6, “design a trace routine capable of tracing itself” (I happen to be very sensitive when somebody mentions self-applying things). I spent a few years working on this, how come it never came up before? I looked again at references in the few standard papers about DBI, namely:

  • Luk et al., Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation, 2005
  • Bruening, Efficient, Transparent, and Comprehensive Runtime Code Manipulation, 2004
  • Bala et al., Dynamo: a transparent dynamic optimization system, 2000
  • Nethercote and Seward, Valgrind: a framework for heavyweight dynamic binary instrumentation, 2007
  • Hunt and Brubacher, Detours: Binary Interception of Win32 Functions, 1999

There is no mention of Knuth, and with one exception, the oldest references date from the 90s. The exception comes from the Detours paper, with an incredible reference to Gill, The Diagnosis of Mistakes in Programmes on the EDSAC, 1951:

The second technique will be referred to as the ‘step-by-step’ technique.  In this, the  control unit  of  the  machine never  obeys  any  of  the  orders of  the  original programme directly.  The  machine remains under  the  control  of  the  checking routine, which is  so  constructed that  the  orders of  the  original programme are examined one by one, and carried out in exactly the same manner and sequence as if  they  were being  obeyed  directly.  If  this  were all  that  the  checking routine accomplished, it  would be merely a means of  carrying out machine operations in slow motion-slowed  down, in fact, by a factor of the order of  10. The reason for adopting this type of operation is that it is now open to the author of the checking routine to insert additional orders into it,  causing it  to print useful information as it proceeds. This information may be chosen so as to  assist in the investigation of either order or numerical failures.

So to answer the original question in this post: software-based tracing seems to be just as old as software itself.


Do you really have to throw up to be a researcher?

Through me you pass into the city of woe:
Through me you pass into eternal pain:
Through me among the people lost for aye.

— Dante Alighieri

I recently listened to a talk about somebody’s experience as a computer science researcher. The argument was that research is ultra-competitive, because you fight against people that are smarter than you and work harder than you (think nights and weekends) for very few available positions. Therefore, you have to consider yourself as a professional athlete, i.e. choosing a lifestyle that serves your only purpose in life: succeeding. Also, professional athletes throw up when they train really hard but then they train even harder. So much for elegance.

Needless to say, this vision of life is disgusting from every point of view. Ok, in academic research the hours are long and the pay is low, but what about the joys one could find in science? There was nothing about the beauty of a proof, the elegance of a piece of code or the fun of technology. Apparently there is nothing more to life than getting a better position, and to me this is about as un-inspirational as it gets.

Obfuscation Patterns in Academic Papers

These techniques seem to work, since I stumble quite frequently upon them in papers accepted at important venues:

  • misleading claim — come up with your own definition of a widely accepted term, preferably in a way inconsistent with the intuition and the general understanding of the term. Then claim spectacular results that happen to be true only with respect to your definition.
  • marketing pitch — use data irrelevant to the scientific discussion. Bonus: cite a dubious, hard-to-verify source with a non-existent methodology. Ex: “Problem X costs more than Y gazillion $$$ per year (hence our work is of epic importance)”.
  • over-sophistication —  use overly sophisticated terms for techniques or ideas that would otherwise be simple to understand (especially if it makes them look better). Ex: “heuristics” [1] sounds a lot better than “hack”.
  • no limitations — there are no limitations to your work, only things that are either out-of-scope or left for future work.
  • outsourcing problems — outsource large portions of the hard problems in your area by “leveraging” (see over-sophistication) existing work from a different domain with a complex literature, just to make sure that nobody can grasp all the dependencies of your paper.
  • biased evaluation — perform experimental evaluation designed to make others look bad. Even better if the evaluation is not reproducible, because you fail to disclose your source code, detailed algorithm or input set.

[1] heuristics (noun — hyu’-ris-tik): algorithm that does not work.

The Halting Problem for Reverse Engineers

I often have the feeling that technically savvy people don’t have a very high opinion of academia, and this is particularly true of security people. They have to deal with low-level details such as hardware architecture, operating systems internals (more or less documented), proprietary protocols and data structures, all of which require very specialized knowledge. Logic, theorems and algorithms don’t have a predominant place in that picture.

For the past few years I have been working on these subjects, and found some fundamental theorems to be actually *useful* in understanding the security properties of computer architectures. One of them is, of course, the undecidability of the halting problem. In essence, it says that we can not know if the computation of a program on some input will ever terminate. So what? Who cares if a program terminates, what we want is to find vulnerabilities and unpack malware samples, right?

The importance of the undecidability of the halting problem lies in its generality. In particular we can see Rice’s theorem as a generalization of this result, and to put it very simply, it says that whatever properties of programs you’re interested in, no program can tell if this property holds for every program (i.e. it is undecidable).

This is very bad news for all of us, since basically everything about programs is undecidable. Say that you are interested in finding functions that do out-of-bounds memory writes (or as I said, any other property), Rice’s theorem says that there is no program that will give you a correct answer all the time. You must accept that your program sometimes fails or infinitely loops.

I want to emphasize how bad this is. Do not let the terminology used in the theorems confuse you. In particular, the notions of input, output, and function computed by a program do not map nicely to binaries. An output is anything that gets modified by your program — any register or memory location, as soon as it is touched by an instruction, is an output. And basically, everything about outputs is undecidable. As a consequence, simple tasks such as disassembling are undecidable.

For instance, take this seemingly innocent indirect jump:

jmp [eax]

If eax is an output of instructions before it, no luck, its value is undecidable. You can run the program, write the value down, and assume it will not change, but you have no guarantee that it will not change at a given date. Undecidable. You could argue that eax can only take a finite number of values, and hence disassembling is still possible, just very intractable. But that would be without counting on self-modifying code. SMC (think packers) is the scourge of disassemblers because it gives the ability to transfer control to an output. Since I can’t decide the value of the output, I can’t disassemble.

To sum things up, here are a few direct consequences of the undecidability of the halting problem:

  1. you can’t decide the target of indirect jumps, reads and writes
  2. you can not decide if a particular memory address is code, data, or both
  3. you can’t decide values written in memory
  4. you can’t decide the numbers of occurrences of loops
  5. you can’t decide if control flow can reach a given instruction
  6. whatever you see in a given run can change arbitrarily in another run
  7. disassembling is undecidable
  8. unpacking is undecidable

I will leave how all this led to the bad habit of relying on “heuristics” to a further post. Stay classy!

Headin’ Out Californee Way

Well, I have not been very active on this blog lately. Since my last post I got married, defended my PhD (in front of a superlatively awesome committee, really) and left the snowy plains of Lorraine for California, among other things. I started my new job as a research staff member in Dawn Song‘s group in Berkeley, so I will probably keep complaining about malware, vulnerabilities, the price of housing and pretty much everything else on this blog.

Also, I might actually start training to become a cage fighter and learn emacs — not sure which one is less painful.

On the unsolvability of program behaviour

This is spectacularly well stated:

Due to the unsolvability of the halting problem (and nearly any other question concerning program behaviour), no analysis that always terminates can be exact. Therefore we only have three alternatives:

  • Consider systems with a finite number of finite behaviours (e.g. programs without loops) […]
  • Ask interactively for help in case of doubt […]
  • Accept approximate but correct information.

Neil D. Jones and Flemming Nielson, Abstract interpretation: a semantics-based tool for program analysis [PS]