How old is Dynamic Binary Instrumentation?

So I was recently reading TAOCP (note for later: find a way not to be smug about it), and stumbled on this gem: "Trace Routines. When a machine is being simulated on itself (…), we have the special case of a simulator called a trace or monitor routine. Such programs are occasionally used to help in debugging, since they print out a step-by-step account of how the simulated program behaves."

This is about as exciting as it gets for somebody versed in tracing and dynamic binary instrumentation, especially since this volume of TAOCP was published in 1968. The algorithm that follows looks exactly like what you would find in today's dynamic binary rewriters: saving and restoring application registers, then decoding and executing each instruction while retaining control on jumps. There is also a mention of what we now call transparency:

[the program being traced] must not store anything into the locations used by the trace program

The bit that really finished me off was exercise 6, "design a trace routine capable of tracing itself" (I happen to be very sensitive when somebody mentions self-applying things). I spent a few years working on this; how come it never came up before? I looked again at the references in the few standard papers about DBI, namely:

  • Luk et al., Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation, 2005
  • Bruening, Efficient, Transparent, and Comprehensive Runtime Code Manipulation, 2004
  • Bala et al., Dynamo: a transparent dynamic optimization system, 2000
  • Nethercote and Seward, Valgrind: a framework for heavyweight dynamic binary instrumentation, 2007
  • Hunt and Brubacher, Detours: Binary Interception of Win32 Functions, 1999

There is no mention of Knuth, and with one exception, the oldest references date from the 90s. The exception comes from the Detours paper, with an incredible reference to Gill, The Diagnosis of Mistakes in Programmes on the EDSAC, 1951:

The second technique will be referred to as the 'step-by-step' technique. In this, the control unit of the machine never obeys any of the orders of the original programme directly. The machine remains under the control of the checking routine, which is so constructed that the orders of the original programme are examined one by one, and carried out in exactly the same manner and sequence as if they were being obeyed directly. If this were all that the checking routine accomplished, it would be merely a means of carrying out machine operations in slow motion - slowed down, in fact, by a factor of the order of 10. The reason for adopting this type of operation is that it is now open to the author of the checking routine to insert additional orders into it, causing it to print useful information as it proceeds. This information may be chosen so as to assist in the investigation of either order or numerical failures.
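The scheme Knuth and Gill both describe is simple enough to sketch in a few lines. The toy machine below (its registers, instruction names, and encoding are all my own inventions for illustration, not anything from TAOCP or the EDSAC) shows the essential property: the trace routine never hands control to the traced program. Jumps are carried out by updating the routine's own program counter, and a step-by-step account is recorded along the way.

```python
def trace(program, regs=None):
    """Execute `program` one instruction at a time, logging each step.

    The trace routine retains control at all times: a jump is never
    taken directly by the host; instead this routine updates its own
    program counter, as in Knuth's trace routine and Gill's checking
    routine. Instruction set is hypothetical.
    """
    regs = dict(regs or {"A": 0, "B": 0})
    log = []
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        # The "step-by-step account" of the simulated program.
        log.append(f"{pc:03d}: {op} {args} regs={dict(regs)}")
        if op == "LOAD":            # LOAD reg, immediate
            regs[args[0]] = args[1]
            pc += 1
        elif op == "ADD":           # ADD dst, src (register to register)
            regs[args[0]] += regs[args[1]]
            pc += 1
        elif op == "JNZ":           # jump if register non-zero:
            pc = args[1] if regs[args[0]] != 0 else pc + 1
        elif op == "HALT":
            break
    return regs, log

# A small loop: sum A down to zero into B (B ends up 3 + 2 + 1 = 6).
prog = [
    ("LOAD", "A", 3),
    ("LOAD", "B", 0),
    ("LOAD", "C", -1),
    ("ADD", "B", "A"),    # index 3: B += A
    ("ADD", "A", "C"),    # A -= 1
    ("JNZ", "A", 3),      # loop back while A != 0
    ("HALT",),
]
regs, log = trace(prog)
```

Note that the `JNZ` branch is where transparency and control meet: the traced program's jump targets are honoured exactly, but only as data steering the tracer's own loop.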

So to answer the original question in this post: software-based tracing seems to be just as old as software itself.

