So I was recently reading TAOCP (note for later: find a way to not be smug about it), and stumbled on this gem:
1.4.3.2. Trace Routines. When a machine is being simulated on itself (…), we have the special case of a simulator called a trace or monitor routine. Such programs are occasionally used to help in debugging, since they print out a step-by-step account of how the simulated program behaves.
This is about as exciting as it gets for somebody versed in tracing and dynamic binary instrumentation, especially since this volume of TAOCP was published in 1968. The algorithm that follows looks exactly like what you would find in today’s dynamic binary rewriters (saving and restoring application registers, decoding and executing the instruction while retaining control on jumps). There is also a mention regarding what we now call transparency:
[the program being traced] must not store anything into the locations used by the trace program
The bit that achieved to kill me was exercise 6, “design a trace routine capable of tracing itself” (I happen to be very sensitive when somebody mentions self-applying things). I spent a few years working on this, how come it never came up before? I looked again at references in the few standard papers about DBI, namely:
- Luk et al., Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation, 2005
- Bruening, Efficient, Transparent, and Comprehensive Runtime Code Manipulation, 2004
- Bala et al., Dynamo: a transparent dynamic optimization system, 2000
- Nethercote and Seward, Valgrind: a framework for heavyweight dynamic binary instrumentation, 2007
- Hunt and Brubacher, Detours: Binary Interception of Win32 Functions, 1999
There is no mention of Knuth, and with one exception, the oldest references date from the 90s. The exception comes from the Detours paper, with an incredible reference to Gill, The Diagnosis of Mistakes in Programmes on the EDSAC, 1951:
The second technique will be referred to as the ‘step-by-step’ technique. In this, the control unit of the machine never obeys any of the orders of the original programme directly. The machine remains under the control of the checking routine, which is so constructed that the orders of the original programme are examined one by one, and carried out in exactly the same manner and sequence as if they were being obeyed directly. If this were all that the checking routine accomplished, it would be merely a means of carrying out machine operations in slow motion-slowed down, in fact, by a factor of the order of 10. The reason for adopting this type of operation is that it is now open to the author of the checking routine to insert additional orders into it, causing it to print useful information as it proceeds. This information may be chosen so as to assist in the investigation of either order or numerical failures.
So to answer the original question in this post: software-based tracing seems to be just as old as software itself.