JPC is a full PC emulator (à la Bochs but in Java), including the BIOS, VGA BIOS, floppy drive and other hardware components. Of particular interest to me is the way x86 instructions are modeled and executed. JPC stays at the binary level (no fancy disassembly) and compiles x86 instructions to a simpler microcode language made of “atomic” instructions. This microcode is then straightforward to execute, although the language is a bit more complex than similar micro-languages (such as VEX, REIL or BIL).
The core of the x86 semantics is contained in the x86 to microcode compiler, found in org.jpc.emulator.memory.codeblock.optimised.ProtectedModeUDecoder.decodeOpcode(). This method takes a binary x86 instruction and decodes its prefixes, opcode, modrm, sib, displacement and immediate parameters. It then delegates the translation into microcode to this sequence of methods:
    writeInputOperands(prefices, opcode, modrm, sib, displacement, immediate);
    writeOperation(prefices, opcode, modrm);
    writeOutputOperands(prefices, opcode, modrm, sib, displacement);
    writeFlags(prefices, opcode, modrm);
For instance, if we take the binary instruction 04 42 (add al, 0x42), it is decoded with opcode = 0x04 and immediate = 0x42. Then based on these values, the instruction is translated to the following microcode sequence:
    // writeInputOperands:
    LOAD0_AL
    LOAD1_IB 0x42
    // writeOperation:
    ADD
    // writeOutputOperands:
    STORE0_AL
    // writeFlags:
    ADD_O8_FLAGS
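To make the four-phase layout concrete, here is a minimal sketch of a decoder that handles only opcode 0x04. It is not JPC's code: the class name MicrocodeSketch, the string tokens and the simplified method signatures are all assumptions made for illustration. Only the phase names and the microcode names come from the listing above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: this is NOT JPC's ProtectedModeUDecoder.
// It mimics the four-phase layout for a single opcode, 0x04 (add al, imm8).
class MicrocodeSketch {

    // Decode a two-byte "04 ib" instruction into a list of microcode tokens.
    static List<String> decode(int[] bytes) {
        int opcode = bytes[0] & 0xff;
        if (opcode != 0x04) {
            throw new IllegalArgumentException("only opcode 0x04 is handled in this sketch");
        }
        int immediate = bytes[1] & 0xff;

        List<String> out = new ArrayList<>();
        writeInputOperands(out, immediate);
        writeOperation(out);
        writeOutputOperands(out);
        writeFlags(out);
        return out;
    }

    // Phase 1: load both operands into the working registers.
    static void writeInputOperands(List<String> out, int immediate) {
        out.add("LOAD0_AL");
        out.add("LOAD1_IB");
        out.add(String.format("0x%02x", immediate)); // the immediate is stored inline
    }

    // Phase 2: the arithmetic operation itself.
    static void writeOperation(List<String> out) {
        out.add("ADD");
    }

    // Phase 3: write the result back to its destination.
    static void writeOutputOperands(List<String> out) {
        out.add("STORE0_AL");
    }

    // Phase 4: update the status flags for an 8-bit addition.
    static void writeFlags(List<String> out) {
        out.add("ADD_O8_FLAGS");
    }

    public static void main(String[] args) {
        System.out.println(decode(new int[] {0x04, 0x42}));
    }
}
```

Keeping the four phases separate means each one can be driven by a different part of the decoded instruction: the operand phases by modrm, sib and the immediate, the operation and flags phases by the opcode alone.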
Now, understanding the semantics of an x86 instruction reduces to understanding the semantics of the microcode language. For this, we need the microcode interpreter, which is org.jpc.emulator.memory.codeblock.optimised.ProtectedModeUBlock.execute(). The language is relatively simple to execute, with 5 general-purpose registers but roughly 750 opcodes. The execution of the above microcodes translates to this Java sequence:
    // LOAD0_AL
    reg0 = cpu.eax & 0xff;
    // LOAD1_IB 0x42
    reg1 = 0x42 & 0xff;
    // ADD (reg2 keeps the first operand for the flag computation)
    reg2 = reg0;
    reg0 = reg2 + reg1;
    // STORE0_AL
    cpu.eax = (cpu.eax & ~0xff) | (reg0 & 0xff);
    // ADD_O8_FLAGS (the result is reg0, the operands are reg2 and reg1)
    cpu.setZeroFlag((byte)reg0);
    cpu.setParityFlag(reg0);
    cpu.setSignFlag((byte)reg0);
    cpu.setCarryFlag(reg0, Processor.CY_TWIDDLE_FF);
    cpu.setAuxiliaryCarryFlag(reg2, reg1, reg0, Processor.AC_XOR);
    cpu.setOverflowFlag(reg0, reg2, reg1, Processor.OF_ADD_BYTE);
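The same sequence can be tried out in a tiny standalone interpreter. This is only a sketch under stated assumptions: the class MicroInterpreterSketch, the integer opcode values and the boolean flag fields are hypothetical, and the flags are computed eagerly with the textbook formulas for an 8-bit addition rather than through JPC's Processor helper methods.

```java
// Illustrative sketch only: NOT JPC's ProtectedModeUBlock.
// It interprets just the five microcodes used by "add al, imm8".
class MicroInterpreterSketch {
    // Hypothetical opcode numbering, chosen for this example.
    static final int LOAD0_AL = 0, LOAD1_IB = 1, ADD = 2, STORE0_AL = 3, ADD_O8_FLAGS = 4;

    int eax;
    boolean zero, sign, carry, overflow, parity, auxCarry;

    // Build the microcode for "add al, imm"; the immediate follows LOAD1_IB inline.
    static int[] addAlImm8(int imm) {
        return new int[] {LOAD0_AL, LOAD1_IB, imm, ADD, STORE0_AL, ADD_O8_FLAGS};
    }

    void execute(int[] code) {
        int reg0 = 0, reg1 = 0, reg2 = 0;
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case LOAD0_AL:
                    reg0 = eax & 0xff;
                    break;
                case LOAD1_IB:
                    reg1 = code[++pc] & 0xff; // consume the inline immediate
                    break;
                case ADD:
                    reg2 = reg0;              // keep the first operand for the flags
                    reg0 = reg2 + reg1;       // 9-bit result, bit 8 is the carry
                    break;
                case STORE0_AL:
                    eax = (eax & ~0xff) | (reg0 & 0xff);
                    break;
                case ADD_O8_FLAGS:
                    zero = (reg0 & 0xff) == 0;
                    sign = (reg0 & 0x80) != 0;
                    carry = (reg0 & 0x100) != 0;
                    // PF is set when the low byte has an even number of 1 bits.
                    parity = (Integer.bitCount(reg0 & 0xff) & 1) == 0;
                    // AF is the carry out of bit 3.
                    auxCarry = ((reg2 ^ reg1 ^ reg0) & 0x10) != 0;
                    // OF is set when both operands differ in sign from the result.
                    overflow = ((reg2 ^ reg0) & (reg1 ^ reg0) & 0x80) != 0;
                    break;
                default:
                    throw new IllegalStateException("unknown microcode " + code[pc]);
            }
        }
    }

    public static void main(String[] args) {
        MicroInterpreterSketch cpu = new MicroInterpreterSketch();
        cpu.eax = 0x12345678;
        cpu.execute(addAlImm8(0x42));
        System.out.println(Integer.toHexString(cpu.eax));
    }
}
```

With AL = 0x78, adding 0x42 gives 0xBA: two positive bytes produce a negative byte, so the overflow and sign flags are set while the carry flag stays clear.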
Provided the JPC API implements a large share of the x86 semantics (which seems to be the case: it is good to see old DOOM running on the JPC website), this solution is far better than using Vine to extract microinstructions:
– Multi-platform solution (+++)
– No need to use TEMU or any TEMU-compatible trace format (permits trace compression)
– The extracted semantics seem to be concise, clean and easy to compile into a data-flow graph (DFG)
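As a rough illustration of the last point, the microcode for add al, 0x42 can be turned into def-use edges with a single pass that remembers the last writer of each location. This is a sketch, not part of JPC: the class DfgSketch, the Op type and the read/write sets attached to each microcode are assumptions, inferred by hand from the Java sequence shown earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: builds data-flow edges from a microcode sequence.
class DfgSketch {

    // One micro-op with the locations it reads and writes (hand-written assumptions).
    static final class Op {
        final String name;
        final List<String> reads;
        final List<String> writes;

        Op(String name, List<String> reads, List<String> writes) {
            this.name = name;
            this.reads = reads;
            this.writes = writes;
        }
    }

    // Microcode for "add al, imm8" annotated with read/write sets.
    static List<Op> addAlImm8() {
        return Arrays.asList(
            new Op("LOAD0_AL", Arrays.asList("al"), Arrays.asList("reg0")),
            new Op("LOAD1_IB", Collections.<String>emptyList(), Arrays.asList("reg1")),
            new Op("ADD", Arrays.asList("reg0", "reg1"), Arrays.asList("reg0", "reg2")),
            new Op("STORE0_AL", Arrays.asList("reg0"), Arrays.asList("al")),
            new Op("ADD_O8_FLAGS", Arrays.asList("reg0", "reg1", "reg2"), Arrays.asList("flags")));
    }

    // For every read, emit an edge from the most recent op that wrote that location.
    static List<String> edges(List<Op> ops) {
        Map<String, String> lastWriter = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (Op op : ops) {
            for (String location : op.reads) {
                String def = lastWriter.get(location);
                if (def != null) { // locations defined before the block have no producer here
                    result.add(def + " -> " + op.name + " [" + location + "]");
                }
            }
            for (String location : op.writes) {
                lastWriter.put(location, op.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String edge : edges(addAlImm8())) {
            System.out.println(edge);
        }
    }
}
```

Because each microcode touches only a handful of named locations, the read/write sets stay small, which is what makes this kind of graph construction cheap.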