Creating a toy virtual machine with PyPy
Here, you can use “virtual machine” as in “Java Virtual Machine”, not as in virtualization. We will play with the virtual machine described in the paper Tracing the Meta-Level: PyPy’s Tracing JIT Compiler by C.F. Bolz, A. Cuni, M. Fijalkowski and A. Rigo (it’s a great read by the way).
PyPy is a fascinating project, too complex to describe here. Among other things, PyPy can take any interpreter written in a subset of Python, translate it to C, and automatically generate a JIT compiler for this language. Does it sound too good to be true? Let’s try this.
- grab PyPy source code
- create the interpreter in pypy/translator/goal/target-toy.py with the following code:
</pre>
<pre style="text-align: justify;">import os, sys
import autopath
import py
# these are the opcodes for the interpreted language
JUMP_IF_A = 1
MOV_A_R = 2
MOV_R_A = 3
ADD_R_TO_A = 4
DECR_A = 5
RETURN_A = 6
<em>from pypy.rlib.jit import JitDriver tlrjitdriver = JitDriver(greens = ['pc', 'bytecode'], reds = ['a', 'regs'])</em>
# the main interpreter loop
def interpret(bytecode, a):
regs = [0] * 256
pc = 0
while True:
<em> tlrjitdriver.jit_merge_point(bytecode=bytecode, pc=pc, a=a, regs=regs) </em> opcode = bytecode[pc]
pc += 1
if opcode == JUMP_IF_A:
target = bytecode[pc]
pc += 1
if a:
<em> if target<pc: tlrjitdriver.can_enter_jit(bytecode=bytecode, pc=target, a=a, regs=regs) </em> pc = target
elif opcode == MOV_A_R:
n = bytecode[pc]
pc += 1
regs[n] = a
elif opcode == MOV_R_A:
n = bytecode[pc]
pc += 1
a = regs[n]
elif opcode == ADD_R_TO_A:
n = bytecode[pc]
pc += 1
a += regs[n]
elif opcode == DECR_A:
a -= 1
elif opcode == RETURN_A:
return a
# __________ Entry point __________
def entry_point(argv):
# the program we want to interpret
# it computes the square of its argument
bytecode = [
MOV_A_R, 0, # i = a
MOV_A_R, 1, # copy of ’a’
# 4:
MOV_R_A, 0, # i--
DECR_A,
MOV_A_R, 0,
MOV_R_A, 2, # res += a
ADD_R_TO_A, 1,
MOV_A_R, 2,
MOV_R_A, 0, # if i!=0: goto 4
JUMP_IF_A, 4,
MOV_R_A, 2,
RETURN_A
]
result = interpret(bytecode, int(argv[1]))
print result
return 0
def jitpolicy(driver):
from pypy.jit.metainterp.policy import JitPolicy
return JitPolicy()
# _____ Define and setup target ___
def target(*args):
return entry_point, None
# main function, if this script is called from the command line
if __name__ == '__main__':
entry_point(sys.argv)</pre>
<div><span style="color: #000000;">
- the lines in italic are the annotations for the JIT compiler. We need to give PyPy some insight on the interpreted language by declaring what is the instruction pointer (the green variables), the beginning of the dispatch loop and the backward branches (see the paper for full details).
- check that you can execute this script correctly by running
python target-toy.py 12, the output should be 144 - PyPy can translate this script in C. For this, first install the dependencies and then run the following command:
python translate.py target-toy.py - this should give you an executable target-toy-c, rename it target-toy-native and check that
./target-toy-native 12yields 144 - now we can ask PyPy to translate target-toy.py in C and generate a JIT compiler for it. For this, we just run
python translate.py --opt=jit target-toy.py- note: the 64-bit backend of PyPy is not implemented yet, so if you are on a 64-bit system, you will have to struggle a bit. You will have to use a 32-bit Python interpreter (see my former post), create an alias for gcc -m32 (let’s call it gcc32) and then pass the option –cc=gcc32 to translate.py.
- this should give you another target-toy-c executable, rename it to target-toy-jit and check that
./target-toy-jit 12 yields 144
Ok, everything is working, so let’s now see how all this performs by computing large squares:
~/pypy-trunk/pypy/translator/goal$ time python target-toy.py 1000000
1000000000000
real 0m18.637s
~/pypy-trunk/pypy/translator/goal$ time ./target-toy-native 1000000
-727379968
real 0m0.024s
~/pypy-trunk/pypy/translator/goal$ time ./target-toy-jit 1000000
-727379968
[...]
real 0m0.005s
The first run is the square program interpreted by our program, itself interpreted by the Python interpreter. Double interpretation is slow.
The second run is the square program interpreted by a native version of our interpret function. Interpretation by native code is ok.
The third run is the square program interpreted and JIT’ed on the fly. It’s super awesome :)
Final note: I must thank everybody from #pypy on freenode, for their help and resilience to stupid questions. Thanks guys!
Interesting, short but nice example
Ange
February 8, 2010 at 23:03
thanks :)
dan
February 9, 2010 at 08:52
Small question, it appears that the native pypy runs are returning a different value at the end? Have I missed something?
Alex Gaynor
February 8, 2010 at 23:25
That’s right, because the C and C-jit versions are 32-bit binaries and the translation process probably does not take into account the overflow.
dan
February 9, 2010 at 08:50
Hey.
RPython has well documented overflow semantics – you can get overflow detection, but not automatic promotion to long. Here is the description http://codespeak.net/pypy/dist/pypy/doc/coding-guide.html#integer-types
fijal
February 9, 2010 at 15:02
It seems a little unfair to do the comparison in that way, surely arithmetic with more digits is more computationally expensive.
Alex Gaynor
February 9, 2010 at 15:26
Will you be doing more articles about using PyPy?
Donny Viszneki
February 9, 2010 at 06:50
Probably. Right now I’m just playing with it, but I’d like to dive deeper.
dan
February 9, 2010 at 08:52
It’s great to get the wrong answer quickly :>
maht
February 9, 2010 at 14:04
PyPy FAILED :
http://permalink.gmane.org/gmane.comp.lang.lua.general/62915
adesnos
February 10, 2010 at 16:27
is it really necessary to sound like you’re commenting on youtube?
dan
February 10, 2010 at 16:53
follow-up:
http://comments.gmane.org/gmane.comp.lang.lua.general/62926
dan
February 11, 2010 at 09:00
Yep finally …
adesnos
February 11, 2010 at 09:15
[...] Creating a toy virtual machine with PyPy. The PyPy project is pretty cool. [...]
Beej's Bit Bucket » Linkfest
February 11, 2010 at 19:15