Python Idioms: + versus join

I was told to use ”.join([]) instead of the ‘+’ operator in Python. However a (bad) benchmark showed ‘+’ to be a lot faster. I think it is reasonable to say that in some cases ‘+’ is faster, here is my test:

def test0(b, c, d, e, f):
    for i in xrange(10**7):
        a = b + c + d + e + f
    print(a)

def test1():
    l = ['hello ', 'world ', 'with ', '+ ', 'operator']
    for i in xrange(10**7):
        a = ''
	for j in l:
            a += j
    print(a)

def test2():
    l = ['hello', 'world', 'with', 'join', 'function']
    for i in xrange(10**7):
	a = ' '.join(l)
    print(a)

test0('hello ', 'world ', 'with ', '+ ', 'operator')
test1()
test2()

And the result of the test:

$ python -m cProfile -s cumulative test.py
hello world with + operator
hello world with + operator
hello world with join function

   10000007 function calls in 14.968 CPU seconds
   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        ...
        1    6.838    6.838    6.838    6.838 test.py:7(test1)
        1    2.683    2.683    5.113    5.113 test.py:15(test2)
        1    3.016    3.016    3.016    3.016 test.py:2(test0)

So clearly the worst way of using ‘+’ is when iterating over a list of strings and accumulating the concatenations in a variable (function test1). But there is nothing wrong with performing multiple ‘+’ operations in a single line and then storing the result in a variable (function test0).

A quick look at the bytecode of the function confirms this intuition, we can see a bunch of LOADs and ADDs and only one STORE:

>>> import dis
>>> dis.dis(test0)
...
             19 LOAD_FAST                0 (b)
             22 LOAD_FAST                1 (c)
             25 BINARY_ADD          
             26 LOAD_FAST                2 (d)
             29 BINARY_ADD          
             30 LOAD_FAST                3 (e)
             33 BINARY_ADD          
             34 LOAD_FAST                4 (f)
             37 BINARY_ADD          
             38 STORE_FAST               6 (a)

The test was performed with Python 2.5.4 on a Debian sid. Would be nice to see if the results hold for new versions of the Python interpreter.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s