I was told to use ''.join([]) instead of the '+' operator in Python. However, a (bad) benchmark showed '+' to be a lot faster. I think it is reasonable to say that '+' is faster in some cases. Here is my test:
    def test0(b, c, d, e, f):
        for i in xrange(10**7):
            a = b + c + d + e + f
        print(a)

    def test1():
        l = ['hello ', 'world ', 'with ', '+ ', 'operator']
        for i in xrange(10**7):
            a = ''
            for j in l:
                a += j
        print(a)

    def test2():
        l = ['hello', 'world', 'with', 'join', 'function']
        for i in xrange(10**7):
            a = ' '.join(l)
        print(a)

    test0('hello ', 'world ', 'with ', '+ ', 'operator')
    test1()
    test2()
And the result of the test:
    $ python -m cProfile -s cumulative test.py
    hello world with + operator
    hello world with + operator
    hello world with join function
             10000007 function calls in 14.968 CPU seconds

       Ordered by: cumulative time

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       ...
            1    6.838    6.838    6.838    6.838 test.py:7(test1)
            1    2.683    2.683    5.113    5.113 test.py:15(test2)
            1    3.016    3.016    3.016    3.016 test.py:2(test0)
So in this test, the worst way of using '+' is to iterate over a list of strings and accumulate the concatenations in a variable (function test1). But there is nothing wrong with performing multiple '+' operations in a single expression and then storing the result in a variable (function test0).
A quick look at the bytecode of test0 confirms this intuition: we can see a bunch of LOADs and ADDs and only one STORE:
    >>> import dis
    >>> dis.dis(test0)
    ...
                 19 LOAD_FAST                0 (b)
                 22 LOAD_FAST                1 (c)
                 25 BINARY_ADD
                 26 LOAD_FAST                2 (d)
                 29 BINARY_ADD
                 30 LOAD_FAST                3 (e)
                 33 BINARY_ADD
                 34 LOAD_FAST                4 (f)
                 37 BINARY_ADD
                 38 STORE_FAST               6 (a)
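To repeat that check on a newer interpreter, where opcode names differ (Python 3.11 folded BINARY_ADD into a generic BINARY_OP), something like the sketch below could count the additions and stores instead of reading the listing by eye. The names chained_plus and count_ops are my own stand-ins, not part of the original test, and the exact opcodes still vary between versions.

```python
import dis


def chained_plus(b, c, d, e, f):
    # Hypothetical stand-in for test0: four '+' in a single expression.
    a = b + c + d + e + f
    return a


def count_ops(func):
    """Count addition and local-store instructions in func's bytecode."""
    adds = stores = 0
    for ins in dis.get_instructions(func):
        # BINARY_ADD on older interpreters, BINARY_OP on 3.11 and later.
        if ins.opname in ('BINARY_ADD', 'BINARY_OP'):
            adds += 1
        # Substring match so fused store opcodes on recent versions count too.
        elif 'STORE_FAST' in ins.opname:
            stores += 1
    return adds, stores


if __name__ == '__main__':
    adds, stores = count_ops(chained_plus)
    print('adds:', adds, 'stores:', stores)
```

This only counts instructions; it says nothing about how long each one takes, so it complements the timing rather than replacing it.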
The test was performed with Python 2.5.4 on Debian sid. It would be nice to see whether the results hold for newer versions of the Python interpreter.
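For anyone who wants to try this on Python 3, here is a rough sketch of the same three variants using timeit rather than cProfile. The function names, the shared WORDS list, and the default repeat count are my own assumptions; I use an empty separator for join so that all three variants build the same string, which differs slightly from test2 above. I have not included any timings, since they depend on the interpreter and the machine.

```python
import timeit

# Same pieces as test0/test1; shared so every variant builds the same string.
WORDS = ['hello ', 'world ', 'with ', '+ ', 'operator']


def chained_plus(b, c, d, e, f):
    # One expression, one result (mirrors test0).
    return b + c + d + e + f


def loop_plus(words):
    # Accumulate with += inside a loop (mirrors test1).
    a = ''
    for w in words:
        a += w
    return a


def joined(words):
    # A single join call (mirrors test2, but with an empty separator).
    return ''.join(words)


def run_benchmark(number=100000):
    """Return total seconds per variant for `number` calls each."""
    return {
        'chained +': timeit.timeit(lambda: chained_plus(*WORDS), number=number),
        'loop +=': timeit.timeit(lambda: loop_plus(WORDS), number=number),
        'join': timeit.timeit(lambda: joined(WORDS), number=number),
    }


if __name__ == '__main__':
    for name, seconds in run_benchmark().items():
        print('%-10s %.3f s' % (name, seconds))
```

With only five short strings, the call overhead of the lambdas and functions is a noticeable part of each measurement, so treat the numbers as a relative comparison at best.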