UPDATED: Don’t use set.add() in python when running cProfile

UPDATE: As pointed out in the comments, my original conclusion was wrong and is due to the effect of the profiler on the performance. When using the time command instead with Python 3.2 on OS X Lion (averaged over 3 runs), the version with set.add takes 1.37s versus 4.85s for the version with set union (results are 2.17s and 7.12s respectively with Python 2.7.1). Sorry for the “d’oh” moment. Take-home lesson: use a profiler to count events, not to time them.

Use the union operator instead, it’s 2 to 3x faster on my machine.

$ cat sets.py
def mister():
    s = set([1,2,3])
    for i in range(10**7):
        s.add(1)

def hankey():
    s = set([1,2,3])
    for i in range(10**7):
        s |= set([1])
mister()
hankey()

$ python -m cProfile sets.py
10000008 function calls in 28.257 CPU seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
 1 11.507 11.507 22.978 22.978 sets.py:1(mister) 1 5.138 5.138 5.278 5.278 sets.py:6(hankey)

(tested on Python 2.5, 2.6 and 3.1 on Windows and Linux)

About these ads

9 thoughts on “UPDATED: Don’t use set.add() in python when running cProfile

  1. On a Celeron M, 1.5 GHz, Python 2.6:
    ncalls tottime percall cumtime percall filename:lineno(function)
    1 15.340 15.340 27.984 27.984 z.py:1(mister)
    1 12.124 12.124 12.574 12.574 z.py:6(hankey)

    This is very odd, because for the union it also has to create a set() object, which should take more time by itself. Any idea why this happens?

    1. Hi vlad,

      Sure, I found this odd enough to post it ^^ The object oriented programmer in me told me “don’t create objects, rather use methods in existing objects”. I guess this is happening because sets are native objects for the interpreter and calling a function has a more important overhead.

      This is a not particularly educated guess though, I might be plain wrong.

  2. 9 40 LOAD_FAST 0 (s)
    43 LOAD_GLOBAL 0 (set)
    46 LOAD_CONST 1 (1)
    49 BUILD_LIST 1
    52 CALL_FUNCTION 1
    55 INPLACE_OR
    56 STORE_FAST 0 (s)
    59 JUMP_ABSOLUTE 34

    4 40 LOAD_FAST 0 (s)
    43 LOAD_ATTR 2 (add)
    46 LOAD_CONST 1 (1)
    49 CALL_FUNCTION 1
    52 POP_TOP
    53 JUMP_ABSOLUTE 34

    I guess __dict__ lookup(LOAD_ATTR) is VERY slow

  3. Really old post I know, but in case somebody else runs into this site.

    If you aren’t running the profiler on this code, the opposite is true. I timed the two versions using time python … and found that. The .add() is registered by the profiler and is recorded where as the |= is not. The difference you are measuring is the difference in overhead of running the profiler. The actual cost of adding an element to a set is so small it doesn’t really register.

    1. Hi, thanks for the update. As far as I remember, I actually noticed a speed-up in the overall application when using |=, but it is quite possible that it depends on a specific version of the interpreter.

      1. Ok. I have a very definite speed advantage in favor of add when not run under profiling. It is certainly quite possible the interpreter has changed since you wrote this.

  4. Winston Ewert is right on, the title on this article is completely deceiving. It should be “Don’t use set.add() in python when running profiler”

    1. Okay, I finally re-ran the tests without the profiler and it turns out you’re right. I have updated the post to reflect this. Thanks to you and Winston for pointing this out.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s