April 21st, 2016, 22:18 UTC

Python

sets versus lists: In Python, sets are faster than lists

Counter-intuitively, building a set and testing an item for membership in that set is faster doing the same with a list.

$ python3 -m timeit -c '2 in {1,2}'
10000000 loops, best of 3: 0.0259 usec per loop
$ python3 -m timeit -c '2 in [1,2]'
10000000 loops, best of 3: 0.0335 usec per loop

With the overhead of hashing and the more complex data structure I expected a set to be slower than a linear search of a short list, in this example at least, but it turns out that sets are faster for even just a couple of elements, and it only gets worse as the element count goes up.

The one time a list is faster is when the element being looked for is the very first:

$ python3 -m timeit -c '1 in [1,2]'
10000000 loops, best of 3: 0.0213 usec per loop

YMMV. For example, objects with custom __hash__ implementations may skew the results back in favour of using a list or tuple. Running within pypy may even eliminate the difference altogether.