In [1]:
%load_ext watermark
In [2]:
%watermark -d -v -u
[More information](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/ipython_magic/watermark.ipynb) about the `watermark` magic command extension.
A common use-case for the range
function is the classic for-loop. In Python 2.x, range
will create a list in memory and iterate over it. As you can imagine, its is not always very efficient to create a list (especially for a large number of iterations) if we just want to use it once.
Thus, the xrange
function was added later-on, which evaluates "lazily" similar to generators.
(In Python 3.x, the range
function was implemented similar to Python's xrange
).
But note that it using xrange
instead of range
does not always make sense. E.g., if we have a use-case where we want to iterate over the sequence (here: integers) multiple times, it would be more efficient to use range
, since xrange
is always constructing the values from scratch.
And also slicing is not supported by xrange
.
In [27]:
print 'range', type(range(3))
print 'xrange', type(xrange(3))
In the following benchmark, we will take a look at the speed advantages of range
over xrange
using a simple for-loop construct:
for i in range(some_int):
pass
Similar to the xrange
function, the izip
function (in the itertools
built-in module) evaluates the values in a list "lazily" without constructing a list of tuples like it's zip
counterpart.
In [4]:
from itertools import izip
print 'zip', type(zip([1,2], [3,4]))
print 'izip', type(izip([1,2], [3,4]))
Also here, we take a look at the speed benefits of using izip
over zip
for a simple for-loop construct:
for i in range(n, n): # where i is a tuple of 2 values
pass
In [16]:
import timeit
xrange_list, range_list, zip_list, izip_list = [], [], [], []
orders = [10**i for i in xrange(1, 9)]
for n in orders:
print('n=%s (%s of %s)' %(n, orders.index(n)+1, len(orders)))
range_list.append(min(timeit.Timer('for i in range(n): pass' ,
'from __main__ import n').repeat(repeat=5, number=1)))
xrange_list.append(min(timeit.Timer('for i in xrange(n): pass' ,
'from __main__ import n').repeat(repeat=5, number=1)))
zip_list.append(min(timeit.Timer('for i in zip(range(n), range(n)): pass' ,
'from __main__ import n').repeat(repeat=5, number=1)))
izip_list.append(min(timeit.Timer('for i in izip(range(n), range(n)): pass' ,
'from __main__ import n, izip').repeat(repeat=5, number=1)))
print('finished')
In [17]:
%matplotlib inline
In [24]:
from matplotlib import pyplot as plt
def plot():
def settings():
plt.xlim([min(orders) / 10, max(orders)* 10])
plt.grid()
plt.xticks(fontsize=16)
plt.yticks(fontsize=16)
plt.xscale('log')
plt.yscale('log')
plt.legend(loc='upper left', fontsize=14)
plt.xlabel('size of n', fontsize=16)
plt.ylabel('time in seconds (log-scale)', fontsize=16)
fig = plt.figure(figsize=(14,6))
plt.subplot(1,2,1)
plt.plot(orders, range_list, label='range(n)')
plt.plot(orders, xrange_list, label='xrange(n)')
plt.title('range and xrange in a for-loop', fontsize=22)
settings()
plt.subplot(1,2,2)
plt.plot(orders, zip_list, label='zip(n, n)')
plt.plot(orders, izip_list, label='izip(n, n)')
plt.title('zip and izip in a for-loop', fontsize=22)
settings()
plt.tight_layout()
plt.show()
In [25]:
plot()
In [28]:
%watermark -d -v -m
If we take the logarithmic scale, the "lazy evaluation" really pays off in simple for-loop constructs where we are only iterating over a given sequence once. Especially with increasing sizes of n
, we can obeserve that the construction of the list objects via range
and zip
becomes very inefficient for a one-time iteration use-case.
In [ ]: