In [1]:
%autosave 10
Look inside the optimizing subfolder of this folder.
Also there's a massive 80-page handout. The handout contains the bulk of the material for this tutorial, so run through it.
He didn't get past profiling, so look at the handout and type it all up into this IPython Notebook. Make it comprehensive.
Usually improving big-O of algorithms and data structures is the best first step.
Hardware is cheaper than programmer time.
In [4]:
# Run the Pystone benchmark shipped with the (Python 2) standard library
# to get a rough machine-speed baseline before profiling anything.
# NOTE(review): test.pystone was removed from later Python 3 releases --
# confirm the interpreter version before re-running this cell.
import test.pystone
test.pystone.main()
In [12]:
%load optimizing/measuring/profile_me.py
The `%%prun` magic shows its output in the bottom pane of your IPython Notebook.
In [13]:
%%prun
# file profile_me.py
"""Example to be profiled.
"""
import time
def fast():
    """Block for one millisecond (0.001 s)."""
    time.sleep(0.001)
def slow():
    """Block for a tenth of a second."""
    delay = 0.1
    time.sleep(delay)
def use_fast():
    """Invoke `fast` one hundred times."""
    calls = 0
    while calls < 100:
        fast()
        calls += 1
def use_slow():
    """Invoke `slow` one hundred times."""
    calls = 0
    while calls < 100:
        slow()
        calls += 1
if __name__ == '__main__':
    # Exercise both workloads so the profiler output contrasts
    # 100 x 1 ms calls against 100 x 100 ms calls.
    use_fast()
    use_slow()
In [23]:
%timeit slow()
In [24]:
%timeit fast()
In [20]:
import cProfile
In [21]:
cProfile.runctx("slow()", globals(), locals())
In [36]:
cProfile.run("slow()")
In [22]:
cProfile.runctx("fast()", globals(), locals())
In [37]:
# Profile use_fast and dump the raw statistics to a file for later analysis.
cProfile.run("use_fast()", "optimizing/fast.stats")
import pstats
# Load the dumped statistics and print the default (unsorted) report.
stats = pstats.Stats("optimizing/fast.stats")
stats.print_stats()
Out[37]:
Or sort by time
In [38]:
stats.sort_stats("time").print_stats()
Out[38]:
Or print out who is calling a certain function:
In [39]:
stats.print_callers("fast")
Out[39]:
Or which functions a given function calls (its callees):
In [40]:
stats.print_callees("use_fast")
Out[40]:
In [33]:
profiler = cProfile.Profile()
In [34]:
profiler.runcall(slow)
In [35]:
profiler.print_stats()
The timeit module uses timeit.default_timer(), which measures wall-clock time on most platforms. cProfile can be given either kind of timer, and by default measures wall-clock time.
In [41]:
%load optimizing/measuring/cpu_time.py
In [44]:
# file: cpu_time.py
"""Measuring CPU time instead of wall clock time.
"""
import cProfile
import os
import sys
import time
# Make it work with Python 2 and Python 3: on Python 2, rebind `range`
# to the lazy `xrange` so the 100-million-iteration loop below does not
# materialize a huge list. This is important for Python 2 only.
if sys.version_info[0] < 3:
    range = xrange
def cpu_time():
    """Return the process CPU time in seconds. OS dependent.

    On Python 3.3+ `time.process_time()` is the portable way to get CPU
    time (`time.clock()` was removed in Python 3.8).  On older Pythons:
    Windows' `time.clock()` measures wall-clock time, so user CPU time
    from `os.times()` is used there instead; on other platforms
    `time.clock()` reports CPU time directly.
    """
    if hasattr(time, 'process_time'):
        # Python 3.3+: portable, monotonic CPU time.
        return time.process_time()
    if sys.platform == 'win32':
        # Python 2 on Windows: time.clock() is wall clock; use user CPU time.
        return os.times()[0]
    return time.clock()
def sleep():
    """Block for two seconds of wall-clock time (burns no CPU)."""
    time.sleep(2.0)
def count(n=int(1e8)):
    """Burn CPU time with `n` trivial additions.

    :param n: number of loop iterations; the default of 100 million
        preserves the original behavior while letting callers (and
        tests) use a smaller workload.
    """
    for _ in range(int(n)):
        1 + 1
def test():
    """Run both workloads: 2 s of sleeping, then ~100 million loop iterations.
    """
    sleep()
    count()
def clock_check():
    """Profile `test` twice: first with wall-clock time, then with CPU time.

    The sleeping function dominates the first report but nearly
    vanishes from the second, since sleeping consumes no CPU.
    """
    wall_profiler = cProfile.Profile()       # default timer: wall clock
    wall_profiler.run('test()')
    wall_profiler.print_stats()
    cpu_profiler = cProfile.Profile(cpu_time)  # custom timer: CPU time
    cpu_profiler.run('test()')
    cpu_profiler.print_stats()
if __name__ == '__main__':
    # Compare the wall-clock and CPU-time profiles when run as a script.
    clock_check()
In [52]:
!cat optimizing/measuring/profile_me_use_line_profiler.py
!kernprof.py -l -v optimizing/measuring/profile_me_use_line_profiler.py
In [51]:
!cat optimizing/measuring/accumulate.py
!kernprof.py -l -v optimizing/measuring/accumulate.py
In [50]:
!cat optimizing/measuring/calc.py
!kernprof.py -l -v optimizing/measuring/calc.py
If you're willing to accept a performance hit you can catch the NameError exception and define an empty profile decorator, and let your program be executable without kernprof.py.
In [53]:
try:
    @profile  # succeeds only when kernprof.py has injected `profile`
    def dummy():
        """Throwaway function; decorating it merely probes for `profile`."""
        pass
except NameError:
    # Not running under kernprof.py: install a pass-through stand-in so
    # @profile-decorated code still runs unmodified.
    def profile(func):
        """Identity decorator standing in for kernprof's @profile."""
        return func
In [56]:
!cat optimizing/measuring/local_ref.py
!kernprof.py -l -v optimizing/measuring/local_ref.py
!kernprof.py -v optimizing/measuring/local_ref.py
In [62]:
from guppy import hpy
h = hpy()
In [68]:
h.heap()
Out[68]:
In [69]:
biglist = range(1000000)
In [70]:
h.heap()
Out[70]:
In [72]:
h.heap()[0]
Out[72]:
Here are some command-line examples of hpy. We've written our own decorator to track the memory change resulting from a function call.
In [74]:
!cat optimizing/measuring/memory_size_hpy.py
!python optimizing/measuring/memory_size_hpy.py
And again, if we wanted to no-op this decorator in code, we could reuse the same empty `profile` decorator trick we used for kernprof.py above.
In [76]:
!cat optimizing/measuring/memory_growth_hpy.py
!python optimizing/measuring/memory_growth_hpy.py
The memory growth reported for make_big is very small because CPython uses a reference-counted, synchronous garbage collector: the unused return value is reclaimed immediately.
In [81]:
!cat optimizing/measuring/memory_growth_pympler.py
!python optimizing/measuring/memory_growth_pympler.py
**TODO:** why is the reported memory growth for make_big empty?
In [94]:
import sys
from pympler.asizeof import asizeof, flatsize
def list_mem(length, size_func=flatsize):
    """Measure the incremental memory footprint of a growing list.

    Appends `length` integers one at a time, recording the list's size
    before the first append and after each one.

    :param length: number of elements to append.
    :param size_func: callable returning an object's size in bytes,
        e.g. pympler's `flatsize`/`asizeof` or `sys.getsizeof`.
    :return: list of `length + 1` size measurements.
    """
    my_list = []
    mem = [size_func(my_list)]
    # `range` (not Python-2-only `xrange`) keeps this cell 2/3
    # compatible, matching the shim used in cpu_time.py above.
    for elem in range(length):
        my_list.append(elem)
        mem.append(size_func(my_list))
    return mem
SIZE = 1000  # number of appends measured in the plots below
SHOW = 20  # NOTE(review): unused in this cell -- presumably limits how many entries to display; confirm
In [95]:
plot(list_mem(SIZE, size_func=flatsize))
Out[95]:
In [96]:
plot(list_mem(SIZE, size_func=asizeof))
Out[96]:
In [97]:
plot(list_mem(SIZE, size_func=sys.getsizeof))
Out[97]:
sys.getsizeof reports essentially the same flat (non-recursive) size as pympler's flatsize, which is why the two plots match.
In [102]:
%load optimizing/measuring/list_alloc_steps.py
In [ ]:
# file: list_alloc_steps.py
"""Measure the number of memory allocation steps for a list.
"""
import sys
from pympler.asizeof import flatsize
def list_steps(length, size_func=sys.getsizeof):
    """Count the memory (re)allocation steps of a growing list.

    Appends `length` integers one at a time and counts how often the
    list's reported size jumps by more than one integer's worth, i.e.
    how often CPython over-allocates the underlying array.

    Fixes two defects in the original: the loop always measured with
    `sys.getsizeof` even when a different `size_func` was supplied
    (so the flatsize run below silently re-used getsizeof), and the
    parameter name typo `lenght`.

    :param length: number of elements to append.
    :param size_func: callable returning an object's size in bytes.
    :return: number of allocation steps observed.
    """
    my_list = []
    steps = 0
    int_size = size_func(int())
    old_size = size_func(my_list)
    for elem in range(length):
        my_list.append(elem)
        new_size = size_func(my_list)  # was hard-coded sys.getsizeof
        if new_size - old_size > int_size:
            steps += 1
            old_size = new_size
    return steps
if __name__ == '__main__':
    # Single-argument parenthesized print works under both Python 2
    # and Python 3, matching the compatibility intent of this tutorial.
    sizes = [10, 100, 1000, 10000, int(1e5), int(1e6), int(1e7)]
    print('Using sys.getsizeof:')
    for size in sizes:
        print('%10d: %3d' % (size, list_steps(size)))
    print('Using pympler.asizeof.flatsize:')
    for size in sizes:
        print('%10d: %3d' % (size, list_steps(size, flatsize)))
In [103]:
!python optimizing/measuring/list_alloc_steps.py
There's a line memory profiler, look in the handout.
In [ ]: