This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com) for UW's [Astro 599](http://www.astro.washington.edu/users/vanderplas/Astr599/) course. Source and license info is on [GitHub](https://github.com/jakevdp/2013_fall_ASTR599/).

Functions and Modules

An important part of coding (in Python and in other modern languages) is organizing code into easily reusable chunks.

Python can work within both a procedural and an object-oriented style.

  • Procedural programming is using functions

  • Object-oriented programming is using classes

We'll come back to classes later, and look at functions now.

Functions

Function definitions in Python look like this:

def function_name(arg1, arg2, ...,
                  kw1=val1, kw2=val2, ...):

(Note that line-breaks between the parentheses are ignored)

argX are arguments, and are required

kwX are keyword arguments, and are optional

The function name can be anything, as long as it:

  • contains only numbers, letters, and underscores
  • does not start with a number
  • is not the name of a built-in keyword (like print, or for)
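
The three rules above can be checked mechanically. The following is a minimal sketch, not part of the original notebook: the helper name is_valid_function_name is hypothetical, and the pattern assumes plain ASCII names. Note that the keyword list depends on the Python version: print is a keyword in Python 2 but not in Python 3.

```python
import keyword
import re

# Mirrors the rules above: letters, digits and underscores only,
# and no leading digit. ASCII-only is an assumption of this sketch.
_NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def is_valid_function_name(name):
    """Return True if `name` satisfies the naming rules listed above."""
    if not _NAME_PATTERN.match(name):
        return False                      # bad character or leading digit
    return not keyword.iskeyword(name)    # reserved words such as `for`

print(is_valid_function_name("addnums"))  # True
print(is_valid_function_name("2fast"))    # False
print(is_valid_function_name("for"))      # False
```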

Note for IDL users: there is no difference between functions and procedures. All Python functions return a value: if no return statement is executed, the function returns None.

Some Examples

A function with two arguments:


In [4]:
def addnums(x, y):
    return x + y

In [5]:
result = addnums(1, 2)
print result


3

In [6]:
print addnums(1, y=2)


3

In [7]:
print addnums("A", "B")


AB

Note that the variable types are not declared: as we've discussed, Python is a dynamic language, so the same function works for any types that support the + operator.
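
As a rough illustration of what dynamic typing means for addnums, the sketch below calls it with a few more type combinations. It is written with single-argument print(...) calls, which run under both Python 2 and Python 3. The exact wording of the error message varies between versions, so it is printed rather than assumed.

```python
def addnums(x, y):
    return x + y

print(addnums(1.5, 2))        # int and float mix freely: 3.5
print(addnums([1, 2], [3]))   # lists concatenate: [1, 2, 3]

# Types are only checked when the operation actually runs
try:
    addnums(1, "B")           # int + str is not defined
except TypeError as err:
    print("TypeError: " + str(err))
```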

Examples

A function with a keyword argument:


In [8]:
def scale(x, factor=2.0):
    return x * factor

In [9]:
scale(4)


Out[9]:
8.0

In [11]:
scale(4, 10)


Out[11]:
40

In [12]:
scale(4, factor=10)


Out[12]:
40

Arguments and keyword arguments can be specified either by order or by name, but an unnamed (positional) argument cannot come after a named argument:


In [13]:
scale(x=4, 10)


  File "<ipython-input-13-f0cbe9ee0750>", line 1
    scale(x=4, 10)
SyntaxError: non-keyword arg after keyword arg
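
For contrast, here is a small sketch of the call forms that are accepted, using the same scale function. The last call is an extra case not shown in the notebook: passing the same argument both by position and by name, which fails at call time rather than as a syntax error.

```python
def scale(x, factor=2.0):
    return x * factor

# Named arguments may appear in any order
print(scale(factor=10, x=4))   # 40

# Positional arguments first, named arguments after
print(scale(4, factor=10))     # 40

# Supplying x twice (once by position, once by name) is rejected
try:
    scale(4, x=10)
except TypeError as err:
    print("TypeError: " + str(err))
```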

Return Values

Returned values can be anything, which allows a lot of flexibility:


In [26]:
def build_dict(x, y):
    return {'x':x, 'y':y}

build_dict(4, 5)


Out[26]:
{'x': 4, 'y': 5}

In [27]:
def no_return_value():
    pass

x = no_return_value()
print x


None
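
One common use of this flexibility is returning several values at once as a tuple, which the caller can unpack. A minimal sketch follows; the function name min_max is made up for illustration.

```python
def min_max(values):
    """Return the smallest and largest items as a (min, max) tuple."""
    return min(values), max(values)

# The returned tuple can be unpacked into separate names
lowest, highest = min_max([3, 1, 4, 1, 5])
print(lowest)    # 1
print(highest)   # 5
```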

Keyword Arguments

Keyword arguments can be a very handy way to grow new functionality without breaking old code.

Imagine, for example, you had the build_dict function from above:


In [28]:
def build_dict(x, y):
    return {'x':x, 'y':y}

build_dict(1, 2)


Out[28]:
{'x': 1, 'y': 2}

Now what if you want to change the names of the variables in the dictionary? Adding a keyword argument can allow this flexibility without breaking old code:


In [29]:
def build_dict(x, y, xname='x', yname='y'):
    return {xname:x, yname:y}

build_dict(1, 2)  # old call still works


Out[29]:
{'y': 2, 'x': 1}

In [30]:
build_dict(1, 2, xname='spam', yname='eggs')


Out[30]:
{'eggs': 2, 'spam': 1}

This is admittedly a silly example, but it shows how keywords can be used to add flexibility without breaking old APIs.

Variable Scope

Python functions have their own local scope: names assigned inside a function are separate from names of the same spelling outside it.


In [14]:
def modify_x(x):
    x += 5
    return x

In [15]:
x = 10
y = modify_x(x)

print x
print y


10
15

Modifying a variable inside a function does not modify the global variable of the same name, unless you use the global declaration:


In [25]:
def add_a(x):
    global a
    a += 1
    return x + a

a = 10
print add_a(5)
print a


16
11
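
It may help to see what happens when the global declaration is left out. In the sketch below (the names counter, read_counter and broken_increment are hypothetical), reading a global works without any declaration, but assigning to it makes the name local to the function, so the increment fails.

```python
counter = 10

def read_counter():
    return counter + 1     # reading a global needs no declaration

def broken_increment():
    counter += 1           # assignment makes `counter` local to this function
    return counter

print(read_counter())      # 11

try:
    broken_increment()
except UnboundLocalError as err:
    print("UnboundLocalError: " + str(err))

print(counter)             # 10, the global was never modified
```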

Potential Gotcha: Simple vs Compound types

Warning: Simple and Compound types are treated differently!


In [18]:
def add_one(x):
    x += 1
    
x = 4
add_one(x)
print x


4

In [19]:
def add_element(L):
    L.append(4)
    
L = [1, 2]
add_element(L)
print L


[1, 2, 4]

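Simple types (int, long, float, complex, string) behave as if they were passed by value: they are immutable, so x += 1 creates a new object and rebinds only the function's local name.

Compound types (list, dict, set, user-defined objects) behave as if they were passed by reference: they are mutable, so L.append(4) changes the very object the caller holds. Strictly speaking, Python passes every argument the same way, as a reference to an object; what differs is whether that object can be modified in place. Tuples are compound but immutable, so they behave like the simple types here.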

Question to think about: why would this be?
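
A short experiment that may help with this question. It is a sketch with made-up helper names (rebind, mutate, is_same_object); it compares rebinding a local name with modifying an object in place, and uses id() to check which object the function actually receives.

```python
def rebind(items):
    items = items + [4]    # builds a new list; only the local name changes
    return items

def mutate(items):
    items.append(4)        # modifies the caller's list in place

def is_same_object(obj, caller_id):
    # True when the function received the very object the caller holds
    return id(obj) == caller_id

data = [1, 2]
rebind(data)
print(data)                # [1, 2]
mutate(data)
print(data)                # [1, 2, 4]

number = 4
print(is_same_object(number, id(number)))   # True
print(is_same_object(data, id(data)))       # True
```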

Catch-all: *args and **kwargs


In [1]:
def cheeseshop(kind, *args, **kwargs):
    print "Do you have any", kind, "?"
    print "I'm sorry, we're all out of", kind
    
    for arg in args:
        print arg
        
    print 40 * "="
    
    for kw in kwargs:
        print kw, ":", kwargs[kw]

In [2]:
cheeseshop("Limburger", "It's very runny, sir.",
           "It's really very, VERY runny, sir.",
           shopkeeper="Michael Palin",
           client="John Cleese",
           sketch="Cheese Shop Sketch")


Do you have any Limburger ?
I'm sorry, we're all out of Limburger
It's very runny, sir.
It's really very, VERY runny, sir.
========================================
shopkeeper : Michael Palin
sketch : Cheese Shop Sketch
client : John Cleese

(example from Python docs)
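
Inside the function, args is an ordinary tuple and kwargs an ordinary dict. The sketch below, with a hypothetical function named describe, makes that visible and also shows the same * and ** syntax used at the call site to unpack a sequence or dict into arguments.

```python
def describe(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dict
    return (type(args).__name__, type(kwargs).__name__,
            len(args), sorted(kwargs))

print(describe(1, 2, shopkeeper="Michael Palin"))
# ('tuple', 'dict', 2, ['shopkeeper'])

# The same syntax unpacks a sequence or dict when calling
lines = ("It's very runny, sir.",)
credits = {"client": "John Cleese"}
print(describe(*lines, **credits))
# ('tuple', 'dict', 1, ['client'])
```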

Documentation ("doc strings")

Documentation is not required, but your future self (and anybody else using your code) will thank you.


In [38]:
def power_of_difference(x, y, p=2.0):
    """Return the power of the difference of x and y
    
    Parameters
    ----------
    x, y : float
        the values to be differenced
    p : float (optional)
        the exponent (default = 2.0)
    
    Returns
    -------
    result: float
        (x - y) ** p
    """
    diff = x - y
    return diff ** p

power_of_difference(10.0, 5.0)


Out[38]:
25.0

(Note that this example follows the NumPy documentation standard)

With documentation specified this way, the IPython help command will be helpful!


In [39]:
power_of_difference?
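
Outside IPython the docstring is still available, because Python stores it on the function object. A minimal sketch, using a shortened version of the earlier scale example rather than the full power_of_difference docstring:

```python
import inspect

def scale(x, factor=2.0):
    """Return x multiplied by factor.

    factor is optional (default = 2.0)
    """
    return x * factor

print(scale.__doc__)           # the docstring as stored on the function
print(inspect.getdoc(scale))   # same text with indentation cleaned up
```

The built-in help(scale) prints a formatted version of the same text in any Python session.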

Automatically building HTML documentation


In [41]:
%%file myfile.py

def power_of_difference(x, y, p=2.0):
    """Return the power of the difference of x and y
    
    Parameters
    ----------
    x, y : float
        the values to be differenced
    p : float (optional)
        the exponent (default = 2.0)
    
    Returns
    -------
    result: float
        (x - y) ** p
    """
    diff = x - y
    return diff ** p


Writing myfile.py

In [42]:
# Pydoc is a command-line program bundled with Python
!pydoc -w myfile


wrote myfile.html

In [46]:
from IPython.display import HTML
HTML(open('myfile.html').read())


Out[46]:
Python: module myfile

myfile
index
/Users/jakevdp/Opensource/2013_fall_ASTR599/notebooks/myfile.py

Functions

power_of_difference(x, y, p=2.0)
Return the power of the difference of x and y

Parameters
----------
x, y : float
    the values to be differenced
p : float (optional)
    the exponent (default = 2.0)

Returns
-------
result: float
    (x - y) ** p

Any remaining questions about functions?

Modules

Modules are organized units of code which contain functions, classes, statements, and other definitions.

Any file ending in .py is treated as a module (e.g. our file myfile.py above).

Variables in modules have their own scope: using a name in one module will not affect variables of that name in another module.


In [49]:
%%file mymodule.py
# A simple demonstration module

def add_numbers(x, y):
    """add x and y"""
    return x + y

def subtract_numbers(x, y):
    """subtract y from x"""
    return x - y


Writing mymodule.py

Modules are accessed using import module_name (with no .py)


In [50]:
import mymodule

In [52]:
print '1 + 2 =', mymodule.add_numbers(1, 2)
print '5 - 3 =', mymodule.subtract_numbers(5, 3)


1 + 2 = 3
5 - 3 = 2

In [53]:
print add_numbers(1, 2)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-53-bd09daccae66> in <module>()
----> 1 print add_numbers(1, 2)

NameError: name 'add_numbers' is not defined

Several ways to import from modules

As a separate namespace:


In [54]:
import mymodule
mymodule.add_numbers(1, 2)


Out[54]:
3

Importing a single function or name:


In [55]:
from mymodule import add_numbers
add_numbers(1, 2)


Out[55]:
3

Renaming module contents:


In [57]:
from mymodule import add_numbers as silly_function
silly_function(1, 2)


Out[57]:
3

The Kitchen sink:


In [56]:
from mymodule import *
subtract_numbers(5, 3)


Out[56]:
2

This final method can be convenient, but should generally be avoided as it can cause name collisions and makes debugging difficult.
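
To make the collision risk concrete, the sketch below uses two standard-library modules that both define sqrt. Explicit imports are used so the effect is easy to follow; a star import does the same thing silently for every public name in the module.

```python
from math import sqrt
print(sqrt(4))            # math.sqrt gives 2.0

from cmath import sqrt    # silently replaces the name bound above
print(sqrt(4))            # cmath.sqrt gives (2+0j)
```

Nothing warns you that sqrt now refers to a different function, which is why keeping the module namespace (math.sqrt, cmath.sqrt) is usually the safer choice.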

Module level code and documentation

Your modules can have their own documentation, can define module-level variables, and can execute code when they load. For example:


In [58]:
%%file mymodule2.py
"""
Example module with some variables and startup code
"""
# this code runs when the module is loaded
print "mymodule2 in the house!"
pi = 3.1415926
favorite_food = "spam, of course"

def multiply(a, b):
    return a * b


Writing mymodule2.py

In [65]:
import mymodule2


mymodule2 in the house!

In [66]:
# import again and the initial code does not execute!
import mymodule2
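
The reason is that Python caches loaded modules in sys.modules, so a repeated import just reuses the cached module object. The sketch below assumes Python 3, where importlib.reload is available (Python 2 has a built-in reload instead). The module name load_counter_demo is made up, and the file is written to a temporary directory so nothing in the notebook folder is touched.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module (hypothetical name) into a temporary directory
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "load_counter_demo.py"), "w") as f:
    # Module-level code: counts how many times this file has been executed
    f.write("LOADS = globals().get('LOADS', 0) + 1\n")

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

import load_counter_demo
print(load_counter_demo.LOADS)        # 1

import load_counter_demo              # found in sys.modules, not re-executed
print(load_counter_demo.LOADS)        # 1

importlib.reload(load_counter_demo)   # explicitly re-runs the module code
print(load_counter_demo.LOADS)        # 2
```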

In [67]:
# access module-level documentation
mymodule2?

In [61]:
print mymodule2.multiply(2, 3)
print mymodule2.pi


6
3.1415926

In [68]:
# module variables can be modified
print mymodule2.favorite_food


spam, of course

In [69]:
mymodule2.favorite_food = "eggs.  No spam."
print mymodule2.favorite_food


eggs.  No spam.

A Few Built-in Modules

  • sys: exposes interactions with the Python interpreter (command-line arguments, module search path, standard I/O streams, etc.)
  • os: exposes operating-system operations (file statistics, directories, paths, environment variables, etc.)
  • math: exposes basic mathematical functions and constants
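
Of the three, math is the quickest to try out. A minimal sketch using a few of its functions and constants:

```python
import math

print(math.pi)            # the constant pi
print(math.sqrt(16))      # 4.0
print(math.hypot(3, 4))   # 5.0, length of the hypotenuse
```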

In [72]:
import sys
help(sys)


Help on built-in module sys:

NAME
    sys

FILE
    (built-in)

MODULE DOCS
    http://docs.python.org/library/sys

DESCRIPTION
    This module provides access to some objects used or maintained by the
    interpreter and to functions that interact strongly with the interpreter.
    
    Dynamic objects:
    
    argv -- command line arguments; argv[0] is the script pathname if known
    path -- module search path; path[0] is the script directory, else ''
    modules -- dictionary of loaded modules
    
    displayhook -- called to show results in an interactive session
    excepthook -- called to handle any uncaught exception other than SystemExit
      To customize printing in an interactive session or to install a custom
      top-level exception handler, assign other functions to replace these.
    
    exitfunc -- if sys.exitfunc exists, this routine is called when Python exits
      Assigning to sys.exitfunc is deprecated; use the atexit module instead.
    
    stdin -- standard input file object; used by raw_input() and input()
    stdout -- standard output file object; used by the print statement
    stderr -- standard error object; used for error messages
      By assigning other file objects (or objects that behave like files)
      to these, it is possible to redirect all of the interpreter's I/O.
    
    last_type -- type of last uncaught exception
    last_value -- value of last uncaught exception
    last_traceback -- traceback of last uncaught exception
      These three are only available in an interactive session after a
      traceback has been printed.
    
    exc_type -- type of exception currently being handled
    exc_value -- value of exception currently being handled
    exc_traceback -- traceback of exception currently being handled
      The function exc_info() should be used instead of these three,
      because it is thread-safe.
    
    Static objects:
    
    float_info -- a dict with information about the float inplementation.
    long_info -- a struct sequence with information about the long implementation.
    maxint -- the largest supported integer (the smallest is -maxint-1)
    maxsize -- the largest supported length of containers.
    maxunicode -- the largest supported character
    builtin_module_names -- tuple of module names built into this interpreter
    version -- the version of this interpreter as a string
    version_info -- version information as a named tuple
    hexversion -- version information encoded as a single integer
    copyright -- copyright notice pertaining to this interpreter
    platform -- platform identifier
    executable -- absolute path of the executable binary of the Python interpreter
    prefix -- prefix used to find the Python library
    exec_prefix -- prefix used to find the machine-specific Python library
    float_repr_style -- string indicating the style of repr() output for floats
    __stdin__ -- the original stdin; don't touch!
    __stdout__ -- the original stdout; don't touch!
    __stderr__ -- the original stderr; don't touch!
    __displayhook__ -- the original displayhook; don't touch!
    __excepthook__ -- the original excepthook; don't touch!
    
    Functions:
    
    displayhook() -- print an object to the screen, and save it in __builtin__._
    excepthook() -- print an exception and its traceback to sys.stderr
    exc_info() -- return thread-safe information about the current exception
    exc_clear() -- clear the exception state for the current thread
    exit() -- exit the interpreter by raising SystemExit
    getdlopenflags() -- returns flags to be used for dlopen() calls
    getprofile() -- get the global profiling function
    getrefcount() -- return the reference count for an object (plus one :-)
    getrecursionlimit() -- return the max recursion depth for the interpreter
    getsizeof() -- return the size of an object in bytes
    gettrace() -- get the global debug tracing function
    setcheckinterval() -- control how often the interpreter checks for events
    setdlopenflags() -- set the flags to be used for dlopen() calls
    setprofile() -- set the global profiling function
    setrecursionlimit() -- set the max recursion depth for the interpreter
    settrace() -- set the global debug tracing function

FUNCTIONS
    __displayhook__ = displayhook(...)
        displayhook(object) -> None
        
        Print an object to sys.stdout and also save it in __builtin__._
    
    __excepthook__ = excepthook(...)
        excepthook(exctype, value, traceback) -> None
        
        Handle an exception by displaying it with a traceback on sys.stderr.
    
    call_tracing(...)
        call_tracing(func, args) -> object
        
        Call func(*args), while tracing is enabled.  The tracing state is
        saved, and restored afterwards.  This is intended to be called from
        a debugger from a checkpoint, to recursively debug some other code.
    
    callstats(...)
        callstats() -> tuple of integers
        
        Return a tuple of function call statistics, if CALL_PROFILE was defined
        when Python was built.  Otherwise, return None.
        
        When enabled, this function returns detailed, implementation-specific
        details about the number of function calls executed. The return value is
        a 11-tuple where the entries in the tuple are counts of:
        0. all function calls
        1. calls to PyFunction_Type objects
        2. PyFunction calls that do not create an argument tuple
        3. PyFunction calls that do not create an argument tuple
           and bypass PyEval_EvalCodeEx()
        4. PyMethod calls
        5. PyMethod calls on bound methods
        6. PyType calls
        7. PyCFunction calls
        8. generator calls
        9. All other calls
        10. Number of stack pops performed by call_function()
    
    exc_clear(...)
        exc_clear() -> None
        
        Clear global information on the current exception.  Subsequent calls to
        exc_info() will return (None,None,None) until another exception is raised
        in the current thread or the execution stack returns to a frame where
        another exception is being handled.
    
    exc_info(...)
        exc_info() -> (type, value, traceback)
        
        Return information about the most recent exception caught by an except
        clause in the current stack frame or in an older stack frame.
    
    exit(...)
        exit([status])
        
        Exit the interpreter by raising SystemExit(status).
        If the status is omitted or None, it defaults to zero (i.e., success).
        If the status is numeric, it will be used as the system exit status.
        If it is another kind of object, it will be printed and the system
        exit status will be one (i.e., failure).
    
    getcheckinterval(...)
        getcheckinterval() -> current check interval; see setcheckinterval().
    
    getdefaultencoding(...)
        getdefaultencoding() -> string
        
        Return the current default string encoding used by the Unicode 
        implementation.
    
    getdlopenflags(...)
        getdlopenflags() -> int
        
        Return the current value of the flags that are used for dlopen calls.
        The flag constants are defined in the ctypes and DLFCN modules.
    
    getfilesystemencoding(...)
        getfilesystemencoding() -> string
        
        Return the encoding used to convert Unicode filenames in
        operating system filenames.
    
    getprofile(...)
        getprofile()
        
        Return the profiling function set with sys.setprofile.
        See the profiler chapter in the library manual.
    
    getrecursionlimit(...)
        getrecursionlimit()
        
        Return the current value of the recursion limit, the maximum depth
        of the Python interpreter stack.  This limit prevents infinite
        recursion from causing an overflow of the C stack and crashing Python.
    
    getrefcount(...)
        getrefcount(object) -> integer
        
        Return the reference count of object.  The count returned is generally
        one higher than you might expect, because it includes the (temporary)
        reference as an argument to getrefcount().
    
    getsizeof(...)
        getsizeof(object, default) -> int
        
        Return the size of object in bytes.
    
    gettrace(...)
        gettrace()
        
        Return the global debug tracing function set with sys.settrace.
        See the debugger chapter in the library manual.
    
    setcheckinterval(...)
        setcheckinterval(n)
        
        Tell the Python interpreter to check for asynchronous events every
        n instructions.  This also affects how often thread switches occur.
    
    setdlopenflags(...)
        setdlopenflags(n) -> None
        
        Set the flags used by the interpreter for dlopen calls, such as when the
        interpreter loads extension modules.  Among other things, this will enable
        a lazy resolving of symbols when importing a module, if called as
        sys.setdlopenflags(0).  To share symbols across extension modules, call as
        sys.setdlopenflags(ctypes.RTLD_GLOBAL).  Symbolic names for the flag modules
        can be either found in the ctypes module, or in the DLFCN module. If DLFCN
        is not available, it can be generated from /usr/include/dlfcn.h using the
        h2py script.
    
    setprofile(...)
        setprofile(function)
        
        Set the profiling function.  It will be called on each function call
        and return.  See the profiler chapter in the library manual.
    
    setrecursionlimit(...)
        setrecursionlimit(n)
        
        Set the maximum depth of the Python interpreter stack to n.  This
        limit prevents infinite recursion from causing an overflow of the C
        stack and crashing Python.  The highest possible limit is platform-
        dependent.
    
    settrace(...)
        settrace(function)
        
        Set the global debug tracing function.  It will be called on each
        function call.  See the debugger chapter in the library manual.

DATA
    __stderr__ = <open file '<stderr>', mode 'w'>
    __stdin__ = <open file '<stdin>', mode 'r'>
    __stdout__ = <open file '<stdout>', mode 'w'>
    api_version = 1013
    argv = ['-c', '-f', '/Users/jakevdp/.ipython/profile_default/security/...
    builtin_module_names = ('__builtin__', '__main__', '_ast', '_codecs', ...
    byteorder = 'little'
    copyright = 'Copyright (c) 2001-2013 Python Software Foundati...ematis...
    displayhook = <IPython.kernel.zmq.displayhook.ZMQShellDisplayHook obje...
    dont_write_bytecode = False
    exc_value = TypeError("<module 'sys' (built-in)> is a built-in module"...
    exec_prefix = '/Users/jakevdp/anaconda'
    executable = '/Users/jakevdp/anaconda/bin/python'
    flags = sys.flags(debug=0, py3k_warning=0, division_warn...unicode=0, ...
    float_info = sys.float_info(max=1.7976931348623157e+308, max_...epsilo...
    float_repr_style = 'short'
    hexversion = 34014704
    last_value = AttributeError("'module' object has no attribute 'mulitpl...
    long_info = sys.long_info(bits_per_digit=30, sizeof_digit=4)
    maxint = 9223372036854775807
    maxsize = 9223372036854775807
    maxunicode = 65535
    meta_path = []
    modules = {'ConfigParser': <module 'ConfigParser' from '/Users/jakevdp...
    path = ['', '/Users/jakevdp/Opensource/Bokeh', '/Users/jakevdp/anacond...
    path_hooks = [<type 'zipimport.zipimporter'>]
    path_importer_cache = {'': None, '/Users/jakevdp/.ipython/extensions':...
    platform = 'darwin'
    prefix = '/Users/jakevdp/anaconda'
    ps1 = 'In : '
    ps2 = '...: '
    ps3 = 'Out: '
    py3kwarning = False
    stderr = <IPython.kernel.zmq.iostream.OutStream object>
    stdin = <open file '<stdin>', mode 'r'>
    stdout = <IPython.kernel.zmq.iostream.OutStream object>
    subversion = ('CPython', '', '')
    version = '2.7.5 |Anaconda 1.4.0 (x86_64)| (default, Jun 28...3, 22:20...
    version_info = sys.version_info(major=2, minor=7, micro=5, releaseleve...
    warnoptions = []



In [74]:
import sys
import os

print "You are using Python version", sys.version
print 40 * '-'
print "Current working directory is:"
print os.getcwd()
print 40 * '-'
print "Files in the current directory:"
for f in os.listdir(os.getcwd()):
    print f


You are using Python version 2.7.5 |Anaconda 1.4.0 (x86_64)| (default, Jun 28 2013, 22:20:13) 
[GCC 4.0.1 (Apple Inc. build 5493)]
----------------------------------------
Current working directory is:
/Users/jakevdp/Opensource/2013_fall_ASTR599/notebooks
----------------------------------------
Files in the current directory:
.ipynb_checkpoints
00_intro.ipynb
01_basic_training.ipynb
02_advanced_data_structures.ipynb
03_IPython_intro.ipynb
04_Functions_and_modules.ipynb
data
Healpy.ipynb
images
Markdown Cells.ipynb
myfile.html
myfile.py
myfile.pyc
mymodule.py
mymodule.pyc
mymodule2.py
mymodule2.pyc
number_game.py
README.txt
style.css

Built-in modules are listed at http://docs.python.org/2/py-modindex.html

Python builtin modules are awesome...

(source: http://xkcd.com/353/)


In [ ]:
# try importing antigravity...

Making a script executable

When a script or module is run directly from the command-line (i.e. not imported) a special variable called __name__ is set to "__main__".

So, in your module, if you want some part of the code to only run when the script is executed directly, then you can make it look like this:

# all module stuff

# at the bottom, put this:
if __name__ == '__main__':
    # do some things
    print "I was called from the command-line!"
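
A small sketch of how __name__ behaves may make this clearer. The main function is a hypothetical stand-in for "do some things"; wrapping that code in a function is a common convention, not something the notebook requires.

```python
import os

def main():
    # Hypothetical entry point: the "do some things" part goes here
    return "I was called from the command-line!"

print(os.__name__)   # an imported module's __name__ is its own name: os
print(__name__)      # "__main__" when this file is run directly

if __name__ == "__main__":
    print(main())
```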

Here's a longer example of this in action:


In [4]:
%%file modfun.py

"""
Some functions written to demonstrate a bunch of concepts
like modules, import and command-line programming
"""
import os
import sys


def getinfo(path=".", show_version=True):
    """
    Purpose: make simple use of os and sys modules
    Input: path (default = "."), the directory you want to list
    """
    if show_version:
        print "-" * 40
        print "You are using Python version ",
        print sys.version
        print "-" * 40
    
    print "Files in the directory " + str(os.path.abspath(path)) + ":"
    for f in os.listdir(path):
        print "  " + f
    print "*" * 40
    
    
if __name__ == "__main__":
    # Executed only if run from the command line.
    # Call with
    #   modfun.py <dirname> <dirname> ...
    # If no dirname is given then list the files in the current path
    if len(sys.argv) == 1:
        getinfo(".", show_version=True)
    else:
        for i, dirname in enumerate(sys.argv[1:]):
            if os.path.isdir(dirname):
                # if we have a directory then operate on it
                # only show the version info
                # if it's the first directory
                getinfo(dirname, show_version=(i == 0))
            else:
                print "Directory: " + str(dirname) + " does not exist."


Writing modfun.py

In [6]:
# now execute from the command-line
%run modfun.py


----------------------------------------
You are using Python version  2.7.5 |Anaconda 1.4.0 (x86_64)| (default, Jun 28 2013, 22:20:13) 
[GCC 4.0.1 (Apple Inc. build 5493)]
----------------------------------------
Files in the directory /Users/jakevdp/Opensource/2013_fall_ASTR599/notebooks:
  .ipynb_checkpoints
  00_intro.ipynb
  01_basic_training.ipynb
  02_advanced_data_structures.ipynb
  03_IPython_intro.ipynb
  04_Functions_and_modules.ipynb
  data
  Healpy.ipynb
  images
  Markdown Cells.ipynb
  modfun.py
  myfile.html
  myfile.py
  myfile.pyc
  mymodule.py
  mymodule.pyc
  mymodule2.py
  mymodule2.pyc
  number_game.py
  README.txt
  style.css
****************************************

Note some of the sys and os commands used in this script!

Breakout Session:

Exploring Built-in Modules

This breakout will give you a chance to explore some of the built-in modules offered by Python. For this session, please use your text editor to create the files. You'll need to:

  1. Create and edit a new file called age.py. Though you can do this via the %%file magic used above, here you should use your text editor.

    • within age.py, import the datetime module
    • use datetime.datetime() to create a variable representing your birthday
    • use datetime.datetime.now() to create a variable representing the present date
    • subtract the two (this forms a datetime.timedelta() object) and print that variable.

    • Use this object to answer these questions:

      1. How many days have you been alive?

      2. How many hours have you been alive?

      3. What will be the date 1000 days from now?

  2. Create and edit a new file called age1.py. When run from the command-line with one argument, age1.py should print out the date in that many days from now. If run with three arguments, print the time in days since that date.

[~]$ python age1.py 1000
date in 1000 days 2016-06-06 14:46:09.548831

[~]$ python age1.py 1981 6 12
days since then: 11779
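
As a starting point, here is a sketch of the datetime arithmetic involved. It deliberately uses fixed, made-up dates instead of a birthday or the current time, so the results are reproducible and the exercise itself is left to you.

```python
import datetime

# Fixed, made-up dates so the result is reproducible
start = datetime.datetime(2013, 1, 1)
end = datetime.datetime(2013, 1, 11, 12, 0)

elapsed = end - start                     # a datetime.timedelta
print(elapsed.days)                       # 10
print(elapsed.total_seconds() / 3600.0)   # 252.0 hours

# Adding a timedelta to a datetime gives a new datetime
later = end + datetime.timedelta(days=1000)
print(later)
```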