Chapter 2, example 2

Here we continue with the previous example: we write some Python commands to list all files in the downloaded data. We also show how to execute external Python scripts in IPython using the magic command %run.

The following code should be stored in egos.py:

import sys
import os
# we retrieve the folder as the first positional argument
# to the command-line call
if len(sys.argv) > 1:
    folder = sys.argv[1]
# we list all files in the specified folder
files = os.listdir(folder)
# ids contains the sorted list of all unique identifiers
ids = sorted(set(map(lambda file: int(file.split('.')[0]), files)))

The egos.py script accepts the relative path of the facebook folder as a command-line argument and stores the unique file identifiers in the ids variable. Each file in the dataset is named ID.extension, so the identifier is the part of the file name before the first dot.
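The identifier extraction can be tried in isolation, without the dataset on disk. The following is a minimal sketch: the extract_ids helper and the sample file names are assumptions made for illustration, not part of egos.py, and a set comprehension stands in for the map/lambda call used in the script.

```python
def extract_ids(filenames):
    """Return the sorted unique integer IDs from names of the form ID.extension."""
    # The part before the first dot is the identifier; the set removes duplicates.
    return sorted({int(name.split('.')[0]) for name in filenames})

# Illustrative file names only (an assumption, not a listing of the real folder).
sample_files = ['107.edges', '0.circles', '0.edges', '107.feat', '348.edges']
sample_ids = extract_ids(sample_files)
print(sample_ids)
```

Several files can share one identifier, which is why the names are collected into a set before sorting.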


In [1]:
%run egos.py data/facebook

After using the %run command to execute this script, all variables defined in the script are available in the current user namespace. In particular, we can retrieve the ids variable here.


In [2]:
ids


Out[2]:
[0, 107, 348, 414, 686, 698, 1684, 1912, 3437, 3980]
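Outside IPython, a loosely comparable effect can be sketched with the standard library's runpy.run_path function, which executes a file and returns the names it defined as a dictionary. This is only an analogy, not a description of how %run is implemented. The stand-in script below is hypothetical and is written to a temporary file so that the example is self-contained.

```python
import os
import runpy
import tempfile

# A tiny stand-in script (hypothetical; it is not egos.py).
script_source = "values = [3, 1, 2]\nordered = sorted(values)\n"

# Write the stand-in script to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as handle:
    handle.write(script_source)
    script_path = handle.name

try:
    # run_path executes the file and returns the globals it defined as a dict.
    script_globals = runpy.run_path(script_path)
finally:
    os.remove(script_path)

print(script_globals['ordered'])
```

The difference is that %run copies the script's variables into the interactive namespace, whereas run_path hands them back in a separate dictionary.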

Although variables created by the script are available in the namespace after it has run, variables that already existed in IPython are not visible inside the script by default. For example, here we define the folder variable in the current namespace and then call %run egos.py without an argument.


In [3]:
folder = 'data/facebook'

A NameError exception is expected here: without a command-line argument, the script never assigns folder, and by default it cannot see the folder variable defined in the current namespace.


In [4]:
%run egos.py


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
IPython\utils\py3compat.pyc in execfile(fname, glob, loc)
    169             else:
    170                 filename = fname
--> 171             exec compile(scripttext, filename, 'exec') in glob, loc
    172     else:
    173         def execfile(fname, *where):

chapter2\egos.py in <module>()
      6     folder = sys.argv[1]
      7 # we list all files in the specified folder
----> 8 files = os.listdir(folder)
      9 # ids contains the sorted list of all unique identifiers
     10 ids = sorted(set(map(lambda file: int(file.split('.')[0]), files)))

NameError: name 'folder' is not defined

However, we can explicitly tell IPython to run the external script in the current user namespace by using the -i option.


In [5]:
%run -i egos.py

Now the script's execution works as expected, and the ids variable is available.


In [6]:
ids


Out[6]:
[0, 107, 348, 414, 686, 698, 1684, 1912, 3437, 3980]
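The difference between the two ways of calling %run comes down to which namespace the script's code is executed in. The following minimal sketch mimics that idea in plain Python with the built-in exec function. It is an illustration under stated assumptions, not IPython's actual implementation: the one-line stand-in script and the user_ns and fresh_ns dictionaries are hypothetical names introduced here.

```python
# Stand-in for a script that reads a `folder` variable it never defines itself.
script_source = "message = 'listing ' + folder\n"
code = compile(script_source, '<stand-in script>', 'exec')

# Stand-in for the interactive namespace, where `folder` already exists.
user_ns = {'folder': 'data/facebook'}

# Like a plain %run: a fresh namespace, so `folder` is not visible.
fresh_ns = {}
try:
    exec(code, fresh_ns)
    plain_run_error = None
except NameError as error:
    plain_run_error = error

# Like %run -i: execute directly in the interactive namespace.
exec(code, user_ns)

print(type(plain_run_error).__name__)
print(user_ns['message'])
```

In the shared-namespace case the script both reads existing variables and writes its own results into the same dictionary, which is why -i can also overwrite interactive variables of the same name.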