We've spoken a lot about data structures and orders of execution (loops, functions, and so on). But although we're now intimately familiar with different ways of blocking our code, we haven't yet touched on how this affects the variables we define, and where it's legal to use them. By the end of this lecture, you should be able to:
Scope refers to where in a program a variable is defined and can be used. Another way to look at scope is to ask about the lifetime of a variable.
Hopefully, it doesn't come as a surprise that some variables aren't always accessible everywhere in your program.
In [1]:
def func(x):
    print(x)

x = 10
func(20)
print(x)
An example we've already encountered is when we're trying to handle an exception.
In [ ]:
import numpy as np

try:
    i = np.random.randint(100)
    if i % 2 == 0:
        raise ValueError("i is even")
except ValueError:
    copy = i

print(i)     # Does this work?
print(copy)  # What about this?
There are different categories of scope. It's always helpful to know which of these categories a variable falls into.
A variable in global scope can be "seen" and accessed from pretty much anywhere. Its defining characteristic is that it's not created in any particular function or block of any kind. This lack of context makes it global.
In [ ]:
# This is a global variable. It can be accessed anywhere in this notebook.
a = 0
(Small caveat: there is the concept of "built-in" scope, such as range or len or SyntaxError, which are technically even more "global" than global variables, since they're seen anywhere in Python writ large. "global" in this context means "seen anywhere in your program".)
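If you'd like to see the difference between built-in and global names for yourself, here is a minimal sketch using the standard builtins module. The helper where_is is made up purely for this illustration; it is not part of Python.

```python
import builtins

def where_is(name, global_names):
    """Report which namespace a name would be found in (hypothetical helper)."""
    if name in global_names:
        return "global"
    if hasattr(builtins, name):
        return "built-in"
    return "undefined"

a = 0  # an ordinary global variable

print(where_is("a", globals()))            # a lives in the global namespace
print(where_is("len", globals()))          # len is found in the built-in namespace
print(where_is("SyntaxError", globals()))  # so is SyntaxError
```

The lookup order in the helper mirrors how Python resolves a name at the top level of a program: the global namespace first, then the built-ins.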
The next step down is local scope: these are variables defined within a specific context, such as inside a function, and they no longer exist once the function or context ends.
In [ ]:
# This is our global variable, redefined.
a = 0

def f():
    # This is a local variable. It disappears when the function ends.
    b = 0

print(a)  # a still exists here; b does not.
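To check the claim that a local variable disappears, here is a small sketch that calls the function and then tries to use its local variable from outside. The names result and b_visible are only there for illustration.

```python
def f():
    b = 0          # local to f
    return b + 1   # b is usable here, inside the function

result = f()       # f's local namespace is discarded once it returns

try:
    print(b)       # b was never defined in the global namespace
    b_visible = True
except NameError as err:
    b_visible = False
    print("NameError:", err)
```

If you need a value after the function ends, return it (as with result here) rather than relying on the local name.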
(Small caveat: there is the concept of "nonlocal" scope, where you have variables defined inside functions, when those functions are themselves defined inside functions. This gets into functional programming, which Python does support and which is gaining momentum in data science, but which is beyond the scope (ha!) of this course.)
This brings us to the overarching concept of a namespace.
A namespace is a collection, or pool, of variables in Python. The global namespace is the pool of global variables that exist in a program.
In [ ]:
a = 0
b = 0

def func():
    c = 0
    d = 0
a and b exist in the global namespace. c and d exist in the function namespace of the function func.
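Python lets you peek at these pools directly with the built-in functions globals() and locals(), which return the namespaces as dictionaries. The following is just a sketch; the names local_names, a_is_global, and c_is_global are introduced here for illustration.

```python
a = 0
b = 0

def func():
    c = 0
    d = 0
    # locals() returns a dictionary of the function's own namespace;
    # sorting it gives the variable names in alphabetical order
    return sorted(locals())

local_names = func()
a_is_global = "a" in globals()
c_is_global = "c" in globals()

print(local_names)   # the names defined inside func
print(a_is_global)   # whether a is in the global namespace
print(c_is_global)   # whether c is in the global namespace
```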
The whole point of namespaces is, essentially, to keep a conceptual grip on the program you're writing.
Anyone using the Rodeo IDE?
Likewise, every function will also have its own namespace of variables. As will every class (which we'll get to next week!).
What happens when namespaces collide?
In [ ]:
a = 0

def func():
    a = 1
    print(a)  # What gets printed?

func()
This effect is referred to as variable shadowing: the locally-scoped variable takes precedence over the globally-scoped variable. It shadows the global variable.
This is not a bug--in the name of program simplicity, this limits the scope of the effects of changing a variable's value to a single function, rather than your entire program!
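A quick way to convince yourself that shadowing leaves the global variable alone is to look at both values side by side. In this sketch the name inside is only an illustration.

```python
a = 0

def func():
    a = 1        # creates a new local a; the global a is untouched
    return a

inside = func()
print(inside)    # the local value seen inside func
print(a)         # the global value, unchanged by the call
```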
If you have multiple functions that all use similar variable-naming conventions--or, even more likely, you have a program that's written by lots of different people who like to use the variable i in everything--it'd be catastrophic if one change to a variable i resulted in a change to every variable i.
In [ ]:
i = 0
In [ ]:
def func1():
    i = 10
In [ ]:
def func2():
    i = 20
In [ ]:
def func3(i):
    i = 40
In [ ]:
# ...
In [ ]:
def funcOneHundredBillion():
    i = 938948292

print(i)  # Wait, what is i?
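If, however, you really want to modify a global variable from inside a function--to disable the shadowing that is inherent in Python--you can use the global keyword to tell Python that, yes, this is indeed a global variable. (Merely reading a global variable inside a function works without the keyword; it is assignment that creates a new local variable.)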
In [8]:
i = 10

def func():
    global i
    i = 20

func()
print(i)
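It can help to see the three cases side by side: reading a global, trying to update it without the keyword, and updating it with the keyword. This is only a sketch; counter and the three function names are made up for the illustration.

```python
counter = 10

def read_only():
    # Reading a global needs no keyword: the name is looked up globally.
    return counter + 1

def broken_update():
    # The assignment makes counter local to this function, so the read
    # on the right-hand side happens before any local value exists.
    counter = counter + 1
    return counter

def working_update():
    global counter         # assignments now target the global variable
    counter = counter + 1
    return counter

print(read_only())
try:
    broken_update()
except UnboundLocalError as err:
    print("UnboundLocalError:", err)
print(working_update())
```

The middle case is the one that tends to surprise people: Python decides that a name is local as soon as the function assigns to it anywhere in its body.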
This is a separate section for any Java/C/C++ converts in the room.
We've seen how Python creates namespaces at different levels--one for every function, one for each class, and one single global namespace--each of which holds the variables defined in it.
But what about variables defined inside blocks--constructs like for loops and if statements and try/except blocks?
Let's take a look at an example.
In [ ]:
a = 0
if a == 0:
    b = 1
In what namespace is b?
Global. It's no different from a.
How about this one:
In [ ]:
i = 42
for i in range(10):
    i = i * 2
j = i
What is j at the end?
18 (the last value of i in the range--9--times two). Seeing a pattern yet?
Let's go back to the very first example in the lecture.
In [ ]:
import numpy as np

try:
    i = np.random.randint(100)
    if i % 2 == 0:
        raise ValueError("i is even")
except ValueError:
    print(i)  # What is i?

print(i)  # What is i?
What is i in these cases? Is there a case where i does not exist?
Nope, i is in the global namespace.
The whole point is to illustrate that blocks in Python--conditionals, loops, exception handlers--all share the scope that encloses them and do NOT define new namespaces.
This is somewhat of a departure from Java, where you could define an int counter inside a loop, but it would disappear once the loop ended, so you'd have to define the counter outside the loop in order to use it afterwards.
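In Python, by contrast, both the loop counter and anything assigned in the loop body are still around afterwards. A minimal sketch (the names k and squared are just for illustration):

```python
for k in range(3):
    squared = k * k   # assigned inside the loop body

# Both the loop variable and the variable assigned in the body
# are still defined after the loop ends.
print(k, squared)
```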
To illustrate this idea of a namespace being confined to functions, classes, and the global namespace, here's a bunch of nested conditionals that ultimately define a variable:
In [9]:
a = 1
if a % 2 == 1:
    if 2 - 1 == a:
        if a * 1 == 1:
            if a / 1 == 1:
                for i in range(10):
                    for j in range(10):
                        b = i * j
print(b)
b is a global variable. So it makes sense that it's accessible anywhere, whether in the print statement or in the nested conditionals. But there's a caveat here--anyone know what it is?
What if one of the conditionals fails?
Here's the same code again, but I've simply changed the starting value of a.
In [ ]:
#a = 1
a = 0
if a % 2 == 1:
    if 2 - 1 == a:
        if a * 1 == 1:
            if a / 1 == 1:
                for i in range(10):
                    for j in range(10):
                        b = i * j
print(b)
The first condition should fail; now that a == 0, a modulo 2 gives a remainder of 0, thus terminating the conditionals at the very first one and skipping straight to the print statement. What happens?
CRASH. Since b was never assigned, Python raises a NameError.
The moral of the story here is: namespaces are great, but you still have to define your variables.
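One common way to act on that moral is to give the variable a placeholder value before the conditionals, so it exists no matter which branch runs. The sketch below is one possible approach, not the only one; using None as the placeholder is an assumption of this example.

```python
a = 0
b = None  # defined up front, so it exists whether or not the branch runs

if a % 2 == 1:
    for i in range(10):
        for j in range(10):
            b = i * j

if b is None:
    print("b was never assigned inside the conditional")
else:
    print(b)
```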
1: Are function arguments in the global or local function namespace? Are there any circumstances under which this would not be the case?
2: Give some examples of cases where global variables are helpful.
3: Give some examples where global variables can be a liability.
4: Let's say I call a function that takes 1 argument: a variable named index. Later on in that function, I write a for loop with the header for index in range(10):. I know a little about variable scoping, so I'm confident that shadowing will preserve the original value of the index function argument once the for loop finishes running. Is this thinking accurate? Why or why not?
5: Can you think of any examples where the "built-in" namespace is different from the "global" namespace?