Python datastructures notes


This contains operations on lists, dictionaries and tuples

Also covers

  • Augumented assignment trick
  • Map, filter and lambda
  • Iterators
  • Generators

List operations

List Copies are shallow


In [30]:
a = [1,2,3,4]
b = a
a[2] = 44 # b list also changes here

In [31]:
b


Out[31]:
[1, 2, 44, 4]

In [32]:
a is b # This shows a and b references are same


Out[32]:
True

Copying List techniques

There are three ways to copy a list


In [89]:
a = [1,2,3]
b = a[:] # list slicing technique
a is b


Out[89]:
False

In [34]:
b = a.copy() # using list copy method
a is b


Out[34]:
False

In [35]:
b = list(a) # using list constructor method
a is b


Out[35]:
False

Drawbacks of List copy methods


In [36]:
# List copy methods fail with nested lists
a = [[1,2],[3,4]]
# lets copy this list using any of the list copy methods
b = a.copy()
a is b


Out[36]:
False

In [37]:
# But...
a[0] is b[0] # So the references inside nested list remains same


Out[37]:
True

In [38]:
a[0].append(8) # this will change the values of b[0] as well!
print(a)
print(b)


[[1, 2, 8], [3, 4]]
[[1, 2, 8], [3, 4]]

Deep copy!


In [39]:
a = [[1,2],[4,5]]
import copy
b = copy.deepcopy(a) # Deep copy happens
a[0] is b[0]


Out[39]:
False

List repetitions


In [40]:
a = [0]*9
a


Out[40]:
[0, 0, 0, 0, 0, 0, 0, 0, 0]

In [41]:
# Beware List Repetitions are shallow!
# Example
a = [[-1,+1]]*5
a


Out[41]:
[[-1, 1], [-1, 1], [-1, 1], [-1, 1], [-1, 1]]

In [42]:
a[0].append(8)
a


Out[42]:
[[-1, 1, 8], [-1, 1, 8], [-1, 1, 8], [-1, 1, 8], [-1, 1, 8]]

List operations


In [43]:
a = [1,2,3,4,'fox',3]
i = a.index('fox')
print('index is {}'.format(i))
print('3 was repeated {} times in list a'.format(a.count(3)))


index is 4
3 was repeated 2 times in list a

In [44]:
# Membership of variable is checked using in and not in keywords
print(3 in a)
print(9 in a)
print(10 not in a)


True
False
True

Removing elements in List


In [45]:
a = [1,2,3,4,5,5,6,7,8,8]
del a[2] # Removing with del keyword

In [46]:
a


Out[46]:
[1, 2, 4, 5, 5, 6, 7, 8, 8]

In [47]:
a.remove(4)

In [48]:
a


Out[48]:
[1, 2, 5, 5, 6, 7, 8, 8]

In [49]:
a.remove(8)

In [50]:
a


Out[50]:
[1, 2, 5, 5, 6, 7, 8]

List Insertions


In [51]:
a = ['a','b','c','d']
a.insert(1,'f')
a


Out[51]:
['a', 'f', 'b', 'c', 'd']

In [81]:
statement = "I really love to code in python".split()
statement


Out[81]:
['I', 'really', 'love', 'to', 'code', 'in', 'python']

In [80]:
# Convert a list to string
' '.join(statement)


Out[80]:
'I really love to code in python'

Concatenate lists


In [54]:
m = [2,3,4]
n = [5,6,7]
m + n # add using +


Out[54]:
[2, 3, 4, 5, 6, 7]

In [55]:
m += [14,15,16]
m


Out[55]:
[2, 3, 4, 14, 15, 16]

In [56]:
m.extend(n)
m


Out[56]:
[2, 3, 4, 14, 15, 16, 5, 6, 7]

Reversing and sorting list


In [82]:
g = [4,6,2,7,8,21,9,1,10]
g.reverse()
g


Out[82]:
[10, 1, 9, 21, 8, 7, 2, 6, 4]

In [86]:
d = [2,3,5,67,1,3,91] # Sort from lowest to highest
d.sort()
d


Out[86]:
[1, 2, 3, 3, 5, 67, 91]

In [85]:
d.sort(reverse=True) # Sort from highest to lowest
d


Out[85]:
[91, 67, 5, 3, 3, 2, 1]

Remember above methods 'sort' and 'reverse' methods work directly on the original list; So we have to use sorted() and reversed() methods to ensure original list remains unmodified, these methods give a iterator to iterate on the sorted/reversed list


In [61]:
a = [1,2,3,4]
b = reversed(a)
print(list(b))
print(a)


[4, 3, 2, 1]
[1, 2, 3, 4]

In [62]:
a = [5,4,3,2,1]
list(sorted(a))


Out[62]:
[1, 2, 3, 4, 5]

In [63]:
a


Out[63]:
[5, 4, 3, 2, 1]

Shuffle a List


In [87]:
from random import shuffle
shuffle(a) # CAUTION: This will modify the original list
a


Out[87]:
[4, 1, 3, 5, 2]

Randomly pick some element from a List


In [65]:
from random import choice
choice(a) # This throws a random number from List


Out[65]:
3

Using List as a stack


In [66]:
stack = [1,2,3,4,5,6,7]
stack.append(8) # Push to a stack
stack


Out[66]:
[1, 2, 3, 4, 5, 6, 7, 8]

In [67]:
stack.pop() # Pops the last element


Out[67]:
8

In [68]:
stack


Out[68]:
[1, 2, 3, 4, 5, 6, 7]

Using Lists as Queue


In [69]:
from collections import deque
queue = deque(["Eric", "John", "Michael"])

In [71]:
queue


Out[71]:
deque(['Eric', 'John', 'Michael'])

In [72]:
queue.append('Max')

In [73]:
queue


Out[73]:
deque(['Eric', 'John', 'Michael', 'Max'])

In [74]:
queue.append("Albert")
queue


Out[74]:
deque(['Eric', 'John', 'Michael', 'Max', 'Albert'])

In [75]:
queue.reverse()

In [76]:
queue


Out[76]:
deque(['Albert', 'Max', 'Michael', 'John', 'Eric'])

In [77]:
queue.rotate(1)

In [78]:
queue


Out[78]:
deque(['Eric', 'Albert', 'Max', 'Michael', 'John'])

Dictionaries

  • TBD

A small introduction to map()

Map is a builtin function where a list of arguments can be sent to a function and it returns a iterator object!


In [ ]:
def square(x):
    return x*x

# SYNTAX: map(function, List of arguments)
list_squares = map(square, [1,2,3,4,5,6])
list(list_squares)

In [ ]:
for number in list_squares:
    print(number, end= ' ')

A small introduction to filter()


In [ ]:
def generate_odd_numbers(x):
    return x % 2 != 0

list(filter(generate_odd_numbers, range(10))) # Filter returns values which satisfy the confition,
# in simple terms - TRUE ONLY !

builtin iter() function


In [ ]:
# Lets discuss about builtin function iter()

Get an iterator from an object. In the first form, the argument must supply its own iterator, or be a sequence. we can call iter on any iterable object. Iterators give a huge performance boost for large data sets when loaded into memory. refer this link http://markmail.org/message/t2a6tp33n5lddzvy for more understanding.

  • below examples prints various kind of iterators -> string, list, tuple, dictionary and set

In [ ]:
# Lets call an iterator on List
numbers = [1,2,3,4,5]
num_iter = iter(numbers)
num_iter # returns a list iterator

In [ ]:
country = "India"
str_iter = iter(country)
str_iter

In [ ]:
tuple_numbers = (1,2,3,4,5)
t_iter = iter(tuple_numbers)
t_iter

In [ ]:
sample_dict = {'a':1,'b':2,'c':3,'d':4}
d_iter = iter(sample_dict) 
d_iter # remember iter on dictionary gives you all the keys when expanded

In [ ]:
sample_set = {1,2,3,4}
s_iter = iter(sample_set)
s_iter

How to expand an iterator object ?

any iterator object will have __iter__ and __next__ dunder method.

  • next() method should be called on iterator
  • can be used in for loop
  • can be used in while loop with an exception handled (StopIteration)

In [ ]:
next(num_iter)

In [ ]:
next(num_iter)

In [ ]:
next(num_iter)

In [ ]:
next(num_iter)

In [ ]:
next(num_iter)

In [ ]:
next(num_iter)

In [ ]:
# iterating over a dictionary with for loop
for num in t_iter:
    print(num, end = ' ')

In [ ]:
# Iterating over a dictionary using while loop
while True:
    try:
        key = next(d_iter)
        print(sample_dict[key])
    except StopIteration:
        print("Iterator ended!")
        break

iter with a senital argument example


In [ ]:
# senital is an argument in iterators, we can use this instead of StopIteration exception [ ** Better notes needed here]
# lets check this with a file I/O example
fp = open('sample.txt')
fp

In [ ]:
fp_iter = iter(fp.readline, 'STOP\n') # here the second argument is when "STOP" comes ensure iterator is out of loop
fp_iter

In [ ]:
list(fp_iter) # only readlines till STOP word is encountered

In my view iter(func, sentinal) => senital value should be used only when we are sure of what we are trying to achieve. or better to try the function on REPL first and then put this in production code. if we see above example - the same can be achieved by writing if fp.readline() == "STOP": break , but as iter gives a performance boost so can be used here - but if readability is your first choice - then dont use senital value.

AUGMENTED ASSIGNMENT TRICK !


In [ ]:
s1 = s2 = '123'
s1 is s2, s1, s2

In [ ]:
s2 = s2 + '4'
s1 is s2, s1, s2

In [ ]:
m1 = m2 = [1,2,3]
m1 is m2, m1, m2

In [ ]:
m2 = m2 + [4]
m1 is m2, m1,m2

In [ ]:
s1 = s2 = '123'
s1 is s2, s1, s2

In [ ]:
s2 += '4'
s1 is s2, s1, s2

In [ ]:
m1 = m2 = [1,2,3]
m1 is m2, m1, m2

In [ ]:
m2 += [4]
m1 is m2, m1,m2

+= and a = a + 1 are not same; they are not syntax equivalent alternatives in python. they have their own behaviours with different datatypes; specifically with strings and lists as shown above

Lets look at byte code to confirm this. We can see BINARY_ADD and INPLACE_ADD for different operations


In [ ]:
import codeop, dis

In [ ]:
dis.dis(codeop.compile_command("a = a+b"))

In [ ]:
dis.dis(codeop.compile_command("a += b"))

Lets watch the same at higher level - find out why it is different for string and list !


In [ ]:
m2 = [1,2,3]
m2

In [ ]:
m2.__iadd__([4])

In [ ]:
s2 = "1234"
s2.__iadd__('5')

A similar behaviour with tuples but more interesting !


In [ ]:
m1 = ([7],)

In [ ]:
m1[0]

In [ ]:
m1[0] += [8]

even though above code throws error - if we print m1, we can see 8 got appended! That is why we should not use augmented assignment should not be used as reference location gets changed.


In [ ]:
m1 # ERROR !

Scope !

There are two keywords which define the scope of variable in python they are *global* and *nonlocal* *global* keyword defines module level scope *nonlocal* keyword defines function/ method level scope

In [ ]:
a = 10
def method():
    # if we want to access 'a' declared outside the function, we have to use global
    a = 20
    print("Inside method 'a' is ", a)

method()
print(a)

In [ ]:
a = 10
def method():
    # if we want to access 'a' declared outside the function, we have to use global
    global a
    a = 20

method()
print(a)

In [ ]:
x = 0
def outer():
    x = 1
    def inner():
        x = 2
        print("inner:", x)

    inner()
    print("outer:", x)

outer()
print("global:", x)

In [ ]:
x = 0
def outer():
    x = 1
    def inner():
        nonlocal x
        print("inner:", x)

    inner()
    print("outer:", x)

outer()
print("global:", x)

In [ ]:
x = 0
def outer():
    x = 1
    def inner():
        global x
        x = 2
        print("inner:", x)

    inner()
    print("outer:", x)

outer()
print("global:", x)

Generators

Cons of handling iterators: iter() and next() method, keep track of internal states, raise StopIteration when there was no values to be returned etc.

This is both lengthy and counter intuitive. Generator comes into rescue in such situations.

Python generators are a simple way of creating iterators. All the overhead we mentioned above are automatically handled by generators in Python.

Simply speaking, a generator is a function that returns an object (iterator) which we can iterate over (one value at a time).

Differences between Generator function and a Normal function

Here is how a generator function differs from a normal function.

Generator function contains one or more yield statement.
When called, it returns an object (iterator) but does not start execution immediately.
Methods like __iter__() and __next__() are implemented automatically. So we can iterate through the items using next().
Once the function yields, the function is paused and the control is transferred to the caller.
Local variables and their states are remembered between successive calls.
Finally, when the function terminates, StopIteration is raised automatically on further calls.

In [ ]:
# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n

In [ ]:
# Using next()

In [ ]:
# Using for loop()

In [ ]:
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1,-1,-1):
        yield my_str[i]

In [ ]:
# Demo using For loop

Basic Intro to List comprehension


In [ ]:
S = [x**2 for x in range(10)]
V = [2**i for i in range(13)]
M = [x for x in S if x % 2 == 0]

In [ ]:
noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
primes = [x for x in range(2, 50) if x not in noprimes]

In [ ]:
words = 'The quick brown fox jumps over the lazy dog'.split()
stuff = [[w.upper(), w.lower(), len(w)] for w in words]

In [ ]:
stuff = map(lambda w: [w.upper(), w.lower(), len(w)], words)

In [ ]:
stuff

In [ ]:
my_list = [1, 3, 6, 10]

a = (x**2 for x in my_list)
# Output: 1
print(next(a))

# Output: 9
print(next(a))

# Output: 36
print(next(a))

# Output: 100
print(next(a))

# Output: StopIteration
next(a)

Decorators


In [ ]:
def first(msg):
    print(msg)    

first("Hello")

second = first
second("Hello")

In [ ]:
def inc(x):
    return x + 1

def dec(x):
    return x - 1

def operate(func, x):
    result = func(x)
    return result

In [ ]:
operate(inc, 1)

In [ ]:
operate(dec, 3)

In [ ]:
def is_called():
    def is_returned():
        print("Hello")
    
    return is_returned

new = is_called()
new()

In [ ]:
def make_pretty(func):
    def inner():
        print("I got decorated")
        func()
    return inner

def ordinary():
    print("I am ordinary")

In [ ]:
ordinary()

In [ ]:
pretty = make_pretty(ordinary)
pretty()

In [ ]:
# Syntax for decaroators which does the same thing
@make_pretty
def ordinary():
    print("I am ordinary")

In [ ]:
ordinary()

In [ ]:
# Decorating functions with parameters
def smart_divide(func):
    def inner(a,b):
        print("I am going to divide",a,"and",b)
        if b == 0:
            print("Whoops! cannot divide")
            return
        return func(a,b)
    return inner

@smart_divide
def divide(a,b):
    return a/b

In [ ]:
divide(1,0)

In [ ]:
divide(2,3)

In [ ]:
# Universal decorator
def star(func):
    def inner(*args, **kwargs):
        print("*" * 30)
        func(*args, **kwargs)
        print("*" * 30)
    return inner

def percent(func):
    def inner(*args, **kwargs):
        print("%" * 30)
        func(*args, **kwargs)
        print("%" * 30)
    return inner

@star
@percent
def printer(msg):
    print(msg)
printer("Hello")

In [ ]: