4.1 Route between Nodes. Given a directed graph, design an algorithm to find out whether there is a route between two nodes.
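
This one is not solved as its own cell below (the BFS/DFS cells later cover the traversal). A minimal sketch using breadth-first search, assuming the graph is given as a dict of adjacency lists; the names here are illustrative.


In [ ]:
from collections import deque

def route_exists(graph, start, target):
    # breadth-first search from start; stop as soon as target is reached
    if start == target:
        return True
    visited, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor == target:
                return True
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return False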

4.2 Minimal Tree. Given a sorted array with unique integer elements, write an algorithm to create a binary search tree with minimal height.

4.9 BST sequences. A binary search tree was created by traversing through an array from left to right and inserting each element. Given a binary search tree with distinct elements, print all possible arrays that could have led to this tree.
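
A minimal sketch of one approach: recursively collect the sequences for the left and right subtrees and weave them together, always keeping the root first. The node fields (.val, .left, .right) are assumed.


In [ ]:
def bst_sequences(root):
    if root is None:
        return [[]]
    sequences = []
    for left in bst_sequences(root.left):
        for right in bst_sequences(root.right):
            # the root must come before everything in either subtree
            sequences += weave(left, right, [root.val])
    return sequences

def weave(first, second, prefix):
    # all interleavings of first and second that preserve each list's own order
    if not first or not second:
        return [prefix + first + second]
    return (weave(first[1:], second, prefix + [first[0]]) +
            weave(first, second[1:], prefix + [second[0]]))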

4.10 Check subtree. t1 and t2 are two very large binary trees, with t1 much bigger than t2. Create an algorithm to determine if t2 is a subtree of t1. A tree t2 is a subtree of t1 if there exists a node n in t1 such that the subtree of n is identical to t2. That is, if you cut off the tree at node n, the two trees would be identical.
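
A minimal sketch, assuming nodes expose .val, .left, .right: search t1 for nodes whose value matches t2's root, then compare the two trees node by node (the are_identical helper later in this notebook does the same comparison).


In [ ]:
def contains_tree(t1, t2):
    if t2 is None:
        return True        # an empty tree is a subtree of any tree
    return find_subtree(t1, t2)

def find_subtree(node, t2):
    if node is None:
        return False
    if node.val == t2.val and matches(node, t2):
        return True
    return find_subtree(node.left, t2) or find_subtree(node.right, t2)

def matches(a, b):
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a.val == b.val and matches(a.left, b.left) and matches(a.right, b.right)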

4.11 Random node. You are implementing a binary tree class from scratch which, in addition to insert, find, and delete, has a method getRandomNode() which returns a random node from the tree. All nodes should be equally likely to be chosen. Design and implement an algorithm for getRandomNode, and explain how you would implement the rest of the methods.
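
A minimal sketch of one way to do it: store a subtree size in every node, pick a uniform index, and walk down to it, so getRandomNode is O(log n) in a balanced tree. The class and field names here are illustrative, and only insert is shown.


In [ ]:
import random

class RandomTreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.size = 1          # number of nodes in this subtree

    def insert(self, value):
        self.size += 1
        if value <= self.value:
            if self.left is None:
                self.left = RandomTreeNode(value)
            else:
                self.left.insert(value)
        else:
            if self.right is None:
                self.right = RandomTreeNode(value)
            else:
                self.right.insert(value)

    def get_random_node(self):
        # pick an index uniformly, then route left, stop here, or route right
        left_size = self.left.size if self.left else 0
        index = random.randint(0, self.size - 1)
        if index < left_size:
            return self.left.get_random_node()
        if index == left_size:
            return self
        return self.right.get_random_node()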

4.12 Paths with sum. You are given a binary tree in which each node contains an integer value (which might be positive or negative). Design an algorithm to count the number of paths that sum to a given value. The path does not need to start or end at the root or a leaf, but it must go downwards (traveling only from parent nodes to child nodes).
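
A minimal sketch using running sums: a downward path ending at the current node sums to the target exactly when some earlier prefix on the root-to-node path equals the running sum minus the target. Nodes are assumed to expose .val, .left, .right.


In [ ]:
from collections import defaultdict

def count_paths_with_sum(root, target):
    counts = defaultdict(int)
    counts[0] = 1                          # the empty prefix

    def walk(node, running):
        if node is None:
            return 0
        running += node.val
        total = counts[running - target]   # downward paths ending at this node
        counts[running] += 1
        total += walk(node.left, running)
        total += walk(node.right, running)
        counts[running] -= 1               # remove this prefix when backtracking
        return total

    return walk(root, 0)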


In [15]:
'''
Depth-First Search

Here is a depth-first search over a directed graph stored as an
adjacency list (a dict mapping each vertex to the list of its
neighbors). For a disconnected graph, repeat the search from every
unvisited vertex.
'''
from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)

    def addEdge(self, u, v):
        self.graph[u].append(v)


    def dfsUtil(self, v, visited):
        # visit v, then recurse into each unvisited neighbor
        visited[v] = True
        print(v)
        for neighbor in self.graph[v]:
            if not visited[neighbor]:
                self.dfsUtil(neighbor, visited)

    def dfs(self, v):
        visited = [False]*(len(self.graph))
        print(self.graph)
        self.dfsUtil(v, visited)


g = Graph()
g.addEdge(0, 1)
g.addEdge(0, 2)
g.addEdge(1, 2)
g.addEdge(2, 0)
g.addEdge(2, 3)
g.addEdge(3, 3)
g.dfs(2)


defaultdict(<class 'list'>, {0: [1, 2], 1: [2], 2: [0, 3], 3: [3]})
2
0
1
3

Build Binary Search Tree from Sorted Array


In [3]:
class Node:
    def __init__(self, value):
        self.right = None
        self.left = None
        self.val = value
        
    def __str__(self):
        return '('+str(self.left)+':L ' + "V:" + str(self.val) + " R:" + str(self.right)+')'

def buildBinaryTree(nodes):
    if len(nodes) == 0:
        raise ValueError('list is empty')
    return binaryTree(nodes, 0, len(nodes)-1)
        
def binaryTree(nodes, start, end):
    if start > end:
        return ''
    middle = (start + end) // 2
    root = Node(nodes[middle])
    root.left = binaryTree(nodes, start, middle -1)
    root.right = binaryTree(nodes, middle+1, end)
    return root
        
    



test1 = [1, 2, 3, 4, 5, 6, 7, 8]
test2 = [-1, 0, 9, 10]
#test3 = []
test4 = [0, 1, 2, 3, 3, 3, 5]
print(buildBinaryTree(test1))


(((:L V:1 R:):L V:2 R:(:L V:3 R:)):L V:4 R:((:L V:5 R:):L V:6 R:(:L V:7 R:(:L V:8 R:))))

4.4 Check balanced.

Implement a function to check if a binary tree is balanced. For the purpose of this question, a balanced tree is defined to be a tree such that the heights of the two subtrees of any node never differ by more than one.


In [ ]:
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def checkBalanced(root):
    # returns the height of the subtree, or -1 as soon as an imbalance is found,
    # so the tree is balanced exactly when checkBalanced(root) != -1
    if root is None:
        return 0

    left = checkBalanced(root.left)
    right = checkBalanced(root.right)

    if left == -1 or right == -1 or abs(left - right) > 1:
        return -1

    return max(left, right) + 1

4.5 Validate BST.

Implement a function to check if a binary tree is a binary search tree.

Solution 1: In-order traversal; cannot handle duplicate values

Solution 2: Pass down the min/max bounds that each node must fall between
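
Solution 2 is implemented in the next cell. Here is a minimal sketch of Solution 1: an in-order traversal of a BST visits values in strictly increasing order, so any out-of-order (or duplicate) value means the tree is not a BST. The .v field matches the Node class below.


In [ ]:
def checkBSTInorder(root):
    prev = None

    def inorder(node):
        nonlocal prev
        if node is None:
            return True
        if not inorder(node.left):
            return False
        if prev is not None and node.v <= prev:
            return False      # out of order (duplicates also land here)
        prev = node.v
        return inorder(node.right)

    return inorder(root)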


In [ ]:
import math

class Node:
    def __init__(self, value):
        self.v = value
        self.right = None
        self.left= None

def checkBST(node):
    return (checkBSThelper(node, -math.inf, math.inf))

def checkBSThelper(node, mini, maxi):
    if node is None:
        return True
    if node.v < mini or node.v >= maxi:
        return False
    return checkBSThelper(node.left, mini, node.v) and checkBSThelper(node.right, node.v, maxi)

4.6 Successor.

Write an algorithm to find the 'next' node of a given node in a binary search tree. You may assume that each node has a link to its parent.

The inorder successor can be defined as the node with the smallest key greater than the key of input node.

  1. If the right subtree of the node is not null, the successor lies in the right subtree: get the minimum value in the right subtree.

  2. If the right subtree is null, the successor is one of the ancestors. Travel up the parent pointers until you reach a node that is the left child of its parent; that parent is the successor.

This works if all nodes have a parent pointer.

If there is no parent pointer:

  1. If the right subtree of the node is not null, the successor lies in the right subtree: get the minimum value in the right subtree.
  2. If the right subtree is null, start from the root and search downwards: if the target node's data is greater than or equal to the current node's data, go right; otherwise record the current node as a candidate successor and go left (see the sketch after the code below).

In [12]:
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.parent = None
    
def inOrderSucc(root, n):
    if n.right is not None:
        #get minimum value in right subtree
        return minValue(n.right)
    
    p = n.parent
    # walk up until n is the left child of its parent; that parent is the successor
    while(p is not None):
        if n != p.right:
            break
        n = p
        p = p.parent
    return p 

def minValue(node):
    current = node
    
    while(current is not None):
        if current.left is None:
            break
        current = current.left
    return current
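
The code above handles the parent-pointer case. A minimal sketch of the no-parent-pointer case described earlier: walk down from the root, remembering the last node where we turned left. It reuses the Node class and minValue from the cell above.


In [ ]:
def inOrderSuccNoParent(root, n):
    if n.right is not None:
        return minValue(n.right)

    succ = None
    current = root
    while current is not None:
        if n.data < current.data:
            succ = current            # candidate successor; go left
            current = current.left
        elif n.data > current.data:
            current = current.right
        else:
            break
    return succ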

4.7 Build order.

You are given a list of projects and a list of dependencies. All of a project's dependencies must be built before the project is. Find a build order that will allow the projects to be built. If there is no valid build order, return an error.

This is a topological sort problem. It can also be approached as a depth-first search, but the most important part is handling cycles: with a DFS approach we keep a 'visiting' label, so that if we reach a node still marked 'visiting' we know we have found a cycle.


In [33]:
from collections import defaultdict

class Graph:
    def __init__(self, vertices):
        self.graph = defaultdict(list)
        self.v = vertices
        
    def addEdge(self, u, v):
        self.graph[u].append(v)
        
    def topologicalSortUtil(self, v, visited, visiting, stack):
        visiting[v] = True

        for i in self.graph[v]:
            if visiting[i]:
                # back edge: i is still on the current DFS path, so there is a cycle
                raise ValueError('cycle: %s' % i)
            if not visited[i]:
                self.topologicalSortUtil(i, visited, visiting, stack)

        visiting[v] = False
        visited[v] = True
        stack.insert(0, v)

    def topologicalSort(self):
        visited = [False]*self.v
        visiting = [False]*self.v
        stack = []
        print(self.graph)

        for i in range(self.v):
            if not visited[i]:
                self.topologicalSortUtil(i, visited, visiting, stack)

        print(stack)
    
g= Graph(6)
g.addEdge(5, 2)
g.addEdge(5, 0)

g.addEdge(4, 3)
g.addEdge(4, 1)


g.topologicalSort()


defaultdict(<class 'list'>, {5: [2, 0], 4: [3, 1]})
[5, 4, 3, 2, 1, 0]

4.8 First common ancestor.

Design an algorithm and write code to find the first common ancestor of two nodes in a binary tree. Avoid storing additional nodes in a data structure.

There are several variations on this problem. In a binary search tree we can begin at the root: the LCA, or lowest common ancestor, is the first node whose value falls between the values of the two nodes, so at each step the values tell us whether to go left, go right, or stop.

In a plain binary tree we do not have any information about the ordering, so with parent links we can follow each node up to the root and see where the two paths first share a node; without parent links we can recurse from the root instead (see the sketch below).
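
A minimal sketch of the binary-tree case without parent links or extra storage: recurse from the root, and the first node with one target on each side is the first common ancestor. Nodes are assumed to expose .left and .right, and both targets are assumed to be present in the tree.


In [ ]:
def firstCommonAncestor(root, p, q):
    if root is None or root is p or root is q:
        return root
    left = firstCommonAncestor(root.left, p, q)
    right = firstCommonAncestor(root.right, p, q)
    if left and right:
        return root          # p and q sit on different sides of root
    return left or right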

BFS and DFS

One thing that matters in this implementation of breadth-first search is the use of a set to check whether a vertex has already been visited. To store user-defined objects in a set hashed by value, implement __hash__ and __eq__.

A way to remember the difference between DFS and BFS: DFS stores the frontier of nodes in a stack and then traverses, while BFS stores the frontier in a queue and then traverses; otherwise the loop is the same.


In [4]:
from collections import deque

class Node:
    def __init__(self, val):
        self.val = val
        self.edges = []

    def __eq__(self, other):
        return self.val == other.val

    def __hash__(self):
        return self.val


class Graph:
    def __init__(self, nodes=None):
        # avoid a mutable default argument shared across Graph instances
        self.nodes = nodes if nodes is not None else []

    def add_node(self, val):
        new_node = Node(val)
        self.nodes.append(new_node)

    def add_edge(self, node1, node2):
        node1.edges.append(node2)
        node2.edges.append(node1)

    def bfs(self):
        if not self.nodes:
            return []
        start = self.nodes[0]
        visited, queue, result = set([start]), deque([start]), []
        while queue:
            node = queue.popleft()
            result.append(node)
            for nd in node.edges:
                if nd not in visited:
                    queue.append(nd)
                    visited.add(nd)
        return result

    def dfs(self):
        if not self.nodes:
            return []
        start = self.nodes[0]
        visited, stack, result = set([start]), [start], []
        while stack:
            node = stack.pop()
            result.append(node)
            for nd in node.edges:
                if nd not in visited:
                    stack.append(nd)
                    visited.add(nd)
        return result


graph = Graph()
graph.add_node(5)
graph.add_node(3)
graph.add_node(8)
graph.add_node(1)
graph.add_node(9)
graph.add_node(2)
graph.add_node(10)

#            2
#           /
# 5 - 3 - 8 -  9 - 10
#  \    /
#     1

graph.add_edge(graph.nodes[0], graph.nodes[1])
graph.add_edge(graph.nodes[0], graph.nodes[3])
graph.add_edge(graph.nodes[1], graph.nodes[2])
graph.add_edge(graph.nodes[0], graph.nodes[1])
graph.add_edge(graph.nodes[2], graph.nodes[3])
graph.add_edge(graph.nodes[2], graph.nodes[5])
graph.add_edge(graph.nodes[2], graph.nodes[4])
graph.add_edge(graph.nodes[4], graph.nodes[6])


dfs_result = graph.dfs()
bfs_result = graph.bfs()
print("DFS")
for i in range(len(dfs_result)):
    print(dfs_result[i].val)
print("BFS")
for i in range(len(bfs_result)):
    print(bfs_result[i].val)


DFS
5
1
8
9
10
2
3
BFS
5
3
1
8
2
9
10

Invert a Binary Tree

Invert a binary tree by swapping every node's left and right children. If the original tree was a binary search tree, an in-order traversal of the inverted tree visits the values in descending order.


In [ ]:
def invert_tree(self, root):
    if root is not None:
        temp = root.left
        root.left = self.invert_tree(root.right)
        root.right = self.invert_tree(temp)

        # or, using multiple assignment (no temp needed):
        # root.left, root.right = self.invert_tree(root.right), \
        #                         self.invert_tree(root.left)

    return root

In [ ]:
'''
Complete: A binary tree in which every level of the tree is fully
filled, except for perhaps the last level, which is filled from left
to right.
'''
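
A minimal sketch of a completeness check, assuming nodes expose .left and .right: in a level-order walk, once a missing child appears, no further nodes may follow it.


In [ ]:
from collections import deque

def is_complete(root):
    if root is None:
        return True
    q = deque([root])
    seen_gap = False
    while q:
        node = q.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False       # a real node appeared after a gap
        else:
            q.append(node.left)
            q.append(node.right)
    return True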

In [ ]:
'''
Full: A binary tree in which every node has either zero or two 
children. No nodes have only one child.
'''
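
A minimal sketch of a fullness check under the same node assumptions: a node fails only if it has exactly one child.


In [ ]:
def is_full(root):
    if root is None:
        return True
    if (root.left is None) != (root.right is None):
        return False           # exactly one child
    return is_full(root.left) and is_full(root.right)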

In [ ]:
'''
Check if two binary trees are identical
Runtime complexity: O(n)
Memory: O(h) # height of the tree
'''

from collections import deque

def level_order_traversal(root):
  if root == None:
    return
  q = deque()
  q.append(root)

  while q:
    temp = q.popleft()
    print(str(temp.data) + ",")
    if temp.left != None:
      q.append(temp.left)
    if temp.right != None:
      q.append(temp.right)


def are_identical(root1, root2):
  if root1 == None and root2 == None:
    return True
  
  if root1 != None and root2 != None:
    return (root1.data == root2.data and
              are_identical(root1.left, root2.left) and
              are_identical(root1.right, root2.right))
  
  return False

In [ ]:
'''
Clone Directed Graph
Runtime: O(n)
Memory: O(n)
'''
from random import shuffle

class Node:
  def __init__(self, d):
    self.data = d
    self.neighbors = []

def clone_rec(root, nodes_completed):
  if root == None:
    return None

  pNew = Node(root.data)
  nodes_completed[root] = pNew

  for p in root.neighbors:
    x = nodes_completed.get(p)
    if x == None:
      pNew.neighbors += [clone_rec(p, nodes_completed)]
    else:
      pNew.neighbors += [x]
  return pNew

def clone(root):
  nodes_completed = {}
  return clone_rec(root, nodes_completed)


# this builds an undirected test graph i.e.
# if there is an edge from x to y
# then there must also be an edge from y to x
# and there is no edge from a node to itself
# hence there can be a maximum of (nodes * nodes - nodes) / 2 edges in this graph
def create_test_graph_undirected(nodes_count, edges_count):
  vertices = []
  for i in range(0, nodes_count):
    vertices += [Node(i)]

  all_edges = []
  for i in range(0, nodes_count):
    for j in range(i + 1, nodes_count):
      all_edges.append([i, j])

  shuffle(all_edges)

  for i in range(0, min(edges_count, len(all_edges))):
    edge = all_edges[i]
    vertices[edge[0]].neighbors += [vertices[edge[1]]]
    vertices[edge[1]].neighbors += [vertices[edge[0]]]

  return vertices
  

def print_graph_list(vertices):
  # prints every vertex from a list of vertices
  for n in vertices:
    print(str(n.data) + ": {", end=" ")
    for t in n.neighbors:
      print(str(t.data) + " ", end="")
    print("}")

def print_graph_rec(root, visited_nodes):
  if root is None or root in visited_nodes:
    return

  visited_nodes.add(root)

  print(str(root.data) + ": {", end=" ")
  for n in root.neighbors:
    print(str(n.data) + " ", end="")
  print("}")

  for n in root.neighbors:
    print_graph_rec(n, visited_nodes)

def print_graph(root):
  visited_nodes = set()
  print_graph_rec(root, visited_nodes)

def main():
  vertices = create_test_graph_undirected(7, 18)
  print_graph(vertices[0])
  cp = clone(vertices[0])
  print()
  print("After copy.")
  print_graph(cp)


main()

Dijkstra's Algorithm

Compute the shortest path from a source to every other node in a weighted graph; the result is called the shortest-path tree. A shortest-path tree is not the same as the MST.

Dijkstra vs Prim: Dijkstra greedily adds the edge that minimizes the total distance from the source to the new node, while Prim considers only the weight of the single edge from the current tree to the new node. What is 'minimal' for the two algorithms is different.

The result is two different trees: Dijkstra may construct a tree with a higher total weight than the MST, but the path from the root to every vertex has minimal weight.

Kruskal also solves the minimum spanning tree problem, but it chooses edges globally by weight, so the chosen edges may not form a tree along the way; it merely avoids cycles, and thus the partial solutions may not be connected.

Time Complexity: O((|E| + |V|) log |V|)

Space Complexity: O(|V|), since up to |V| vertices have to be stored.


In [11]:
import collections
import math
 
 
class Graph:
  def __init__(self):
      self.vertices = set()
 
      # makes the default value for all vertices an empty list
      self.edges = collections.defaultdict(list)
      self.weights = {}
 
  def add_vertex(self, value):
    self.vertices.add(value)
 
  def add_edge(self, from_vertex, to_vertex, distance):
    if from_vertex == to_vertex: return  # no self-loops allowed
    self.edges[from_vertex].append(to_vertex)
    self.weights[(from_vertex, to_vertex)] = distance
 
  def __str__(self):
    string = "Vertices: " + str(self.vertices) + "\n"
    string += "Edges: " + str(self.edges) + "\n"
    string += "Weights: " + str(self.weights)
    return string
 
def dijkstra(graph, start):
  # initializations
  S = set()
 
  # delta[v] is the length of the shortest known path from start to v.
  # We initialize every vertex to infinity (math.inf requires Python 3).
  delta = dict.fromkeys(list(graph.vertices), math.inf)
  previous = dict.fromkeys(list(graph.vertices), None)
 
  # then we set the path length of the start vertex to 0
  delta[start] = 0
 
  # while there exists a vertex v not in S
  while S != graph.vertices:
    # let v be the closest vertex that has not been visited...it will begin at 'start'
    v = min((set(delta.keys()) - S), key=delta.get)
 
    # for each neighbor of v not in S
    for neighbor in set(graph.edges[v]) - S:
      new_path = delta[v] + graph.weights[v,neighbor]
 
      # is the new path through v shorter than the best path to neighbor so far?
      if new_path < delta[neighbor]:
        # since it's optimal, update the shortest path for neighbor
        delta[neighbor] = new_path
 
        # set the previous vertex of neighbor to v
        previous[neighbor] = v
    S.add(v)

  return (delta, previous)
 
 
 
def shortest_path(graph, start, end):
  delta, previous = dijkstra(graph, start)
  
  path = []
  vertex = end
 
  while vertex is not None:
    path.append(vertex)
    vertex = previous[vertex]
 
  path.reverse()
  return path

G = Graph()
G.add_vertex('a')
G.add_vertex('b')
G.add_vertex('c')
G.add_vertex('d')
G.add_vertex('e')
 
G.add_edge('a', 'b', 2)
G.add_edge('a', 'c', 8)
G.add_edge('a', 'd', 5)
G.add_edge('b', 'c', 1)
G.add_edge('c', 'e', 3)
G.add_edge('d', 'e', 4)
 
print(G) 
 
print(dijkstra(G, 'a'))
print(shortest_path(G, 'a', 'e'))


Vertices: {'d', 'a', 'b', 'c', 'e'}
Edges: defaultdict(<class 'list'>, {'d': ['e'], 'a': ['b', 'c', 'd'], 'b': ['c'], 'c': ['e']})
Weights: {('a', 'd'): 5, ('d', 'e'): 4, ('a', 'c'): 8, ('a', 'b'): 2, ('b', 'c'): 1, ('c', 'e'): 3}
({'d': 5, 'a': 0, 'b': 2, 'c': 3, 'e': 6}, {'d': 'a', 'a': None, 'b': 'a', 'c': 'b', 'e': 'c'})
['a', 'b', 'c', 'e']

Prim's Algorithm

Prim's algorithm works on undirected graphs only, whereas Dijkstra's algorithm works for both directed and undirected graphs.

There might be multiple minimum spanning trees for a graph and not all edges are necessarily used.

  1. Initialize all values with infinite and begin at the root with a value of zero
  2. Look at all the edges that connect nodes already in our tree to vertices we haven't yet visited, and choose the lowest-weight edge.
  3. In the case of a tie, pick one at random

In [23]:
import math

class Graph:
    def __init__(self, vertices):
        self.v = vertices
        self.graph = [[0 for column in range(vertices)]
                        for row in range(vertices)]

    def printMST(self, parent):
        for i in range(1, self.v):
            print(parent[i], "-", i, "\t", self.graph[i][parent[i]])

    def minKey(self, key, mstSet):
        # pick the cheapest vertex that is not yet in the MST
        minimum = math.inf
        min_index = -1

        for v in range(self.v):
            if key[v] < minimum and mstSet[v] == False:
                minimum = key[v]
                min_index = v

        return min_index

    def primMST(self):
        # key[v] is the cheapest known edge connecting v to the tree (the frontier)
        key = [math.inf] * self.v
        parent = [None] * self.v
        key[0] = 0
        mstSet = [False] * self.v

        parent[0] = -1

        for _ in range(self.v):
            u = self.minKey(key, mstSet)

            mstSet[u] = True

            # relax the edges from u to every vertex not yet in the tree
            for v in range(self.v):
                if self.graph[u][v] > 0 and mstSet[v] == False and key[v] > self.graph[u][v]:
                    key[v] = self.graph[u][v]
                    parent[v] = u
        self.printMST(parent)


g = Graph(5)
g.graph = [[0, 2, 0, 6, 0],
       [2, 0, 3, 8, 5],
       [0, 3, 0, 0, 7],
       [6, 8, 0, 0, 9],
       [0, 5, 7, 9, 0],
       ]

g.primMST()


0 - 1 	 2
1 - 2 	 3
0 - 3 	 6
1 - 4 	 5

Bellman-Ford

The Bellman-Ford algorithm finds the shortest paths from a single source in a weighted directed graph, and unlike Dijkstra it also handles negative edge weights (Bellman is also the creator of dynamic programming).

A way to think of Bellman-Ford is that it relaxes every edge |V| - 1 times. The first round is about discovering some path to each reachable node, and the subsequent rounds are about finding short-cuts between the nodes; after round i, every shortest path that uses at most i edges is already correct.

Here's another bellman-ford: https://gist.github.com/joninvski/701720


In [ ]:
class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = []

    def addEdge(self, u, v, w):
        self.graph.append([u, v, w])

    # utility function used to print the solution
    def printArr(self, dist):
        print("Vertex   Distance from Source")
        for i in range(self.V):
            print("%d \t\t %d" % (i, dist[i]))

    # The main function that finds shortest distances from src to
    # all other vertices using the Bellman-Ford algorithm. The function
    # also detects a negative weight cycle.
    def BellmanFord(self, src):

        # Step 1: Initialize distances from src to all other vertices
        # as INFINITE
        dist = [float("Inf")] * self.V
        dist[src] = 0

        # Step 2: Relax all edges |V| - 1 times. A simple shortest
        # path from src to any other vertex can have at most |V| - 1
        # edges
        for i in range(self.V - 1):
            # Update the dist value of the endpoint of each edge.
            # Consider only those vertices whose distance is already
            # known to be finite
            for u, v, w in self.graph:
                if dist[u] != float("Inf") and dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w

        # Step 3: check for negative-weight cycles. The above step
        # guarantees shortest distances if the graph doesn't contain
        # a negative weight cycle. If we still get a shorter path, there
        # is a cycle.
        for u, v, w in self.graph:
            if dist[u] != float("Inf") and dist[u] + w < dist[v]:
                print("Graph contains negative weight cycle")
                return

        # print all distances
        self.printArr(dist)

g = Graph(5)
g.addEdge(0, 1, -1)
g.addEdge(0, 2, 4)
g.addEdge(1, 2, 3)
g.addEdge(1, 3, 2)
g.addEdge(1, 4, 2)
g.addEdge(3, 2, 5)
g.addEdge(3, 1, 1)
g.addEdge(4, 3, -3)

# Print the solution
g.BellmanFord(0)