In [2]:
%matplotlib inline
In [3]:
%load_ext autoreload
In [4]:
%autoreload 2
import sys
import scipy
sys.path.append('../../../')
import lxmls
The CKY algorithm is a dynamic programming algorithm, much like Viterbi. The general idea is to build on smaller, simpler items computed in previous iterations.
Viterbi builds sequences (of labels) by computing the highest-scoring sequence of length $n$ from sequences of length $(n-1)$. CKY extends this idea to trees, which can be thought of as higher-dimensional sequences. In particular, trees spanning $n$ words are built from trees spanning up to $(n-1)$ words.
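To make the span-based recursion concrete, here is a minimal sketch of CKY recognition for a toy grammar in Chomsky normal form. The grammar, lexicon, and sentence below are made up for illustration and are not part of the lab data; `chart[i][j]` holds the non-terminals that can span words $i$ to $j-1$.
In [ ]:
# Hypothetical toy CNF grammar: (left child, right child) -> parent.
toy_grammar = {
    ("NP", "VP"): "S",
    ("V", "NP"): "VP",
    ("Det", "N"): "NP",
}
toy_lexicon = {"the": "Det", "dog": "N", "cat": "N", "saw": "V"}

def cky_recognize(words, grammar, lexicon):
    n = len(words)
    # chart[i][j] holds the set of non-terminals spanning words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Length-1 spans come straight from the lexicon.
    for i, w in enumerate(words):
        chart[i][i + 1].add(lexicon[w])
    # Build larger spans from pairs of smaller, already-computed spans.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point
                for b in chart[i][k]:
                    for c in chart[k][j]:
                        if (b, c) in grammar:
                            chart[i][j].add(grammar[(b, c)])
    return "S" in chart[0][n]

print(cky_recognize("the dog saw the cat".split(), toy_grammar, toy_lexicon))  # True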
In the next section, we're going to do some CKY iterations on a real sequence, to get a more concrete sense of how the algorithm works.
Exercise 4.1 In this simple exercise, you will see the CKY algorithm in action. There is a JavaScript applet that illustrates how CKY works (in its non-probabilistic form). Go to http://lxmls.it.pt/2015/cky.html and carefully observe the steps taken by the algorithm.
Exercise 4.2 This exercise will show you that real-world sentences can have complicated syntactic structures. There is a parse tree visualizer at http://brenocon.com/parseviz/. Go to your local data/treebanks folder and open the file PTB excerpt.txt. Copy a few trees from the file, one at a time, and examine their parse trees in the visualizer.
Exercise 4.3 In this exercise you are going to experiment with arc-factored non-projective dependency parsers. The CoNLL-X and CoNLL 2008 shared task datasets (Buchholz and Marsi, 2006; Surdeanu et al., 2008) contain dependency treebanks for 14 languages. In this lab, we are going to experiment with the Portuguese and English datasets.
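Before running the exercises, recall what arc-factored means: the score of a dependency tree is simply the sum of the scores of its arcs. Below is a minimal sketch with a made-up score matrix (the numbers are illustrative only); the convention, matching the decoder later in this lab, is that entry $[h, m]$ scores the arc from head $h$ to modifier $m$, with index 0 standing for the artificial root.
In [ ]:
import numpy as np

# Hypothetical arc scores for a 3-word sentence; entry [h, m] scores the arc h -> m.
# Index 0 stands for the artificial root node.
scores = np.array([
    [0.0, 2.0, 1.0, 0.5],
    [0.0, 0.0, 3.0, 0.2],
    [0.0, 1.0, 0.0, 2.5],
    [0.0, 0.5, 0.1, 0.0],
])
# A candidate tree, given as heads[m] = head of word m (heads[0] is unused).
heads = [-1, 0, 1, 2]
tree_score = sum(scores[heads[m], m] for m in range(1, len(heads)))
print(tree_score)  # 2.0 + 3.0 + 2.5 = 7.5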
In [7]:
# Exercises for lab day 4: Parsing.
import sys
sys.path.append('.')
import lxmls.parsing.dependency_parser as depp

print("Exercise 4.3.1")
dp = depp.DependencyParser()
dp.read_data("portuguese")

print("Exercise 4.3.2")
dp.train_perceptron(10)
dp.test()

print("Exercise 4.3.3")
# Add feature templates one at a time, retraining after each addition.
dp.features.use_lexical = True
dp.read_data("portuguese")
dp.train_perceptron(10)
dp.test()

dp.features.use_distance = True
dp.read_data("portuguese")
dp.train_perceptron(10)
dp.test()

dp.features.use_contextual = True
dp.read_data("portuguese")
dp.train_perceptron(10)
dp.test()

print("Exercise 4.3.4")
dp.train_crf_sgd(10, 0.01, 0.1)
dp.test()

print("Exercise 4.3.5")
dp.read_data("english")
dp.train_perceptron(10)
dp.test()

print("Exercise 4.3.6")
dp = depp.DependencyParser()
dp.features.use_lexical = True
dp.features.use_distance = True
dp.features.use_contextual = True
dp.read_data("english")
dp.projective = True
dp.train_perceptron(10)
dp.test()
Implement Eisner's algorithm for projective dependency parsing. The pseudo-code is shown as Algorithm 13. Implement it as the function below:
In [5]:
import numpy as np

def parse_proj(self, scores):
    """
    Parse using Eisner's algorithm.

    scores is an (N+1) x (N+1) matrix where scores[h, m] is the score
    of the arc from head h to modifier m (index 0 is the root).
    Returns heads, with heads[m] = head of word m.
    """
    ############################
    # Solution to Exercise 4.3.6
    ############################
    nr, nc = np.shape(scores)
    if nr != nc:
        raise ValueError("scores must be a square matrix with nw+1 rows")

    N = nr - 1  # Number of words (excluding root).

    # Initialize the CKY-style tables. The third index is the direction
    # of the span's head (left=0, right=1).
    complete = np.zeros([N+1, N+1, 2])    # s, t, direction (right=1).
    incomplete = np.zeros([N+1, N+1, 2])  # s, t, direction (right=1).
    complete_backtrack = -np.ones([N+1, N+1, 2], dtype=int)
    incomplete_backtrack = -np.ones([N+1, N+1, 2], dtype=int)

    # The root can never be a modifier, so left-headed incomplete spans
    # starting at the root are impossible.
    incomplete[0, :, 0] -= np.inf

    # Loop from smaller spans to larger spans.
    for k in range(1, N+1):
        for s in range(N-k+1):
            t = s + k

            # First, create incomplete spans (an arc between s and t).
            # Left tree (arc t -> s).
            incomplete_vals0 = complete[s, s:t, 1] + complete[(s+1):(t+1), t, 0] + scores[t, s]
            incomplete[s, t, 0] = np.max(incomplete_vals0)
            incomplete_backtrack[s, t, 0] = s + np.argmax(incomplete_vals0)
            # Right tree (arc s -> t).
            incomplete_vals1 = complete[s, s:t, 1] + complete[(s+1):(t+1), t, 0] + scores[s, t]
            incomplete[s, t, 1] = np.max(incomplete_vals1)
            incomplete_backtrack[s, t, 1] = s + np.argmax(incomplete_vals1)

            # Second, create complete spans.
            # Left tree (head at t).
            complete_vals0 = complete[s, s:t, 0] + incomplete[s:t, t, 0]
            complete[s, t, 0] = np.max(complete_vals0)
            complete_backtrack[s, t, 0] = s + np.argmax(complete_vals0)
            # Right tree (head at s).
            complete_vals1 = incomplete[s, (s+1):(t+1), 1] + complete[(s+1):(t+1), t, 1]
            complete[s, t, 1] = np.max(complete_vals1)
            complete_backtrack[s, t, 1] = s + 1 + np.argmax(complete_vals1)

    value = complete[0][N][1]
    heads = -np.ones(N + 1, dtype=int)
    self.backtrack_eisner(incomplete_backtrack, complete_backtrack, 0, N, 1, 1, heads)

    # Sanity check: the score of the recovered tree should match `value`.
    value_proj = 0.0
    for m in range(1, N+1):
        h = heads[m]
        value_proj += scores[h, m]

    return heads
    ###################################
    # End of solution to Exercise 4.3.6
    ###################################