**(C) 2016 by Damir Cavar <dcavar@iu.edu>**

**Version:** 1.0, September 2016

This is a tutorial about developing simple Part-of-Speech taggers using Python 3.x and the NLTK.

*iPhone NN*).

There are various smoothing techniques:

- Additive smoothing

*the cat* as the conditional probability $P(cat|the)$, for example, is the probability of the n-gram $P(the\ cat)$ divided by the probability of the token *the*, i.e. $P(the)$.

$$P(w_n\ |\ w_1\ \dots{}\ w_{n-1}) = \frac{P(w_1\ \dots{}\ w_n)}{P(w_1\ \dots{}\ w_{n-1})}$$

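$$P(w_n\ |\ w_1\ \dots{}\ w_{n-1}) = \frac{P(w_1\ \dots{}\ w_n)}{P(w_1\ \dots{}\ w_{n-1})} = \frac{\frac{C(w_1\dots{}w_n)}{C(t)-(N-1)}}{\frac{C(w_1\dots{}w_{n-1})}{C(t)-(N-2)}}=\frac{C(w_1\dots{}w_n)}{C(t)-(N-1)} \frac{C(t)-(N-2)}{C(w_1\dots{}w_{n-1})}\approx\frac{C(w_1\dots{}w_n)}{C(w_1\dots{}w_{n-1})}$$

The last step is an approximation: $C(t)-(N-2)$ and $C(t)-(N-1)$ differ by only one, so for any text of realistic length their ratio is very close to 1 and can be dropped.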

*The black cat is chasing the white mouse with the black tail.*
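As a minimal sketch, the count-based estimate of a conditional bigram probability can be computed for the example sentence above with plain Python. The tokenization (lowercasing, stripping the final period, splitting on whitespace) and the helper name `conditional_probability` are assumptions made for this illustration; they are not part of the NLTK.

```python
from collections import Counter

text = "The black cat is chasing the white mouse with the black tail."

# Assumed, deliberately simple tokenization: lowercase, drop the final
# period, split on whitespace.
tokens = text.lower().rstrip(".").split()

# Frequency profiles for single tokens and for adjacent token pairs.
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))


def conditional_probability(word, previous):
    """Estimate P(word | previous) as C(previous word) / C(previous)."""
    if unigrams[previous] == 0:
        return 0.0  # the context token never occurs in the text
    return bigrams[(previous, word)] / unigrams[previous]


print(conditional_probability("black", "the"))  # 2 of the 3 "the" precede "black"
print(conditional_probability("cat", "black"))  # 1 of the 2 "black" precede "cat"
```

Note that any bigram that does not occur in the text, for example *the cat*, receives a probability of 0 under this estimate.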

$$P=\frac{C+1}{N+V}$$
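The following sketch applies this formula to token counts. It assumes that $C$ is the count of a token, $N$ the total number of tokens in the text, and $V$ the number of distinct token types (the vocabulary size); the function name `add_one_probability` is hypothetical and chosen only for this example.

```python
from collections import Counter

tokens = "the black cat is chasing the white mouse with the black tail".split()

counts = Counter(tokens)
N = len(tokens)  # total number of tokens (assumed meaning of N)
V = len(counts)  # number of distinct token types (assumed meaning of V)


def add_one_probability(token):
    """Add-one smoothed estimate P = (C + 1) / (N + V)."""
    return (counts[token] + 1) / (N + V)


print(add_one_probability("the"))  # seen token: (3 + 1) / (12 + 9)
print(add_one_probability("dog"))  # unseen token: (0 + 1) / (12 + 9)
```

With this choice of $V$, the smoothed probabilities of the observed types sum to 1, while an unseen token still gets a small non-zero value. To leave proper probability mass for unseen tokens, $V$ would have to count them as part of the vocabulary as well.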
