The input word is represented as a one-hot vector $\mathbf{c} \in \mathbb{R}^{C}$, where $C$ is the vocabulary size.
Multiplying by the embedding matrix gives the embedding of the input word, $\mathbf{U}^T \mathbf{c}$. Applying the activation function $f$ then gives the hidden layer: $\mathbf{h}_2 = f(\mathbf{U}^T \mathbf{c})$.
A second matrix $\mathbf{V}$ projects $\mathbf{h}_2$ to a vector $\mathbf{h}_3$ containing one score per vocabulary word: $\mathbf{h}_3 = \mathbf{V}\mathbf{h}_2$.
The vector of scores is then converted to a vector of probabilities using the softmax function.
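Written out, the softmax normalizes the scores by exponentiating each one and dividing by the sum over the vocabulary (here $h_{3,i}$ denotes the $i$-th component of $\mathbf{h}_3$):

$$P(w = i \mid \mathbf{c}) = \frac{\exp(h_{3,i})}{\sum_{j=1}^{C} \exp(h_{3,j})}$$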
$\mathbf{U} \in \mathbb{R}^{C\, \times\, H}$
$\mathbf{h}_2 \in \mathbb{R}^{H}$
$\mathbf{V} \in \mathbb{R}^{C\,\times\,H}$
$\mathbf{h}_3 \in \mathbb{R}^{C}$
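The forward pass above can be sketched in NumPy. This is a minimal illustration, not the original implementation: the vocabulary size, hidden size, random weights, and the choice of $\tanh$ for the unspecified nonlinearity $f$ are all assumptions.

```python
import numpy as np

# Assumed toy dimensions: C vocabulary words, H hidden units.
C, H = 5, 3
rng = np.random.default_rng(0)
U = rng.standard_normal((C, H))  # embedding matrix, C x H
V = rng.standard_normal((C, H))  # output projection, C x H

c = np.zeros(C)                  # one-hot input word
c[2] = 1.0

h2 = np.tanh(U.T @ c)            # hidden layer, shape (H,); tanh stands in for f
h3 = V @ h2                      # one score per vocabulary word, shape (C,)

# Softmax converts scores to probabilities (max subtracted for stability).
p = np.exp(h3 - h3.max())
p /= p.sum()
```

Note that because $\mathbf{c}$ is one-hot, $\mathbf{U}^T \mathbf{c}$ simply selects one row of $\mathbf{U}$, so in practice the matrix product is replaced by a row lookup.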