13. Semantics 3: Formalisms for semantic representation. Semantic parsing.

13.0 Introduction

"Spoken words … are symbols of the affections of the soul …" (Aristotle: De Interpretatione I)

Recall from Week 12: there's no consensus on what semantic representations should be like.

NLP applications use various resources to capture some aspects of meaning, but there aren't formal theories of semantics that are universally useful.

A true theory of semantics would be a mapping from linguistic structures to the domain of "knowledge" or "reality" or "all the things we can talk about", so really it would be a theory of everything.

An artificial semantics engine would then be capable of artificial general intelligence.

What can be expected of a theory of semantics?

"A semantic theory describes and explains the interpretative ability of speakers by accounting for their performance in determining the number and content of the readings of a sentence, by detecting semantic anomalies, by deciding on paraphrase relations between sentences, and by marking every other semantic property or relation that plays a role in this ability." (Katz & Fodor 1963)

"our goal must be to develop a theory capable of handling the kind of commonsensical inferences that people routinely, automatically, and generally subconsciously make when answering simple questions about simple stories" (Kornai, to appear)

13.1 Distributional models

(see also Week 11)

  • semantic representations are real-valued vectors
  • vectors are constructed from large corpora, with the objective that proximity in the vector space (typically measured by cosine similarity) reflects distributional similarity
  • that this is a representation of semantics is based on the Distributional Hypothesis: that words appearing in similar contexts are similar in meaning
  • Distributional representations are also called word embeddings and are currently used in virtually all state-of-the-art NLP systems to represent word meaning
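The similarity objective above can be illustrated with a toy sketch: the co-occurrence counts below are invented, and real systems use dense embeddings (e.g. word2vec, GloVe) rather than raw counts, but the comparison by cosine similarity is the same.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical co-occurrence counts over the contexts ("drink", "bark", "pet")
vectors = {
    "dog": [1, 9, 8],
    "cat": [2, 1, 9],
    "tea": [9, 0, 1],
}

print(cosine(vectors["dog"], vectors["cat"]))  # high: similar contexts
print(cosine(vectors["dog"], vectors["tea"]))  # low: different contexts
```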

Issues

  • research into distributional representations of units larger than the word has only just begun; consequently there is (almost) no method for constructing the meaning of a structure from the meanings of its parts
  • we know nothing (or very little) about the structure of these representations. This makes it very hard to understand why they work - or why they don't
  • in particular, we have no reliable way of measuring the quality of these representations against criteria such as Katz & Fodor's

13.2 Logical semantics

  • Semantic representations are logical formulae
  • >90% of semantics research in the field of theoretical linguistics

Zero-order logic

(a.k.a. propositional calculus)

atomic statements and logical connectives only

  • It is either raining or snowing: $r \vee s$
  • It is raining and not snowing: $r \wedge \neg s$
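The two formulas above can be checked against a truth assignment; a minimal sketch (the variable names r = "raining", s = "snowing" follow the examples):

```python
# Propositional formulas as Boolean functions of their atoms.
def rain_or_snow(r, s):
    return r or s          # r ∨ s

def rain_not_snow(r, s):
    return r and not s     # r ∧ ¬s

print(rain_or_snow(True, False))   # True
print(rain_not_snow(True, True))   # False
```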

First-order logic

(FOL, a.k.a. predicate logic)

  • predicates have arguments, which are entities
  • quantification over entities w.r.t. predicates
  • John is eating: $E(j)$
  • John or Mary is eating: $E(j) \vee E(m)$
  • Someone is eating: $\exists x: E(x)$ (existential quantification)
  • Everybody is eating: $\forall x: E(x)$ (universal quantification)
  • Everybody is eating something: $\forall x \exists y: E(x, y)$
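Over a finite domain, the quantified formulas above can be evaluated directly with Python's any/all; a hedged sketch in which the people, foods, and eating relation are invented for illustration:

```python
people = {"john", "mary", "sue"}
foods = {"apple", "bread"}
eats = {("john", "apple"), ("mary", "bread"), ("sue", "apple")}

def E(x, y):
    """The eating relation as set membership."""
    return (x, y) in eats

# Someone is eating: ∃x ∃y E(x, y)
someone_eats = any(E(x, y) for x in people for y in foods)

# Everybody is eating something: ∀x ∃y E(x, y)
everybody_eats = all(any(E(x, y) for y in foods) for x in people)

print(someone_eats, everybody_eats)  # True True
```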

Some issues (1)

Some phenomena seem easy to handle, but only at first glance:

Red ball: $R(x) \wedge B(x)$

This looks good: red balls are things that are both red and balls.

Large flea: $L(x) \wedge F(x)$

This is problematic: what is $L$? How can you tell if something is large?
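The contrast can be made concrete with sets: intersective adjectives like "red" behave as set intersection, while "large" needs a comparison class. The extension sets and sizes below are invented for illustration.

```python
# Intersective reading: {x | R(x) ∧ B(x)} as set intersection.
red = {"ball1", "car2"}
ball = {"ball1", "ball3"}
red_ball = red & ball
print(red_ball)  # {'ball1'}

# "Large flea" cannot be large ∩ flea: a large flea is still a small
# animal, so L must be relative to a comparison class, e.g.:
sizes_mm = {"flea1": 5, "flea2": 2, "dog1": 600}
fleas = {"flea1", "flea2"}

def large(x, comparison_class):
    """True iff x is larger than the class average (one possible L)."""
    avg = sum(sizes_mm[y] for y in comparison_class) / len(comparison_class)
    return sizes_mm[x] > avg

print(large("flea1", fleas))            # True relative to fleas
print(large("flea1", sizes_mm.keys()))  # False relative to all animals
```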

Some issues (2)

  • Logical semantics has nearly nothing to say about word meaning
  • The bulk of research deals with the analysis of complex structures that are only rarely found in natural language

13.3 Graph-based formalisms

  • semantic representations are networks of concepts
  • dates back to the earliest days of computational linguistics
  • hasn't penetrated everyday applications
  • new interest with AMR

Abstract Meaning Representation (AMR)

  • Example: The boy wants to go
  • no account of word meaning
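One common way to encode such an AMR is as instance assignments plus labeled role triples; a minimal sketch for the example sentence (the variable names w, b, g are arbitrary):

```python
# "The boy wants to go" as an AMR-style graph. Note the reentrancy:
# b (the boy) is both the wanter and the goer, which is why the
# structure is a graph rather than a tree.
instances = {"w": "want-01", "b": "boy", "g": "go-01"}
edges = [
    ("w", "ARG0", "b"),  # the boy does the wanting
    ("w", "ARG1", "g"),  # the going is what is wanted
    ("g", "ARG0", "b"),  # the boy is also the one going
]

targets = [t for _, _, t in edges]
print(targets.count("b"))  # 2: "b" is shared between two parents
```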

4lang

  • Example: Harry snatched up his wand and scrambled to his feet

4lang

  • a word's meaning is also a graph:
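Such a definition graph can be sketched as edge-labeled triples. 4lang labels edges 0 (attribution / IS_A), 1 and 2 (argument positions of binary relations); the particular triples below are an invented simplification, not the actual 4lang dictionary entry for "bird".

```python
# Toy 4lang-style definition graph for "bird".
bird = [
    ("bird", 0, "vertebrate"),  # bird IS_A vertebrate
    ("has", 1, "bird"),         # bird fills the 1st slot of "has"
    ("has", 2, "wing"),         # wing fills the 2nd slot of "has"
]
print(bird)
```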

13.4 Semantic parsing

  • The task of assigning semantic representation to raw text
  • not a well-defined task (what representation?), no commonly used systems

AMR:

  • Graph grammars operating on words of a sentence

4lang:

  • text_to_4lang: a simple mapping from dependency parses to 4lang graphs:
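The mapping idea can be sketched as a rewrite from dependency triples to labeled graph edges; the rule table below is invented for illustration and far simpler than the real system's.

```python
def deps_to_graph(deps):
    """Map (relation, head, dependent) triples to 4lang-style edges."""
    edges = []
    for rel, head, dep in deps:
        if rel == "nsubj":            # subject -> 1-edge from the verb
            edges.append((head, 1, dep))
        elif rel in ("dobj", "obj"):  # object -> 2-edge from the verb
            edges.append((head, 2, dep))
        elif rel == "amod":           # modifier -> 0-edge from the noun
            edges.append((head, 0, dep))
    return edges

# "Harry snatched up his wand" (simplified dependency parse)
deps = [("nsubj", "snatch", "Harry"), ("obj", "snatch", "wand")]
print(deps_to_graph(deps))
```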