We have seen ways to analyze, parse, annotate, translate, etc. written text...
...but without modeling its meaning
Semantics is the study of meaning
Semantics is the study of meaning, and in linguistics and NLP it refers to the meaning of linguistic utterances. Applications covered in this course so far are all possible without any explicit representation of what certain linguistic structures mean. Even complex processes like machine translation can be ignorant of what each word, phrase or sentence means, i.e. what information it conveys to speakers of the language.
Since the semantics of an utterance can be thought of as a mapping from linguistic structures to a representation of the world, it is closely connected with the fields of philosophy, logic, and knowledge representation. Some consider semantics to be AI-complete, i.e. at least as difficult as modeling human cognition.
Many levels of language processing can benefit from analyzing meaning:
I made spaghetti with meatballs
I made spaghetti with my sister
I shot an elephant in my pajamas
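All three sentences are ambiguous in ways that syntax alone cannot resolve: with meatballs names an ingredient of the spaghetti, with my sister names a companion of the cook, and in my pajamas may describe either the shooter or the elephant. A minimal sketch, using NLTK and the toy grammar from the NLTK book, shows that a parser without any model of meaning simply returns both readings of the third sentence:

```python
# Toy grammar from the NLTK book; the PP "in my pajamas" can attach to
# the VP (modifying the shooting) or to the NP (modifying the elephant).
import nltk

groucho_grammar = nltk.CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | 'I'
VP -> V NP | VP PP
Det -> 'an' | 'my'
N -> 'elephant' | 'pajamas'
V -> 'shot'
P -> 'in'
""")

parser = nltk.ChartParser(groucho_grammar)
for tree in parser.parse("I shot an elephant in my pajamas".split()):
    print(tree)  # prints two trees, one per attachment site
```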
Other tasks rely on semantics so heavily that they are considered semantic technologies:
Question answering: the process of generating adequate answers to users' questions, based on some knowledge of the world
Recognizing textual entailment: deciding whether one statement implies another
Bikel, D., & Zitouni, I. (2012). Multilingual natural language processing applications: from theory to practice. IBM Press.
The semantic web: enabling computers to understand what is on the internet and what you can do on the internet
Virtual assistants such as Apple Siri, Amazon Alexa, or Google Now
Dialogue systems (chatbots): systems that can, to some extent, carry on human-like conversations
Semantic parsing: the task of mapping linguistic units to some representation of their meaning
(We'll see some examples in two weeks)
Semantic analysis is the process of determining the meaning of linguistic units. While all technologies introduced so far can benefit from such analyses, the ones discussed in this and the following two lectures are outright impossible unless we build explicit representations of the information content of linguistic data. Mapping linguistic data to some representation of meaning requires us to choose a semantic representation.
Today there exist dozens of different theories and systems of semantic representation. In syntactic or morphological analysis, there are theoretical concepts that are widely accepted by linguists and used by engineers, such as the constituent structure of a sentence or the concept of verb tense. There is no such agreement on the basic elements of semantic representation.
In a narrow sense, semantic analysis involves modeling the meaning of a sentence only as far as it can be determined without knowing the context in which it is uttered, the previous knowledge (information state) of each speaker, etc. Detecting meaning as a function of linguistic form only is sometimes called syntax-driven semantic analysis or semantic parsing.
In the broader sense, semantic analysis involves modeling all new information that an utterance conveys, and thus includes the process of inference. In this sense the analysis of the sentence What did you do today? should at least be aware of the identity of the person the question is addressed to and the exact time of the utterance. It is far from trivial to define the limits of this broader process: the true scope of such a question is determined by factors such as the relationship between the speakers (the answer is different if the question is asked by one's boss or by a friend) and the history of their interactions. The field of linguistics concerned with such factors is called pragmatics.
We know that dog is a singular common noun, but how do we distinguish it from cat, television, Monday, or peace?
Two major approaches:
Decomposing meaning into a set of semantic features, e.g.:
dog: animal, four-legged, faithful, barks
peace: period, no war
Distributional semantics: two words are similar in meaning if they appear in similar contexts. Typically we represent words using real-valued vectors such that the distance between two vectors is small (and their cosine similarity high) when the corresponding words appear in similar contexts (see the sketch below).
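A minimal sketch of the distributional idea, using a toy corpus and raw co-occurrence counts (real systems use large corpora and learned embeddings such as the GloVe vectors shown later):

```python
# Represent each word by the counts of words in its contexts, then
# compare words by the cosine of their count vectors.
from collections import Counter
import math

corpus = [
    "the dog barks at the cat".split(),
    "the cat sleeps on the mat".split(),
    "the dog sleeps on the mat".split(),
]

def context_vector(word, window=2):
    """Counts of words occurring within `window` positions of `word`."""
    counts = Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            if w != word:
                continue
            left, right = max(0, i - window), min(len(sent), i + window + 1)
            counts.update(sent[j] for j in range(left, right) if j != i)
    return counts

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v)

# "dog" and "cat" share most of their contexts in this tiny corpus:
print(cosine(context_vector("dog"), context_vector("cat")))
```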
Q: are there "perfect synonyms", ever, in any language? Depends on our definition of meaning!
A word is a hypernym of another if it is a broader or more general concept of which the other is a special case, e.g. mammal is the hypernym of dog, rectangle is the hypernym of square.
We also say that dog is a hyponym of mammal and square is a hyponym of rectangle.
Q: in what way is this similar to the IS_A relationship in programming?
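The parallel can be made concrete with WordNet, the lexical graph we return to at the end of the lecture: its hypernym chains read exactly like class hierarchies. A short sketch using NLTK's WordNet interface:

```python
# Print one hypernym chain of "dog"; requires nltk.download('wordnet').
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
for path in dog.hypernym_paths()[:1]:  # a synset may have several paths
    print(' -> '.join(s.name() for s in path))
# e.g. entity.n.01 -> ... -> canine.n.02 -> dog.n.01
```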
Bank (as in financial institution) and bank (as in the bank of a river) are homonyms (they are spelled the same but have very different meanings)
Q: glass (material) and glass (dish) are not homonyms, but why? (Their senses are closely related; this is polysemy)
Two and too are homophones, which means they are pronounced the same (though spelled differently)
Some examples of lexical semantic resources are:
FrameNet: a resource based on Frame Semantics (see e.g. Fillmore & Baker 2001)
Frames are script-like structures that represent a situation, event or object and list its typical participants or props, which are called event roles
FrameNet has been used to train semantic parsers / semantic role labelers, e.g. SEMAFOR
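For a feel of the data, here is a sketch using NLTK's FrameNet reader; the Commerce_buy frame is my example, not one from the lecture:

```python
# Inspect a frame and its roles; requires nltk.download('framenet_v17').
from nltk.corpus import framenet as fn

frame = fn.frame('Commerce_buy')
print(frame.name)               # Commerce_buy
print(sorted(frame.FE.keys()))  # roles such as Buyer, Goods, Seller, ...
```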
Measure the degree to which the meanings of two words are similar
e.g. cat and dog are more similar than cat and car
Not a precise definition - that would require a model of meaning
Datasets are created based on the human intuition of hundreds of annotators
various NLP tasks benefit from a similarity metric, e.g. machine translation, info retrieval (search).
for any task, extra data for rare words may be obtained through similar but more frequent words
models of word meaning can be evaluated based on their inherent concept of semantic distance/similarity
cosine similarity of word vectors is expected to be proportional to semantic similarity
e.g. nearest neighbors in the glove.6B.50d embedding (a loading sketch follows these examples):
words closest to king:
| word | cosine similarity |
| --- | --- |
| prince | 0.824 |
| queen | 0.784 |
| ii | 0.775 |
| emperor | 0.774 |
| son | 0.767 |
words closest to dog:
| word | cosine similarity |
| --- | --- |
| cat | 0.922 |
| dogs | 0.851 |
| horse | 0.791 |
| puppy | 0.775 |
| pet | 0.772 |
Not as reliable with less frequent words, e.g. opossum:
| word | cosine similarity |
| --- | --- |
| four-eyed | 0.752 |
| raccoon | 0.717 |
| songbird | 0.704 |
Or woodpecker:
| word | cosine similarity |
| --- | --- |
| pileated | 0.805 |
| ivory-billed | 0.720 |
| red-cockaded | 0.710 |
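A minimal loading sketch for reproducing these neighbor lists, assuming gensim 4.x and a local copy of glove.6B.50d.txt:

```python
# GloVe files lack the word2vec header line, hence no_header=True
# (available in gensim >= 4.0).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "glove.6B.50d.txt", binary=False, no_header=True)

print(vectors.most_similar("king", topn=5))  # prince, queen, ii, ...
print(vectors.most_similar("dog", topn=5))   # cat, dogs, horse, ...
```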
Distance between words in lexical graphs such as WordNet is also used as a source of semantic similarity
Path similarity in WordNet between dog and some other synsets:
| synset | path similarity |
| --- | --- |
| canine | 0.50 |
| wolf | 0.33 |
| cat | 0.20 |
| refrigerator | 0.07 |
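These figures can be reproduced with NLTK's WordNet interface; the synset names below are my guesses for the senses intended above:

```python
# path_similarity(a, b) = 1 / (1 + length of the shortest path between
# a and b in the hypernym/hyponym graph).
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
for name in ['canine.n.02', 'wolf.n.01', 'cat.n.01', 'refrigerator.n.01']:
    print(name, round(dog.path_similarity(wn.synset(name)), 2))
```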