12. Semantics 2: common tasks in computational semantics

12.1 Overview

Three common tasks in computational semantics:

  • Relation extraction (RE) - obtaining structured information from language data. E.g. parsing CVs to build a database for recruitment
  • Sentiment analysis (SA) - detecting attitudes and opinions in text, e.g. user reviews of movies, products, etc.
  • Question answering (QA) - detecting a particular information need in user input and fulfilling it based on some text

12.2 Relation extraction

Relation extraction (RE) is the task of obtaining structured information from language data.

RE systems typically target a particular domain, e.g.:

  • professional profile information based on CVs and LinkedIn pages
  • product specifications based on e-commerce sites (webshops), manufacturers' websites
  • stock price information based on news articles
  • etc.

Example

"Gryffindor values courage, bravery, nerve, and chivalry. Gryffindor's mascot is the lion, and its colours are scarlet and gold. The Head of this house is the Transfiguration teacher and Deputy Headmistress, Minerva McGonagall until she becomes headmistress, and the house ghost is Sir Nicholas de Mimsy-Porpington, more commonly known as Nearly Headless Nick. According to Rowling, Gryffindor corresponds roughly to the element of fire. The founder of the house is Godric Gryffindor."

  • values(Gryffindor, courage)
  • mascot(Gryffindor, lion)
  • color(Gryffindor, scarlet)
  • head(Gryffindor, Minerva_McGonagall)
  • house_ghost(Gryffindor, Sir_Nicholas_de_Mimsy-Porpington)
  • founder(Gryffindor, Godric_Gryffindor)

Rule-based approaches

Templates:

  • X dropped by Y points -> drop_by(X, Y)
  • X, CEO of Y -> ceo_of(X, Y)
  • X was born in Y -> born_in(X, Y)

If parsers, NER taggers, or chunkers are available, templates can refer to their output:

  • X_NP dropped by Y_NUM points -> drop_by(X, Y)

or

  • X_PERSON, CEO of Y_ORGANIZATION -> ceo_of(X, Y)
  • X_PERSON was born in Y_LOCATION -> born_in(X, Y)
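
A minimal sketch of such a template matcher; the capitalized-phrase regex below is a crude stand-in for a real NER tagger, and all patterns are illustrative:

    import re

    # A crude stand-in for NER: one or more capitalized words.
    NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

    TEMPLATES = [
        (re.compile(rf"({NAME}), CEO of ({NAME})"), "ceo_of"),
        (re.compile(rf"({NAME}) was born in ({NAME})"), "born_in"),
        (re.compile(rf"({NAME}) dropped by (\d+(?:\.\d+)?) points"), "drop_by"),
    ]

    def extract(text):
        """Return a (relation, X, Y) triple for every template match."""
        return [(rel, x, y)
                for pattern, rel in TEMPLATES
                for x, y in pattern.findall(text)]

    print(extract("Tim Cook, CEO of Apple, spoke today."))
    # [('ceo_of', 'Tim Cook', 'Apple')]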

Pros:

  • simple and effective, yields fast results
  • high precision

Cons:

  • low recall
  • little or no capacity for generalization
  • real-life systems may contain thousands of templates and require continuous development by experts

Despite these drawbacks, many companies still depend on such systems.

Supervised learning

Use parsed and annotated text to train text classifiers

E.g. decide for each pair of named entities (PERSON and ORGANIZATION) whether they are in the "ceo_of" relationship, based on context features

Features typically include:

  • headwords
  • word/POS ngrams with position information
  • NER/chunk tags, chunk sequences
  • paths in parse trees between the candidates (e.g. N - NP - S - VP - NP - N)
  • gazetteer features: whether words/phrases appear on an external list of known entities
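
A toy sketch of this setup with scikit-learn; the two training sentences, entity positions, and feature subset are all illustrative:

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def pair_features(tokens, i, j):
        """A small subset of the feature types above for an entity
        pair headed at token positions i < j."""
        feats = {"head_x": tokens[i], "head_y": tokens[j], "dist": j - i}
        for w in tokens[i + 1:j]:     # words between the two entities
            feats["between=" + w] = 1
        return feats

    # Toy data: is the pair in the ceo_of relation (1) or not (0)?
    examples = [("Tim Cook , CEO of Apple".split(), 1, 5, 1),
                ("Tim Cook visited Apple".split(), 1, 3, 0)]
    X = [pair_features(toks, i, j) for toks, i, j, _ in examples]
    y = [label for *_, label in examples]

    clf = make_pipeline(DictVectorizer(), LogisticRegression())
    clf.fit(X, y)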

Pros:

  • effective if a large training sample is available and the target texts are similar to the training data

Cons:

  • requires a fair amount of annotated data (costly to produce)
  • doesn't generalize well across genres and domains

Semi-supervised / unsupervised approaches

When little or no training data is available, we must use what we have to generalize:

  • a few annotated examples -> some patterns -> more examples -> more patterns
  • a few patterns -> some examples -> more patterns -> more examples

Example

seed tuple: author(William_Shakespeare, Hamlet)

found instances:

  • William Shakespeare's Hamlet
  • the William Shakespeare play Hamlet
  • Hamlet by William Shakespeare
  • Hamlet is a tragedy written by William Shakespeare

extracted patterns:

  • X's Y
  • the X play Y
  • Y by X
  • Y is a tragedy written by X

Finally, these patterns are used to find new seed tuples, and the cycle repeats; a toy sketch of the loop follows below.
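
A toy sketch of the bootstrapping loop over a three-sentence in-memory corpus; real systems additionally score patterns and filter out noisy ones rather than trusting exact string matches:

    import re

    corpus = ["Hamlet by William Shakespeare",
              "Macbeth by William Shakespeare",
              "the Christopher Marlowe play Faustus"]

    def patterns_from_tuple(x, y):
        """Sentences containing both arguments become X/Y patterns."""
        return {s.replace(x, "X").replace(y, "Y")
                for s in corpus if x in s and y in s}

    def tuples_from_pattern(pattern):
        """Instantiate a pattern as a regex and collect new (x, y) pairs."""
        regex = pattern.replace("X", "(?P<x>.+)").replace("Y", "(?P<y>.+)")
        return {(m["x"], m["y"]) for s in corpus
                if (m := re.fullmatch(regex, s))}

    seeds = {("William Shakespeare", "Hamlet")}
    for _ in range(2):                    # two bootstrapping rounds
        patterns = {p for x, y in seeds for p in patterns_from_tuple(x, y)}
        seeds |= {t for p in patterns for t in tuples_from_pattern(p)}
    print(seeds)   # gains ('William Shakespeare', 'Macbeth') via 'Y by X'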

12.3 Sentiment analysis

Also called opinion mining - the task of extracting opinions, emotions, attitudes from user-generated text, e.g. about products, movies, or politics

  • Simplest version: decide whether the attitude of a text is positive or negative
  • More complex: measure attitude on a scale (e.g. from 1 to 5)
  • Most complex: detect target of opinion (e.g. what product is it about) or aspect (e.g. is it about the price, looks, or quality of a product)

Baseline approach:

Use training data, extract standard features such as:

  • bag-of-words
  • ngrams
  • emoticons
  • numbers, dates
  • gazetteer features (based on sentiment lexicons)

Use these to train standard classifiers such as Naive Bayes, SVM, or MaxEnt; a minimal example follows below.
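
A minimal baseline with scikit-learn, using only bag-of-words and bigram counts on toy data (emoticon, date, and lexicon features omitted):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Toy labeled reviews; real systems train on thousands of examples.
    texts = ["great movie , loved it", "absolutely terrible , waste of time",
             "wonderful acting , loved the cast", "boring and bad acting"]
    labels = ["pos", "neg", "pos", "neg"]

    # Unigram + bigram counts feeding a Naive Bayes classifier.
    clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
    clf.fit(texts, labels)
    print(clf.predict(["loved the acting", "boring and terrible"]))
    # likely: ['pos' 'neg'] on this toy data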

Advanced techniques

  • use semi-supervised methods to learn sentiment lexicons
  • model negation explicitly (see the sketch below)
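
For negation, a classic trick from the sentiment literature is to prefix NOT_ to every token between a negation word and the next punctuation mark, so that like and NOT_like become distinct features. A sketch:

    import re

    NEGATORS = {"not", "no", "never", "neither", "nor"}

    def mark_negation(tokens):
        """Prefix NOT_ to every token between a negation word and the
        next punctuation mark, so negated words get distinct features."""
        out, negated = [], False
        for tok in tokens:
            if re.fullmatch(r"[.,!?;:]", tok):
                negated = False
                out.append(tok)
            elif tok.lower() in NEGATORS or tok.lower().endswith("n't"):
                negated = True
                out.append(tok)
            else:
                out.append("NOT_" + tok if negated else tok)
        return out

    print(mark_negation("I did n't like this movie , but the cast was great".split()))
    # ['I', 'did', "n't", 'NOT_like', 'NOT_this', 'NOT_movie', ',',
    #  'but', 'the', 'cast', 'was', 'great']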

12.4 Question answering

One of the oldest and most popular tasks in AI

Recent products include Apple's Siri, Amazon's Alexa, and IBM's Watson

Major approaches:

  • IR-based: handle questions as search queries (e.g. Watson, Google)

  • Knowledge-based: convert question into a Relation Extraction task (e.g. Watson, Siri, Wolfram Alpha)

IR-based approaches:

  • detect question type
  • generate search queries from questions
  • retrieve ranked documents
  • extract relevant passages, rerank
  • extract answer candidates
  • rank answers
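
A toy sketch of the first two steps, query generation and retrieval, over a two-document corpus; the stop-word list and overlap scoring are deliberately simplistic:

    import re
    from collections import Counter

    # Minimal stop-word list ("s" drops possessive fragments like Florida's).
    STOP = {"the", "a", "an", "of", "is", "are", "on", "which", "what", "s"}

    def make_query(question):
        """Query generation: keep the content words of the question."""
        return [w for w in re.findall(r"\w+", question.lower()) if w not in STOP]

    def retrieve(query, corpus, k=1):
        """Toy retrieval: rank documents by query-term overlap."""
        def score(doc):
            counts = Counter(re.findall(r"\w+", doc.lower()))
            return sum(counts[term] for term in query)
        return sorted(corpus, key=score, reverse=True)[:k]

    corpus = ["Georgia and Alabama border Florida to the north.",
              "Florida's state flower is the orange blossom."]
    question = "Which two states are on Florida's northern border?"
    print(retrieve(make_query(question), corpus))
    # ['Georgia and Alabama border Florida to the north.']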

Question processing:

Example (a Jeopardy!-style clue): "They’re the two states you could be reentering if you’re crossing Florida’s northern border"

  • Answer Type: US state
  • Query: two states, border, Florida, north
  • Focus: the two states
  • Relations: borders(Florida, ?x, north)

Answer type detection

Supervised learning can be used to train a question classifier on annotated data; see Li & Roth (2002) for a widely used question type taxonomy.
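
A toy sketch of such a question classifier with scikit-learn; the four questions and two coarse types are illustrative (Li & Roth's taxonomy distinguishes 6 coarse and 50 fine answer types):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    questions = ["Who wrote Hamlet ?", "Where is the Eiffel Tower ?",
                 "Who painted Guernica ?", "Where was Mozart born ?"]
    answer_types = ["HUMAN", "LOCATION", "HUMAN", "LOCATION"]

    # Word unigrams and bigrams are enough to pick up cues like
    # sentence-initial "Who" vs. "Where" on this toy data.
    clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(questions, answer_types)
    print(clf.predict(["Who discovered penicillin ?"]))   # likely: ['HUMAN']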

Keyword extraction

Select content words of the question as search keywords (cf. make_query in the sketch above; example from a slide by Mihai Surdeanu).

Answer extraction

  • match phrases in the retrieved passages against the expected answer type
  • this will typically yield several answer candidates, which then need to be ranked

Answer ranking

Some features for learning to rank:

  • Answer type match: Candidate contains a phrase with the correct answer type.
  • Pattern match: Regular expression pattern matches the candidate.
  • Question keywords: # of question keywords in the candidate.
  • Keyword distance: Distance in words between the candidate and the query keywords.
  • Novelty factor: A word in the candidate is not in the query.
  • ...
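
A sketch computing a few of these features for one candidate; the inputs are illustrative, and in a pointwise setup such feature dicts would be fed to a standard classifier or regressor:

    def ranking_features(candidate, keywords, answer_type, candidate_type):
        """A few of the ranking features above for one answer candidate."""
        words = candidate.lower().split()
        positions = [i for i, w in enumerate(words) if w in keywords]
        return {
            "type_match": int(candidate_type == answer_type),
            "keyword_count": len(set(words) & keywords),
            "keyword_span": max(positions) - min(positions) if len(positions) > 1 else 0,
            "novelty": int(bool(set(words) - keywords)),
        }

    print(ranking_features("Georgia borders Florida to the north",
                           {"states", "border", "florida", "north"},
                           "US state", "US state"))
    # {'type_match': 1, 'keyword_count': 2, 'keyword_span': 3, 'novelty': 1}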

Approaches to ranking can be:

  • pointwise
  • pairwise
  • listwise

See Agarwal et al. (2012) for a short survey of algorithms.

Knowledge-based approaches:

  • Create semantic representation of query (understand what is being asked!)

Whose granddaughter starred in E.T.?

(acted-in ?x "E.T.")

(granddaughter-of ?x ?y)

  • query relevant databases and ontologies
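
A toy sketch of evaluating such a representation against a hand-built triple store; real systems query large knowledge bases (e.g. Wikidata, DBpedia), but the matching idea is the same:

    # Toy knowledge base of (relation, arg1, arg2) triples.
    TRIPLES = {("acted-in", "Drew Barrymore", "E.T."),
               ("granddaughter-of", "Drew Barrymore", "John Barrymore")}

    def match(query, bindings=None):
        """Enumerate variable bindings that satisfy a conjunction of
        triple patterns; strings starting with '?' are variables."""
        bindings = bindings or {}
        if not query:
            yield bindings
            return
        rel, a, b = query[0]
        for r, x, y in TRIPLES:
            if r != rel:
                continue
            new, ok = dict(bindings), True
            for term, value in ((a, x), (b, y)):
                if term.startswith("?"):
                    ok = ok and new.get(term, value) == value
                    new[term] = value
                else:
                    ok = ok and term == value
            if ok:
                yield from match(query[1:], new)

    # Whose granddaughter starred in E.T.?
    for b in match([("acted-in", "?x", "E.T."),
                    ("granddaughter-of", "?x", "?y")]):
        print(b["?y"])                       # John Barrymore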

Hybrid systems

  • candidate answers are generated with IR-based methods, using shallow semantic representations
  • answers are reranked using knowledge-based methods