FrameNet

I see three ways of getting features from FrameNet:

1. Does word $j$ evoke frame $i$?
2. Something based on frame relations (e.g. inheritance between frames)
3. Something based on frame elements



In [190]:

    
import numpy as np
import pandas as pd
from nltk.corpus import framenet as fn

1. Does word $j$ evoke frame $i$?

In sum: the resulting frame-by-word matrix is too sparse to use as features.



In [2]:

    
def get_lus(frame):
    """Return the lemmas of the lexical units in `frame`."""
    lus = frame['lexUnit'].keys()
    # Lexical unit names look like 'lemma.pos'; keep only the lemma.
    return [k.partition('.')[0] for k in lus]



In [3]:

    
all_frames = fn.frames('.*')  # the regex matches every frame name
all_frame_names = [f.name for f in all_frames]
# Collect the distinct lemmas across all frames.
all_lus = [get_lus(f) for f in all_frames]
all_lus = [item for sublist in all_lus for item in sublist]
all_lus = list(set(all_lus))



In [182]:

    
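# Frame-by-word count matrix: evoke.loc[frame, word] is the number of
# lexical units in `frame` whose lemma is `word`.
evoke = pd.DataFrame(0, index=all_frame_names, columns=all_lus)
for frame in all_frames:
    name = frame.name
    lus = get_lus(frame)
    for lu in lus:
        # .loc avoids chained indexing, which may write to a copy.
        evoke.loc[name, lu] += 1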

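For most words the largest count in any single frame is one; for some it is two, and for a few it is three. Note that `evoke.max()` takes the maximum over frames, so it reflects how many lexical units with the same lemma (e.g. a noun and a verb) a single frame contains, not how many frames a word evokes.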
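
Because lexical unit names carry a part-of-speech suffix that `get_lus` strips, one lemma can be counted more than once within a single frame. The following is a minimal sketch, on made-up data, of how to tell that apart from the number of distinct frames a word appears in. The `toy_frames` mapping and the helper `lemma` are hypothetical stand-ins for the FrameNet corpus and are not part of NLTK; the frame contents are invented for illustration.

```python
from collections import Counter

# Hypothetical toy data: frame name -> lexical unit names ("lemma.pos").
toy_frames = {
    "Cutting": ["cut.v", "cut.n", "slice.v"],
    "Cause_harm": ["strike.v", "slice.v"],
    "Attack": ["strike.v"],
}


def lemma(lu_name):
    """Strip the part-of-speech suffix, as get_lus does."""
    return lu_name.partition(".")[0]


# Count of each lemma inside each frame (what the evoke matrix stores).
per_frame = {
    name: Counter(lemma(lu) for lu in lus)
    for name, lus in toy_frames.items()
}

lemmas = sorted({l for counts in per_frame.values() for l in counts})

# Largest count within any single frame (analogous to evoke.max()).
max_in_one_frame = {
    l: max(counts[l] for counts in per_frame.values()) for l in lemmas
}

# Number of distinct frames containing the lemma
# (analogous to (evoke > 0).sum()).
n_frames = {
    l: sum(1 for counts in per_frame.values() if counts[l] > 0)
    for l in lemmas
}

print(max_in_one_frame)
print(n_frames)
```

In this toy data "cut" has a maximum count of two but appears in only one frame, while "slice" has a maximum count of one but appears in two frames, so the two statistics can disagree in both directions.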



In [5]:

    
evoke.max().value_counts()









    Out[5]:





1    9014
2     402
3       5
dtype: int64



In [6]:

    
evoke.head()









    Out[6]:

                posse  find out  tun  mortification  reliance  monthly  pilfer  speak  jerk  weigh anchor  ...  wet  jumble  honk  revelation  tenement  data  predestined  rainfall  recurrence  reminder
Abandonment         0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Abounding_with      0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Absorb_heat         0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Abundance           0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Abusing             0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0

5 rows × 9421 columns

2. Frame relations

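In this approach, I represent each word by a vector with one entry per frame. The entry is nonzero if the word evokes that frame directly or evokes a frame that inherits from it, at any depth. As built below, the entries are counts rather than bits: a word is counted once for each matching lexical unit and each inheritance path that reaches it.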
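
To make the representation concrete, here is a minimal sketch on a made-up hierarchy. The dictionaries `toy_children` and `toy_evokes` and the function `words_under` are hypothetical and only stand in for the FrameNet lookups used in the cells that follow; collecting words in a set is a choice made here so that each entry stays a single bit.

```python
# Hypothetical toy hierarchy: parent frame -> frames that inherit from it.
toy_children = {
    "Event": ["Intentionally_act", "Motion"],
    "Intentionally_act": ["Cutting"],
    "Motion": [],
    "Cutting": [],
}

# Hypothetical words that evoke each frame directly.
toy_evokes = {
    "Event": ["happen"],
    "Intentionally_act": ["do"],
    "Motion": ["move"],
    "Cutting": ["cut", "slice"],
}


def words_under(frame):
    """Words evoking `frame` or any frame that inherits from it."""
    found = set(toy_evokes[frame])
    for child in toy_children[frame]:
        found |= words_under(child)
    return found


frames = sorted(toy_children)
vocabulary = sorted({w for ws in toy_evokes.values() for w in ws})

# One bit vector per word, with one position per frame (in sorted order).
bit_vectors = {
    word: [int(word in words_under(frame)) for frame in frames]
    for word in vocabulary
}

print(frames)
for word, bits in bit_vectors.items():
    print(word, bits)
```

With a set, a word reached through several inheritance paths still contributes a single 1; summing list lengths instead would give counts.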



In [77]:

    
def evokes(frame):
    """Return words that evoke `frame`."""
    lus = frame['lexUnit'].keys()
    return [k.partition('.')[0] for k in lus]



In [38]:

    
def is_inheritance_relation(relation):
    return relation['type']['name'] == 'Inheritance'



In [101]:

    
def is_parent_frame(frame, relation):
    return frame.name == relation.superFrameName



In [146]:

    
def children(frame):
    """Return children of `frame`."""
    relations = frame.frameRelations
    relations = [r for r in relations if is_inheritance_relation(r)]
    relations = [r for r in relations if is_parent_frame(frame, r)]
    return [fn.frame(r.subFrameName) for r in relations]



In [104]:

    
def flatten(lst):
    return [item for sublist in lst for item in sublist]



In [147]:

    
def words(frame):
    """Return all words that evoke `frame`, including words that
    evoke frames that inherit from `frame`."""
    kids = children(frame)
    if not kids:
        return evokes(frame)
    evoke_sub_frames = [words(f) for f in kids]
    return evokes(frame) + flatten(evoke_sub_frames)



In [183]:

    
# Frame-by-word counts, now including words that evoke descendant frames.
relations = pd.DataFrame(0, index=all_frame_names, columns=all_lus)
for frame in all_frames:
    name = frame.name
    lus = words(frame)
    for lu in lus:
        relations.loc[name, lu] += 1



In [187]:

    
relations.head()









    Out[187]:

                posse  find out  tun  mortification  reliance  monthly  pilfer  speak  jerk  weigh anchor  ...  wet  jumble  honk  revelation  tenement  data  predestined  rainfall  recurrence  reminder
Abandonment         0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Abounding_with      0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Absorb_heat         0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Abundance           0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0
Abusing             0         0    0              0         0        0       0      0     0             0  ...    0       0     0           0         0     0            0         0           0         0

5 rows × 9421 columns



In [196]:

    
(relations.size - np.count_nonzero(relations.values))/relations.size









    Out[196]:





0.9970414779882989



In [205]:

    
relations.sum(axis=1).sort_values(ascending=False).head()









    Out[205]:





Event                  5124
Intentionally_act      1919
Objective_influence    1918
Transitive_action      1891
Attributes             1673
dtype: int64



In [211]:

    
relations.loc['Transitive_action'].sort_values(ascending=False).head()









    Out[211]:





make      6
strike    6
cut       6
fire      5
tie       5
Name: Transitive_action, dtype: int64



In [209]:

    
relations.to_csv('framenet-relations.csv')



In [226]:

    
normalized_relations = relations / relations.sum()
normalized_relations.to_csv('framenet-normalized-relations.csv')
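
Dividing a DataFrame by the result of its own `.sum()` normalizes each column, because `.sum()` returns one total per column and the division aligns on column labels. The sketch below illustrates this on made-up counts; the frame and word labels in `toy` are hypothetical.

```python
import pandas as pd

# Made-up counts: rows are frames, columns are words.
toy = pd.DataFrame(
    [[1, 0], [3, 2], [0, 2]],
    index=["Frame_A", "Frame_B", "Frame_C"],
    columns=["make", "cut"],
)

# toy.sum() is a Series indexed by column label, so the division
# scales each word's column by that word's total count.
normalized = toy / toy.sum()

print(normalized)
print(normalized.sum())  # each column now sums to 1
```

After this step each word's column can be read as a distribution over frames. A column whose total is zero would become NaN, so this assumes every word has at least one nonzero entry.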
