Suppose we have a population of objects, each represented by a vector of features; similar objects have similar feature vectors. We get only a few positive examples of a concept--say "dog"--that we want to generalize correctly. Which of the other objects are dogs?
The model below is a fuzzy variant of "Bayesian concept learning", as proposed by Josh Tenenbaum in his dissertation (ref). Rather than assuming the objects are either in or not in the extension of the to-be-learned concept, we assume that each has some degree of membership. The probability that an object is drawn as an example is proportional to its degree of "in-ness".
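That is, writing $\text{in-ness}(x, c)$ for object $x$'s degree of membership in concept $c$,
$$P(x \text{ is drawn as an example of } c) \;=\; \frac{\text{in-ness}(x, c)}{\sum_{x'} \text{in-ness}(x', c)},$$
where the sum runs over the whole population of objects.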
We populate a hash table whose keys are object-name strings and whose values are eleven-dimensional vectors of real numbers. Vectors for 96 objects have been constructed from about 2000 RDF (subject-predicate-object) triples describing them, by means that would take us too far afield to discuss. Suffice it to say that an iterative algorithm ensures that objects that occur in similar contexts have similar vectors.
TODO: fix the csv utils so I don't have to do all the whitespace trimming.
In [1]:
(require gamble
         gamble/util/csv
         racket/string
         racket/vector
         "c3_helpers.rkt"
         racket/list)
(define object-codes
  (make-hash
   (map
    (lambda (row)
      (cons (string-trim (vector-ref row 0))
            (map (lambda (n)
                   (string->number (string-trim n)))
                 (vector->list (vector-drop row 1)))))
    (read-csv-file "toy_vectors.csv"))))
(define n-features 11)
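A quick sanity check on the loaded table (a scratch snippet, not part of the original notebook): there should be 96 objects, each mapped to an eleven-element feature list.
;; Expect 96 objects, each with 11 features.
(hash-count object-codes)
(length (hash-ref object-codes "toy_dog1"))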
In [15]:
;; Cosine similarity between two objects' feature vectors.
(define (similarity obj1 obj2)
  (let ([v1 (hash-ref object-codes obj1)]
        [v2 (hash-ref object-codes obj2)])
    (let ([l1 (sqrt (apply + (map * v1 v1)))]
          [l2 (sqrt (apply + (map * v2 v2)))])
      (/ (apply + (map * v1 v2)) (* l1 l2)))))
In [16]:
(printf "
dog1 (individual) / dog2 (individual):\t\t~a\n
dog1 (individual) / dog (its class):\t\t~a\n
dog1 (individual) / subPropertyOf (2nd-order rel):\t~a\n
theft (class) / transfer (super-class):\t~a"
        (/ (round (* 10000 (similarity "toy_dog1" "toy_dog2"))) 10000)
        (/ (round (* 10000 (similarity "toy_dog1" "toy_Dog"))) 10000)
        (/ (round (* 10000 (similarity "toy_dog1" "rdfs_subPropertyOf"))) 10000)
        (/ (round (* 10000 (similarity "toy_Theft" "toy_Transfer"))) 10000))
We observe several examples of each concept, which are assumed to be drawn as follows.
First, each element of the concept's mean vector $\mu^c$ is drawn from an independent Gaussian.
Then, for each dimension $j$, a concept-specific precision (an inverse length-scale, or importance weight) $\tau^c_j$ is drawn from a mixture of a spike at zero (the feature is irrelevant) and a vague gamma distribution (the feature is relevant).
Finally, an item's degree of in-ness decreases exponentially with its precision-weighted city-block distance from the concept mean:
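$$\text{in-ness}(x, c) \;=\; \exp\Bigl(-\sum_{j} \tau^c_j \,\bigl|\,x_j - \mu^c_j\bigr|\Bigr)$$
(the code below adds a tiny constant to this so that no object's weight is exactly zero).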
Inference then yields in-ness values for several probe objects, computed with concept means and precisions sampled from the posterior. The posterior favors concentrating in-ness on the observed examples while, within the strong smoothness constraint imposed by the model form, avoiding in-ness on non-examples, since any weight spread over non-examples lowers the probability of having drawn the examples themselves.
In [17]:
(define bcl-sampler
  (mh-sampler
   ;;;;;;;;; Generative model ;;;;;;;;;
   ;; Prior probability that any given feature matters for a concept.
   (deflazy p-relevant (beta 2 2))
   ;; The mean and precision vectors that define the concept.  With
   ;; probability p-relevant a feature is relevant and gets a vague
   ;; gamma-distributed precision; otherwise its precision is (nearly)
   ;; zero -- the spike at zero.
   (defmem (precision concept feature)
     (if (flip p-relevant) (gamma 0.6 12.0) 1e-9))
   (defmem (mean concept feature) (normal 0 8))
   ;; In-ness decreases exponentially with city-block distance from the
   ;; concept means, with block directions scaled by the precisions.
   (define (in-ness object concept)
     (let ([obj-ftrs (hash-ref object-codes object)])
       (+ 1e-13
          (exp
           (for/sum ([i (in-range (length obj-ftrs))])
             (let ([ftr-diff (- (mean concept i) (list-ref obj-ftrs i))])
               (* -1 (precision concept i) (abs ftr-diff))))))))
   ;; Squared-distance (Gaussian-kernel) alternative for the inner term:
   ;;   (* -1 (precision concept i) (* ftr-diff ftr-diff))
   ;; The set of in-nesses that defines the concept's discrete distribution.
   (defmem (weighted-objects concept)
     (map
      (lambda (obj)
        (cons obj (in-ness obj concept)))
      (hash-keys object-codes)))
   ;; Drawing examples from that distribution.
   (defmem (examples concept k) (discrete (weighted-objects concept)))
   ;;;;;;;;; Observations ;;;;;;;;;;;
   (observe (examples "chien" 1) "toy_dog1")
   (observe (examples "chien" 2) "toy_dog2")
   (observe (examples "chien" 3) "toy_dog11")
   (observe (examples "chien" 4) "toy_dog12")
   (observe (examples "chien" 5) "toy_dog4")
   (observe (examples "chien" 6) "toy_dog14")
   (observe (examples "personne" 1) "toy_person1")
   (observe (examples "personne" 2) "toy_person2")
   (observe (examples "personne" 3) "toy_person11")
   (observe (examples "personne" 4) "toy_person12")
   (observe (examples "personne" 5) "toy_person4")
   (observe (examples "personne" 6) "toy_person14")
   (observe (examples "personne" 7) "toy_person23")
   (observe (examples "personne" 8) "toy_person33")
   (observe (examples "transfert" 1) "toy_giveEvt1")
   (observe (examples "transfert" 2) "toy_theft1")
   ;;;;;;;;; Query ;;;;;;;;;;;;;;;
   (vector
    (in-ness "toy_dog4" "chien")
    (in-ness "toy_dog3" "chien")
    (in-ness "toy_dog13" "chien")
    (in-ness "toy_person3" "chien")
    (in-ness "toy_person13" "chien")
    (in-ness "toy_theft1" "chien")
    (in-ness "toy_giveEvt2" "chien")
    (in-ness "rdfs_subPropertyOf" "chien")
    (in-ness "toy_dog4" "transfert")
    (in-ness "toy_dog3" "transfert")
    (in-ness "toy_dog13" "transfert")
    (in-ness "toy_person3" "transfert")
    (in-ness "toy_person13" "transfert")
    (in-ness "toy_theft1" "transfert")
    (in-ness "toy_giveEvt2" "transfert")
    (in-ness "rdfs_subPropertyOf" "transfert")
    (in-ness "toy_dog4" "personne")
    (in-ness "toy_dog3" "personne")
    (in-ness "toy_dog13" "personne")
    (in-ness "toy_person3" "personne")
    (in-ness "toy_person13" "personne")
    (in-ness "toy_theft1" "personne")
    (in-ness "toy_giveEvt2" "personne")
    (in-ness "rdfs_subPropertyOf" "personne"))))
In [22]:
(define smpls (sampler->mean bcl-sampler 50 #:burn 10000 #:thin 2000))
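The resulting smpls is the element-wise mean of 50 thinned posterior samples of the 24-element query vector: entries 1-8 are in-nesses under "chien", 9-16 under "transfert", and 17-24 under "personne", all for the same eight probe objects. A hypothetical helper (not in the original notebook) for pulling out one concept's block:
;; Hypothetical helper: block 0 = "chien", 1 = "transfert", 2 = "personne".
(define (concept-block means k)
  (take (drop (vector->list means) (* 8 k)) 8))
(concept-block smpls 0) then gives exactly the list plotted in the first chart below.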
As we expect (or hope), generalization of "chien" (the first chart below) is strongest to the dogs (the left-most three bars, only the first of which was an example), and next-strongest to the people (the next two bars to the right). Generalization is very low to a theft event and a gift event (the next two bars), and non-existent to the abstract property "subPropertyOf".
This is significant, if simple, learning: generalization from just a few positive examples, with no negative examples at all, using a model no more complex than logistic regression.
In [19]:
(bar-c3-categorical
 (list 1 2 3 4 5 6 7 8)
 (take (vector->list smpls) 8)
 (list "dog4" "dog3" "dog13" "person3" "person13" "theft1" "gift2" "subPropertyOf")
 #:xlabel "object"
 #:ylabel "mean in-ness")
Out[19]: [bar chart: mean in-ness under "chien" for the eight probe objects]
In [20]:
(bar-c3-categorical
 (list 1 2 3 4 5 6 7 8)
 (take (drop (vector->list smpls) 8) 8)
 (list "dog4" "dog3" "dog13" "person3" "person13" "theft1" "gift2" "subPropertyOf")
 #:xlabel "object"
 #:ylabel "mean in-ness")
Out[20]: [bar chart: mean in-ness under "transfert" for the eight probe objects]
In [21]:
(bar-c3-categorical
 (list 1 2 3 4 5 6 7 8)
 (drop (vector->list smpls) 16)
 (list "dog4" "dog3" "dog13" "person3" "person13" "theft1" "gift2" "subPropertyOf")
 #:xlabel "object"
 #:ylabel "mean in-ness")
Out[21]: [bar chart: mean in-ness under "personne" for the eight probe objects]