In [2]:
import __init__

import pandas as pd

from bin.splitLogFile import extractSummaryLine

Experiment

Using WordNet as ground truth, we try to learn a classifier that detects taxonomic relations between words.
We use the animal domain here, with relations such as feline > cat or animal > mammal.

We try two approaches here:

  • direct parent
  • belonging to a common branch

To do so, we explore the Cartesian product of:

  • full: whether to check for a direct parent or for membership of a common branch
  • strict: try to compose missing concepts
  • randomForest / knn: KNN lets us check whether there is anything consistent to learn; RandomForest is a basic model used as a first approach to learning the function
  • feature: one of the features presented in the guided tour
  • postFeature: any extra processing applied after feature extraction (such as normalisation)
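
The grid described above can be enumerated with itertools.product. This is only a minimal sketch, not the code of experiment/trainAll_taxoClf.py: the option values are those that appear in the summary table and in the model file name used later in this notebook, while the variable names, the configName helper, and the assumption that every combination is actually run are illustrative.

```python
from itertools import product

# Option values as they appear in the summary table; whether the real
# training script enumerates exactly this grid is an assumption.
datasets = ["animal", "animalAll"]   # direct parent / common branch
strictModes = ["", "strict"]         # empty string = not strict
classifiers = ["KNeighborsClassifier", "RandomForestClassifier"]
features = ["concatAngular", "concatCarth", "concatPolar",
            "subAngular", "subCarth", "subPolar"]
postFeatures = ["noPost", "postNormalize"]

# One tuple per experiment: the Cartesian product of all options.
configs = list(product(datasets, strictModes, classifiers, features, postFeatures))


def configName(config):
    """Hypothetical helper: join the options the way the .dill file names look."""
    return "_".join(config)


print(len(configs))
print(configName(("animal", "", "RandomForestClassifier", "subPolar", "noPost")))
```

Under these assumptions the last line prints animal__RandomForestClassifier_subPolar_noPost, which is the stem of the model file loaded in the error study.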

We use 10-fold cross-validation.
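
As a reminder of what a 10-fold split does, here is a small pure-Python sketch. It does not claim to be how the training script splits the data (scikit-learn's KFold would be a common choice); the function name, the seed, and the shuffling step are assumptions made for illustration.

```python
import random


def kFoldIndices(nSamples, nFolds=10, seed=0):
    """Yield (trainIdx, testIdx) pairs; every sample is tested exactly once."""
    indices = list(range(nSamples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every nFolds-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::nFolds] for i in range(nFolds)]
    for i, testIdx in enumerate(folds):
        # Training set = all folds except the one held out for testing.
        trainIdx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield trainIdx, testIdx


splits = list(kFoldIndices(25))
for trainIdx, testIdx in splits:
    print(len(trainIdx), len(testIdx))
```

Each pair would then be scored separately and the ten scores averaged.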

Negative samples are generated by shuffling pairs.
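
One way to read "shuffling pairs" is to keep every child and re-pair it with a parent drawn at random from the other pairs. The sketch below follows that reading; it is an assumption about the procedure, and the function name, the seed, and the filtering of accidental true pairs are illustrative rather than taken from the project's scripts.

```python
import random


def shufflePairs(positivePairs, seed=0):
    """Build negative pairs by re-pairing each child with a shuffled parent."""
    rng = random.Random(seed)
    known = set(positivePairs)
    parents = [parent for parent, _ in positivePairs]
    children = [child for _, child in positivePairs]
    rng.shuffle(parents)
    # Drop pairs that happen to be true relations, and self-pairs.
    return [(p, c) for p, c in zip(parents, children)
            if (p, c) not in known and p != c]


positives = [("feline", "cat"), ("animal", "mammal"),
             ("pike", "pickerel"), ("heron", "egret")]
negatives = shufflePairs(positives)
print(negatives)
```

Note that a shuffled pair can still link a concept to a more distant ancestor on the same branch, so a filter on direct pairs alone does not guarantee that every negative is unrelated.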

We restrict ourselves to the animal domain here.

Once you have downloaded the files, you can use this script to reproduce the experiment:

python experiment/trainAll_taxoClf.py > ../data/learnedModel/taxo/log.txt

To build other taxonomy datasets, you can use the following script:

python script/extractWordnetTaxo.py vehicle > /tmp/wordnetTaxo_vehicle.txt

Results

Here is a summary of the results we gathered. Detailed reports can be found in the logs.


In [29]:
# Parse each line of the summary file into one row of results.
with open('../../data/learnedModel/taxo/summary.txt') as summaryFile:
    summaryDf = pd.DataFrame([extractSummaryLine(l) for l in summaryFile],
                             columns=['full', 'strict', 'clf', 'feature', 'post', 'precision', 'recall', 'f1'])

summaryDf = summaryDf[summaryDf['clf'] != 'KNeighborsClassifier']

directDf = summaryDf[summaryDf['full'] == 'animal']
belongDf = summaryDf[summaryDf['full'] == 'animalAll']

Considering the nature of the data (semantically close concepts such as puppy and dog lie very close together), KNN is not relevant, which is why its rows are filtered out above.

Since the functions learned for common-branch membership and for direct parent are very different in nature, we study their results separately.

Direct parent

Results


In [30]:
directDf.sort_values('f1', ascending=False)[:10]


Out[30]:
    full    strict  clf                     feature        post           precision  recall  f1
70  animal          RandomForestClassifier  subPolar       noPost         0.843      0.843   0.843
66  animal          RandomForestClassifier  subAngular     noPost         0.837      0.836   0.836
68  animal          RandomForestClassifier  subCarth       noPost         0.828      0.825   0.825
64  animal          RandomForestClassifier  concatPolar    noPost         0.814      0.813   0.813
62  animal          RandomForestClassifier  concatCarth    noPost         0.812      0.808   0.808
65  animal          RandomForestClassifier  concatPolar    postNormalize  0.804      0.803   0.803
90  animal  strict  RandomForestClassifier  subAngular     noPost         0.807      0.803   0.803
63  animal          RandomForestClassifier  concatCarth    postNormalize  0.802      0.798   0.798
84  animal  strict  RandomForestClassifier  concatAngular  noPost         0.8        0.797   0.797
60  animal          RandomForestClassifier  concatAngular  noPost         0.796      0.796   0.795

We observe a fairly good F1 score (0.843) for RandomForest with polar subtraction and no normalisation.
Skipping normalisation is not a problem here, since decision-tree-based algorithms are insensitive to feature scaling.

Angular subtraction also gives good results (F1 of 0.836), which supports the idea that the angle of a concept vector indicates its semantic direction.
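
For reference, the precision, recall and F1 columns relate to each other as in the sketch below, written for a single positive class. It is illustrative only: the label names are those printed in the error listing, the toy labels are made up, and the summary table may well report a value averaged over both classes, which this notebook does not state.

```python
def precisionRecallF1(trueLabels, predictedLabels, positive="isParent"):
    """Precision, recall and F1 for one class, from paired label lists."""
    pairs = list(zip(trueLabels, predictedLabels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels (made up): 2 true positives, 1 false positive, 1 false negative.
trueLabels = ["isParent", "isParent", "isParent", "notParent", "notParent"]
predicted = ["isParent", "isParent", "notParent", "isParent", "notParent"]
precision, recall, f1 = precisionRecallF1(trueLabels, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```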

Error study

Here is the detail of the errors:

  • False positives, i.e. an incorrectly predicted direct parent relation. Possible reasons:
    • wrong branch
    • correct branch, but not the direct parent
  • False negatives, i.e. a parent relation that was not detected
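
The script run in the next cell prints one line per misclassified pair, in a regular enough shape to be tallied with a small parser. The sketch below assumes the line format stays exactly as printed there; the regular expression, the function name and the record keys are illustrative and not part of the toolbox. The three sample lines are copied from that output.

```python
import re
from collections import Counter

# Matches lines such as:
# input: (feline, '>', cat)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44  0.56]
LINE_RE = re.compile(
    r"input: \((?P<parent>[^,]+), '>', (?P<child>[^)]+)\)\s+/\s+"
    r"predicted: (?P<predicted>\w+)\s+/\s+true: (?P<true>\w+)\s+/\s+"
    r"proba:\[\s*(?P<p0>[\d.]+)\s+(?P<p1>[\d.]+)\s*\]"
)


def parseErrorLine(line):
    """Return a dict for an error line, or None for any other line."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["proba"] = (float(record.pop("p0")), float(record.pop("p1")))
    return record


sample = [
    "1388424 loaded from wikiEn-skipgram",
    "input: (feline, '>', cat)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44343074  0.55656926]",
    "input: (bird, '>', eel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50950831  0.49049169]",
]
records = [r for r in map(parseErrorLine, sample) if r is not None]
# A missed true relation is a false negative; a predicted fake relation is a false positive.
errorCounts = Counter(
    "falseNegative" if r["true"] == "isParent" else "falsePositive" for r in records
)
print(errorCounts)
```

In the printed output the two probabilities appear to be ordered (isParent, notParent), which is consistent with the predicted label on every line.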

In [24]:
!python ../../toolbox/script/detailConceptPairClfError.py ../../data/voc/npy/wikiEn-skipgram.npy ../../data/learnedModel/taxo/animal__RandomForestClassifier_subPolar_noPost.dill ../../data/wordPair/wordnetTaxo_animal.txt isParent ../../data/wordPair/wordnetTaxo_animal_fake.txt notParent


1388424 loaded from wikiEn-skipgram
mem usage 1.6GiB
loaded time 2.35443711281 s
input: (chordate, '>', cephalochordate)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48115061  0.51884939]
input: (chordate, '>', tunicate)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.3998105  0.6001895]
input: (salamander, '>', newt)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47239304  0.52760696]
input: (lungfish, '>', ceratodus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48341822  0.51658178]
input: (sturgeon, '>', beluga)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44348526  0.55651474]
input: (characin, '>', piranha)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40517024  0.59482976]
input: (cyprinid, '>', goldfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47709308  0.52290692]
input: (goldfish, '>', silverfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46178254  0.53821746]
input: (killifish, '>', flagfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46683791  0.53316209]
input: (eel, '>', tuna)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45109075  0.54890925]
input: (salmonid, '>', char)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49771768  0.50228232]
input: (jack, '>', kingfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37587086  0.62412914]
input: (jack, '>', leatherjacket)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49220581  0.50779419]
input: (jack, '>', rudderfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37651949  0.62348051]
input: (jack, '>', threadfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41396697  0.58603303]
input: (jack, '>', yellowtail)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.38970987  0.61029013]
input: (moonfish, '>', lookdown)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47782711  0.52217289]
input: (pike, '>', muskellunge)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43403868  0.56596132]
input: (pike, '>', pickerel)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39904773  0.60095227]
input: (whiting, '>', kingfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46893271  0.53106729]
input: (wrasse, '>', hogfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40703463  0.59296537]
input: (wrasse, '>', pigfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45037032  0.54962968]
input: (poacher, '>', alligatorfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42000073  0.57999927]
input: (rockfish, '>', rosefish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48993631  0.51006369]
input: (sculpin, '>', bullhead)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46421108  0.53578892]
input: (ray, '>', manta)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45032047  0.54967953]
input: (ray, '>', skate)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.36368712  0.63631288]
input: (vertebrate, '>', bird)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37025799  0.62974201]
input: (auk, '>', auklet)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42472906  0.57527094]
input: (auk, '>', razorbill)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45192012  0.54807988]
input: (jaeger, '>', skua)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.3748217  0.6251783]
input: (swan, '>', cob)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49678772  0.50321228]
input: (swan, '>', trumpeter)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4797649  0.5202351]
input: (heron, '>', bittern)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41212994  0.58787006]
input: (heron, '>', boatbill)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.411793  0.588207]
input: (heron, '>', egret)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43889575  0.56110425]
input: (crake, '>', corncrake)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46261545  0.53738455]
input: (rail, '>', notornis)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47232949  0.52767051]
input: (rail, '>', weka)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46203487  0.53796513]
input: (sandpiper, '>', sanderling)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46081496  0.53918504]
input: (tattler, '>', willet)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48855671  0.51144329]
input: (duck, '>', duckling)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48955418  0.51044582]
input: (duck, '>', widgeon)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49344602  0.50655398]
input: (hawk, '>', buteonine)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46161774  0.53838226]
input: (falcon, '>', gyrfalcon)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43089664  0.56910336]
input: (goatsucker, '>', whippoorwill)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40178391  0.59821609]
input: (kingfisher, '>', kookaburra)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48382754  0.51617246]
input: (cuckoo, '>', ani)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44008436  0.55991564]
input: (pigeon, '>', dove)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46549218  0.53450782]
input: (lory, '>', lorikeet)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44684555  0.55315445]
input: (finch, '>', canary)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45670787  0.54329213]
input: (lark, '>', skylark)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45135234  0.54864766]
input: (starling, '>', myna)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.36645652  0.63354348]
input: (nightingale, '>', bulbul)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.3571031  0.6428969]
input: (thrush, '>', solitaire)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45267228  0.54732772]
input: (kinglet, '>', goldcrest)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48939165  0.51060835]
input: (weaver, '>', avadavat)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45634782  0.54365218]
input: (weaver, '>', baya)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47778077  0.52221923]
input: (fetus, '>', monster)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42179055  0.57820945]
input: (phalanger, '>', koala)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48688032  0.51311968]
input: (porpoise, '>', vaquita)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48653252  0.51346748]
input: (placental, '>', carnivore)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.36395872  0.63604128]
input: (carnivore, '>', canine)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4368822  0.5631178]
input: (coonhound, '>', coondog)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41897177  0.58102823]
input: (canine, '>', fox)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41406306  0.58593694]
input: (canine, '>', wolf)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49293131  0.50706869]
input: (coyote, '>', coydog)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41397777  0.58602223]
input: (carnivore, '>', feline)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46899135  0.53100865]
input: (feline, '>', cat)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44343074  0.55656926]
input: (wildcat, '>', margay)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.488928  0.511072]
input: (marten, '>', sable)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47265833  0.52734167]
input: (placental, '>', lagomorph)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42986939  0.57013061]
input: (pachyderm, '>', elephant)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37633418  0.62366582]
input: (elephant, '>', gomphothere)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47912314  0.52087686]
input: (placental, '>', primate)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45348048  0.54651952]
input: (gorilla, '>', silverback)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42001112  0.57998888]
input: (lemur, '>', potto)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43303178  0.56696822]
input: (langur, '>', entellus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.29679314  0.70320686]
input: (elephant, '>', gomphothere)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47912314  0.52087686]
input: (placental, '>', ungulate)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.38378765  0.61621235]
input: (bovid, '>', antelope)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48858069  0.51141931]
input: (antelope, '>', gnu)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46854381  0.53145619]
input: (antelope, '>', nilgai)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47388905  0.52611095]
input: (waterbuck, '>', kob)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43538194  0.56461806]
input: (waterbuck, '>', lechwe)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.34558927  0.65441073]
input: (brahman, '>', zebu)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47800508  0.52199492]
input: (cattle, '>', grade)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48536161  0.51463839]
input: (ox, '>', aurochs)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48669727  0.51330273]
input: (ox, '>', banteng)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43426932  0.56573068]
input: (giraffe, '>', okapi)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49989336  0.50010664]
input: (hog, '>', porker)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49826454  0.50173546]
input: (horse, '>', racehorse)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46526024  0.53473976]
input: (warhorse, '>', charger)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48053883  0.51946117]
input: (warhorse, '>', steed)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46155201  0.53844799]
input: (theropod, '>', carnosaur)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47583497  0.52416503]
input: (ceratosaur, '>', coelophysis)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48666269  0.51333731]
input: (maniraptor, '>', deinonychus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48376826  0.51623174]
input: (ornithomimid, '>', deinocheirus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44993418  0.55006582]
input: (lizard, '>', iguanid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48415174  0.51584826]
input: (whiptail, '>', racerunner)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41661119  0.58338881]
input: (boa, '>', python)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43413586  0.56586414]
input: (cobra, '>', asp)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43837126  0.56162874]
input: (cobra, '>', hamadryad)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47128658  0.52871342]
input: (rattlesnake, '>', massasauga)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48144849  0.51855151]
input: (rattlesnake, '>', sidewinder)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47818473  0.52181527]
input: (coonhound, '>', coondog)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41897177  0.58102823]
input: (acarine, '>', mite)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46235592  0.53764408]
input: (mite, '>', trombiculid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39776716  0.60223284]
input: (calosoma, '>', searcher)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4680815  0.5319185]
input: (ladybug, '>', vedalia)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47190016  0.52809984]
input: (fly, '>', blowfly)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.31416357  0.68583643]
input: (fly, '>', gadfly)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41024383  0.58975617]
input: (gadfly, '>', botfly)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4149153  0.5850847]
input: (gadfly, '>', horsefly)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39114758  0.60885242]
input: (fly, '>', housefly)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45212844  0.54787156]
input: (gnat, '>', midge)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.368165  0.631835]
input: (flea, '>', sticktight)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49752571  0.50247429]
input: (bee, '>', drone)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47556297  0.52443703]
input: (wasp, '>', vespid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39452385  0.60547615]
input: (nymphalid, '>', comma)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48346999  0.51653001]
input: (fritillary, '>', silverspot)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49367799  0.50632201]
input: (moth, '>', gelechiid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4099836  0.5900164]
input: (coelenterate, '>', anthozoan)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47416823  0.52583177]
input: (coral, '>', gorgonian)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40982447  0.59017553]
input: (coelenterate, '>', polyp)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45858288  0.54141712]
input: (bivalve, '>', clam)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48142431  0.51857569]
input: (mollusk, '>', cephalopod)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41335634  0.58664366]
input: (octopod, '>', octopus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48308009  0.51691991]
input: (seasnail, '>', triton)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47387455  0.52612545]
input: (annelid, '>', oligochaete)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45693882  0.54306118]
input: (polychaete, '>', bloodworm)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42179991  0.57820009]
input: (tapeworm, '>', echinococcus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45642121  0.54357879]
input: (caterpillar, '>', cankerworm)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4777071  0.5222929]
input: (caterpillar, '>', webworm)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47274521  0.52725479]
input: (grub, '>', maggot)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.32750701  0.67249299]
input: (larva, '>', wiggler)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40485879  0.59514121]
input: (male, '>', colt)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41629921  0.58370079]
input: (racer, '>', finisher)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48193384  0.51806616]
input: (pangolin, '>', salamander)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58986846  0.41013154]
input: (chelonian, '>', ganoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60204224  0.39795776]
input: (phoronid, '>', ganoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54601246  0.45398754]
input: (mylodontid, '>', ganoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60349338  0.39650662]
input: (basilisk, '>', sturgeon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55325532  0.44674468]
input: (adelie, '>', pipefish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56719101  0.43280899]
input: (geometrid, '>', catfish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55484271  0.44515729]
input: (sticktight, '>', sardine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50828877  0.49171123]
input: (louse, '>', sucker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56936519  0.43063481]
input: (bird, '>', eel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50950831  0.49049169]
input: (snake, '>', eel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56162946  0.43837054]
input: (schistosome, '>', cod)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54887911  0.45112089]
input: (wireworm, '>', cod)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67555162  0.32444838]
input: (embryo, '>', salmon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65682469  0.34317531]
input: (oscine, '>', salmon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59208885  0.40791115]
input: (muntjac, '>', smelt)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51204254  0.48795746]
input: (zoophyte, '>', tarpon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51309067  0.48690933]
input: (puppy, '>', whitefish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55666257  0.44333743]
input: (parakeet, '>', prickleback)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6096599  0.3903401]
input: (puffin, '>', jack)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65762121  0.34237879]
input: (bull, '>', jack)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62254112  0.37745888]
input: (schipperke, '>', jack)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54752474  0.45247526]
input: (keeshond, '>', jack)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63542071  0.36457929]
input: (octopod, '>', cichlid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50225617  0.49774383]
input: (passerine, '>', mullet)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50167969  0.49832031]
input: (affenpinscher, '>', pike)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55036702  0.44963298]
input: (botfly, '>', porgy)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55158491  0.44841509]
input: (spawner, '>', snapper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65558835  0.34441165]
input: (quagga, '>', wrasse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.7063991  0.2936009]
input: (swiftlet, '>', filefish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52085793  0.47914207]
input: (macaw, '>', sculpin)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64676365  0.35323635]
input: (mackerel, '>', silversides)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.75241649  0.24758351]
input: (millipede, '>', ray)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64440599  0.35559401]
input: (dromaeosaur, '>', sardine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65423955  0.34576045]
input: (ceratodus, '>', snapper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.548448  0.451552]
input: (housefly, '>', snapper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56188554  0.43811446]
input: (cock, '>', auk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.69777429  0.30222571]
input: (metazoan, '>', auk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50918624  0.49081376]
input: (aurochs, '>', auk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70710973  0.29289027]
input: (frog, '>', jaeger)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52273094  0.47726906]
input: (brachyuran, '>', gull)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55967223  0.44032777]
input: (leopardess, '>', petrel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50487073  0.49512927]
input: (hominoid, '>', gannet)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5890574  0.4109426]
input: (carnivore, '>', swan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70304984  0.29695016]
input: (harpy, '>', rail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52881138  0.47118862]
input: (gyrfalcon, '>', shorebird)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61391833  0.38608167]
input: (pachyderm, '>', plover)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56589753  0.43410247]
input: (piddock, '>', shorebird)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56700035  0.43299965]
input: (ceratosaur, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67367462  0.32632538]
input: (bovine, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55921622  0.44078378]
input: (billfish, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57180498  0.42819502]
input: (conch, '>', tattler)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5379045  0.4620955]
input: (blowfly, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58866316  0.41133684]
input: (auk, '>', snipe)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53000627  0.46999373]
input: (toucanet, '>', dowitcher)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53102427  0.46897573]
input: (cow, '>', shorebird)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62841037  0.37158963]
input: (dimetrodon, '>', duck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50927705  0.49072295]
input: (chipmunk, '>', merganser)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57301203  0.42698797]
input: (bluefin, '>', merganser)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57598949  0.42401051]
input: (budgerigar, '>', duck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57899576  0.42100424]
input: (margay, '>', duck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51469112  0.48530888]
input: (mara, '>', teal)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50970175  0.49029825]
input: (creeper, '>', teal)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54756139  0.45243861]
input: (marsupial, '>', duck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57494163  0.42505837]
input: (rodent, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68006174  0.31993826]
input: (staghound, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58309504  0.41690496]
input: (diapsida, '>', falcon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54893179  0.45106821]
input: (primate, '>', falcon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53482631  0.46517369]
input: (thecodont, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54711078  0.45288922]
input: (redshank, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57985039  0.42014961]
input: (weimaraner, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65226465  0.34773535]
input: (beagle, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53180397  0.46819603]
input: (zooplankton, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52465378  0.47534622]
input: (taipan, '>', goatsucker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52864182  0.47135818]
input: (mudder, '>', goatsucker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55054868  0.44945132]
input: (serin, '>', goatsucker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5042723  0.4957277]
input: (vermin, '>', dove)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51972881  0.48027119]
input: (boar, '>', cock)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61769931  0.38230069]
input: (woodcock, '>', hen)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54937999  0.45062001]
input: (zebra, '>', phasianid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50992058  0.49007942]
input: (flatworm, '>', megapode)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68500806  0.31499194]
input: (vaquita, '>', parrot)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53872121  0.46127879]
input: (quail, '>', lory)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52016405  0.47983595]
input: (hatchling, '>', parrot)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56909216  0.43090784]
input: (hellbender, '>', parrot)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60868155  0.39131845]
input: (pterosaur, '>', passerine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59280205  0.40719795]
input: (oviraptorid, '>', oscine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51154871  0.48845129]
input: (blackbuck, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55001581  0.44998419]
input: (deinonychus, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67230368  0.32769632]
input: (vervet, '>', bunting)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66922447  0.33077553]
input: (moa, '>', bunting)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56624286  0.43375714]
input: (tuatara, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5269566  0.4730434]
input: (ormer, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52558855  0.47441145]
input: (abalone, '>', grosbeak)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6223373  0.3776627]
input: (struthiomimus, '>', towhee)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67011882  0.32988118]
input: (saturniid, '>', starling)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.69083606  0.30916394]
input: (waxwing, '>', swallow)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5306061  0.4693939]
input: (hamster, '>', thrush)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55542852  0.44457148]
input: (leporid, '>', thrush)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52493111  0.47506889]
input: (ratite, '>', titmouse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66727244  0.33272756]
input: (ichthyosaur, '>', warbler)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59755929  0.40244071]
input: (kob, '>', weaver)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59787331  0.40212669]
input: (tetrapod, '>', weaver)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.81179567  0.18820433]
input: (lagomorph, '>', weaver)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64382734  0.35617266]
input: (gastropod, '>', passerine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50776503  0.49223497]
input: (narwhal, '>', woodpecker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54531503  0.45468497]
input: (filaria, '>', fetus)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5549434  0.4450566]
input: (tapeworm, '>', phalanger)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5234185  0.4765815]
input: (gomphothere, '>', marsupial)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50453813  0.49546187]
input: (crappie, '>', buck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59570382  0.40429618]
input: (ornithomimid, '>', placental)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52000678  0.47999322]
input: (tyrannosaur, '>', coonhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59476963  0.40523037]
input: (blastula, '>', hound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50442934  0.49557066]
input: (wirehair, '>', leopard)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51032548  0.48967452]
input: (tigress, '>', leopard)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58170899  0.41829101]
input: (rudderfish, '>', tom)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55201756  0.44798244]
input: (pleurodont, '>', wildcat)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52384126  0.47615874]
input: (deer, '>', lynx)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64462592  0.35537408]
input: (protoceratops, '>', wildcat)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70773347  0.29226653]
input: (cow, '>', marten)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56218722  0.43781278]
input: (lioness, '>', civet)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59960469  0.40039531]
input: (dinosaur, '>', placental)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50191264  0.49808736]
input: (odonate, '>', leporid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54362513  0.45637487]
input: (glowworm, '>', lagomorph)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55962815  0.44037185]
input: (theropod, '>', australopithecine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68159219  0.31840781]
input: (eoraptor, '>', baboon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52550512  0.47449488]
input: (tarsier, '>', colobus)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.69644531  0.30355469]
input: (minnow, '>', macaque)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51900207  0.48099793]
input: (constrictor, '>', elephant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66213936  0.33786064]
input: (metatherian, '>', squirrel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61315328  0.38684672]
input: (homeotherm, '>', placental)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51704026  0.48295974]
input: (instar, '>', antelope)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56045774  0.43954226]
input: (jaguarundi, '>', waterbuck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56233271  0.43766729]
input: (mammal, '>', bovine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55616294  0.44383706]
input: (ladyfish, '>', cattle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50946523  0.49053477]
input: (squid, '>', beef)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50485758  0.49514242]
input: (hog, '>', cattle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62734208  0.37265792]
input: (arthropod, '>', deer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58096172  0.41903828]
input: (bovid, '>', ass)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68721  0.31279]
input: (oligochaete, '>', stallion)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5565948  0.4434052]
input: (scorpion, '>', warhorse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59035851  0.40964149]
input: (whale, '>', turtle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51912684  0.48087316]
input: (reptile, '>', ceratopsian)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.78961097  0.21038903]
input: (longhorn, '>', ornithischian)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57030102  0.42969898]
input: (predator, '>', maniraptor)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51963104  0.48036896]
input: (hexapod, '>', maniraptor)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55810285  0.44189715]
input: (mamba, '>', maniraptor)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51360635  0.48639365]
input: (arachnid, '>', diapsid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67463965  0.32536035]
input: (bloodworm, '>', whiptail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66826614  0.33173386]
input: (saurian, '>', elapid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55553058  0.44446942]
input: (tardigrade, '>', rattlesnake)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52931126  0.47068874]
input: (wirehair, '>', terrier)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5065835  0.4934165]
input: (schnauzer, '>', dog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54592588  0.45407412]
input: (cattle, '>', spitz)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57265394  0.42734606]
input: (elapid, '>', watchdog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68780364  0.31219636]
input: (oystercatcher, '>', watchdog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6357634  0.3642366]
input: (spider, '>', female)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.73321008  0.26678992]
input: (ichthyosaurus, '>', female)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62906478  0.37093522]
input: (young, '>', quail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54592041  0.45407959]
input: (binturong, '>', acarine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57085318  0.42914682]
input: (greenfly, '>', firefly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.71850438  0.28149562]
input: (livestock, '>', calosoma)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53268935  0.46731065]
input: (decapod, '>', beetle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54212831  0.45787169]
input: (ungulata, '>', fly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60626135  0.39373865]
input: (mite, '>', blowfly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56087931  0.43912069]
input: (pronghorn, '>', fly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60558145  0.39441855]
input: (ferret, '>', gadfly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55608208  0.44391792]
input: (kinkajou, '>', bee)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6026217  0.3973783]
input: (chestnut, '>', bee)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54454136  0.45545864]
input: (harrier, '>', wasp)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60823776  0.39176224]
input: (kuvasz, '>', butterfly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50882837  0.49117163]
input: (basenji, '>', danaid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56898132  0.43101868]
input: (larva, '>', fritillary)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65268071  0.34731929]
input: (carnivore, '>', nymphalid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52903381  0.47096619]
input: (hadrosaur, '>', butterfly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52450384  0.47549616]
input: (parrot, '>', tineoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52654759  0.47345241]
input: (therapsid, '>', worker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55157835  0.44842165]
input: (cynodont, '>', invertebrate)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53093888  0.46906112]
input: (cephalochordate, '>', hydrozoan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53411602  0.46588398]
input: (gadfly, '>', hydrozoan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59454541  0.40545459]
input: (nothosaur, '>', jellyfish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.7550655  0.2449345]
input: (scyphozoan, '>', coelenterate)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53405004  0.46594996]
input: (capybara, '>', quahog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52427678  0.47572322]
input: (pithecanthropus, '>', decapod)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51814401  0.48185599]
input: (jellyfish, '>', seasnail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64243699  0.35756301]
input: (puppy, '>', annelid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50020561  0.49979439]
input: (biped, '>', racer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59763424  0.40236576]
input: (rabbitfish, '>', racer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63118995  0.36881005]
input: (pet, '>', young)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57851434  0.42148566]
input: (bobwhite, '>', calf)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5081594  0.4918406]
input: (grade, '>', foal)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57501831  0.42498169]
input: (coonhound, '>', foal)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52868649  0.47131351]

--  REPORT  --
             precision    recall  f1-score   support

   isParent       0.86      0.89      0.87      1276
  notParent       0.89      0.85      0.87      1276

avg / total       0.87      0.87      0.87      2552

We can identify three sources of incorrect predictions here:

  • Genuinely wrong predictions
  • Words with several meanings (word2vec learns a single vector per word, so it cannot tell senses apart)
    • ray, '>', skate
    • ...
  • Pairs that are not direct parents but do belong to a correct taxonomy chain
    • reptile, '>', ceratopsian
    • ...

The third case is especially interesting:

It reflects a difference between the taxonomy used as ground truth in WordNet and the one learned by the classifier.
Since the vocabulary here is very specific, these words may occur only rarely in the corpus. This raises the following question:

What taxonomy could we build with a dedicated corpus, and how different would it be?
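To tell the third kind of error apart from genuinely wrong predictions, one option is to walk up the ground-truth taxonomy from the child and check whether the predicted parent appears further up the chain. The following is only a minimal sketch: `TOY_PARENT`, `ancestors` and `classify_false_positive` are hypothetical names, the child -> parent map is a small hand-written excerpt for illustration, and it assumes a single parent per word, which WordNet does not guarantee.

```python
# Toy child -> direct parent map (hypothetical excerpt, for illustration only).
TOY_PARENT = {
    "diapsid": "reptile",
    "archosaur": "diapsid",
    "dinosaur": "archosaur",
    "ornithischian": "dinosaur",
    "ceratopsian": "ornithischian",
    "skate": "ray",
}


def ancestors(word, parent_of):
    """Yield every ancestor of `word`, nearest first (assumes one parent per word)."""
    seen = set()
    current = parent_of.get(word)
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        yield current
        current = parent_of.get(current)


def classify_false_positive(parent, child, parent_of):
    """Label a pair that was predicted isParent although it is not a direct parent."""
    chain = list(ancestors(child, parent_of))
    if parent in chain:
        return "in chain (distance %d)" % (chain.index(parent) + 1)
    return "wrong prediction"


print(classify_false_positive("reptile", "ceratopsian", TOY_PARENT))  # in chain (distance 5)
print(classify_false_positive("tetrapod", "weaver", TOY_PARENT))      # wrong prediction
```

With the real data, the map would be built from the direct-parent pair file instead of being written by hand.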

Belonging to chain

Results


In [32]:
belongDf.sort_values('f1', ascending=False)[:10]


Out[32]:
    full       strict  clf                     feature        post           precision  recall  f1
36  animalAll  strict  RandomForestClassifier  concatAngular  noPost         0.971      0.971   0.971
38  animalAll  strict  RandomForestClassifier  concatCarth    noPost         0.971      0.971   0.971
14  animalAll          RandomForestClassifier  concatCarth    noPost         0.969      0.969   0.969
40  animalAll  strict  RandomForestClassifier  concatPolar    noPost         0.968      0.968   0.968
37  animalAll  strict  RandomForestClassifier  concatAngular  postNormalize  0.967      0.967   0.967
39  animalAll  strict  RandomForestClassifier  concatCarth    postNormalize  0.965      0.965   0.965
15  animalAll          RandomForestClassifier  concatCarth    postNormalize  0.963      0.963   0.963
41  animalAll  strict  RandomForestClassifier  concatPolar    postNormalize  0.963      0.963   0.963
12  animalAll          RandomForestClassifier  concatAngular  noPost         0.961      0.961   0.961
16  animalAll          RandomForestClassifier  concatPolar    noPost         0.961      0.961   0.961

We observe a very high f1-score with RandomForest when using concatenation features without normalisation.

Skipping normalisation is acceptable here, since decision trees are largely insensitive to the scale of their input features.

Because this dataset is much bigger, it may also contain 'almost' redundant pairs (e.g. feline > cat and feline > jungle cat).
Are we overfitting despite cross-validation?

Subtraction techniques also provide good results.
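One way to probe the overfitting question is to make sure that pairs sharing the same parent never end up on both sides of a train/test split. The sketch below is not what the experiment scripts do: `group_k_fold` is a hypothetical helper, the pairs are made up for illustration, and grouping by parent is only one possible choice (grouping by child would be another). scikit-learn's `GroupKFold` is built on the same idea.

```python
from collections import defaultdict


def group_k_fold(pairs, n_splits):
    """Yield (train_idx, test_idx) so that pairs sharing a parent share a fold."""
    by_parent = defaultdict(list)
    for idx, (parent, _child) in enumerate(pairs):
        by_parent[parent].append(idx)

    # Largest groups first, each one assigned to the currently smallest fold.
    folds = [[] for _ in range(n_splits)]
    for parent in sorted(by_parent, key=lambda p: (-len(by_parent[p]), p)):
        smallest = min(folds, key=len)
        smallest.extend(by_parent[parent])

    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, test_idx


# Hypothetical pairs, for illustration only.
pairs = [
    ("feline", "cat"), ("feline", "jungle cat"), ("feline", "lynx"),
    ("canine", "dog"), ("canine", "wolf"), ("bird", "finch"),
]

for train_idx, test_idx in group_k_fold(pairs, n_splits=3):
    train_parents = {pairs[i][0] for i in train_idx}
    test_parents = {pairs[i][0] for i in test_idx}
    assert not train_parents & test_parents  # no parent leaks across the split
    print(sorted(test_parents))
```

If the f1-score drops sharply under such a split, the near-duplicate pairs were probably inflating the cross-validation score.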

Study errors

Here are the details of:

  • False positives, i.e. pairs wrongly predicted as belonging to a chain
  • False negatives, i.e. pairs whose chain membership was not detected
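The error report printed by the script is long, so it can help to parse it and count the two error types listed above. This is a rough sketch, assuming the line layout seen in the logs of this notebook (`input: (...)  /  predicted: ...  /  true: ...  /  proba:[...]`) and that only misclassified pairs are printed; `LINE_RE` and `parse_error_line` are hypothetical names, not part of the toolbox.

```python
import re
from collections import Counter

# Pattern for one error line, based on the log layout shown in this notebook.
LINE_RE = re.compile(
    r"input: \((?P<parent>[^,]+), '>', (?P<child>[^)]+)\)\s+/\s+"
    r"predicted: (?P<predicted>\w+)\s+/\s+"
    r"true: (?P<true>\w+)\s+/\s+"
    r"proba:\[\s*(?P<p0>[\d.eE+-]+)\s+(?P<p1>[\d.eE+-]+)\s*\]"
)


def parse_error_line(line):
    """Return a dict describing one misclassified pair, or None for other lines."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    # Only errors are logged, so a predicted isParent is a false positive.
    kind = "false positive" if m.group("predicted") == "isParent" else "false negative"
    return {
        "parent": m.group("parent"),
        "child": m.group("child"),
        "kind": kind,
        "proba": (float(m.group("p0")), float(m.group("p1"))),
    }


sample = [
    "input: (pup, '>', puppy)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.33611467  0.66388533]",
    "input: (vertebrate, '>', frog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.92052145  0.07947855]",
    "mem usage 1.6GiB",
]
errors = [e for e in map(parse_error_line, sample) if e is not None]
print(Counter(e["kind"] for e in errors))
```

Sorting the parsed errors by the gap between the two probabilities then separates borderline cases from confident mistakes.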

In [34]:
!python ../../toolbox/script/detailConceptPairClfError.py ../../data/voc/npy/wikiEn-skipgram.npy ../../data/learnedModel/taxo/animalAll_strict_RandomForestClassifier_concatAngular_noPost.dill ../../data/wordPair/wordnetTaxo_animal_all.txt isParent ../../data/wordPair/wordnetTaxo_animal_fake.txt notParent


1388424 loaded from wikiEn-skipgram
mem usage 1.6GiB
loaded time 2.1897380352 s
input: (cyprinid, '>', silverfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4502493  0.5497507]
input: (topminnow, '>', mollie)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49513321  0.50486679]
input: (perch, '>', walleye)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45020875  0.54979125]
input: (pike, '>', pickerel)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49210614  0.50789386]
input: (billfish, '>', spearfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48102727  0.51897273]
input: (grouper, '>', coney)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47348371  0.52651629]
input: (snapper, '>', yellowtail)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47417262  0.52582738]
input: (wrasse, '>', bluehead)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.38904442  0.61095558]
input: (remora, '>', sharksucker)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47041175  0.52958825]
input: (poacher, '>', alligatorfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4162468  0.5837532]
input: (sculpin, '>', bullhead)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48642357  0.51357643]
input: (silversides, '>', jacksmelt)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37747154  0.62252846]
input: (ray, '>', guitarfish)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48394773  0.51605227]
input: (ray, '>', stingray)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47751349  0.52248651]
input: (grouper, '>', coney)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47348371  0.52651629]
input: (snapper, '>', yellowtail)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47417262  0.52582738]
input: (guillemot, '>', murre)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.30093941  0.69906059]
input: (swan, '>', whooper)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43840416  0.56159584]
input: (heron, '>', bittern)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46144247  0.53855753]
input: (stork, '>', openbill)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.36593944  0.63406056]
input: (merganser, '>', goosander)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48173726  0.51826274]
input: (eagle, '>', ringtail)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49757947  0.50242053]
input: (vulture, '>', condor)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48000404  0.51999596]
input: (cock, '>', gamecock)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49727638  0.50272362]
input: (dove, '>', turtledove)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49943028  0.50056972]
input: (grosbeak, '>', hawfinch)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49422647  0.50577353]
input: (thrush, '>', blackbird)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47901174  0.52098826]
input: (thrush, '>', wheatear)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43508468  0.56491532]
input: (titmouse, '>', chickadee)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47894711  0.52105289]
input: (warbler, '>', gnatcatcher)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37384652  0.62615348]
input: (woodpecker, '>', sapsucker)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49955474  0.50044526]
input: (wallaby, '>', pademelon)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4638217  0.5361783]
input: (porpoise, '>', vaquita)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47055912  0.52944088]
input: (cur, '>', feist)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49818598  0.50181402]
input: (spitz, '>', samoyed)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43441657  0.56558343]
input: (lynx, '>', caracal)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44406444  0.55593556]
input: (ermine, '>', stoat)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39367086  0.60632914]
input: (civet, '>', binturong)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41252058  0.58747942]
input: (armadillo, '>', peba)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45181684  0.54818316]
input: (insectivore, '>', hedgehog)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40258503  0.59741497]
input: (hare, '>', leveret)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41787744  0.58212256]
input: (pachyderm, '>', gomphothere)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48430006  0.51569994]
input: (pachyderm, '>', mammoth)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.34589535  0.65410465]
input: (ape, '>', siamang)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4096729  0.5903271]
input: (australopithecine, '>', paranthropus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.3915076  0.6084924]
input: (hominoid, '>', proconsul)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44531394  0.55468606]
input: (marmoset, '>', tamarin)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39545306  0.60454694]
input: (baboon, '>', chacma)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49306699  0.50693301]
input: (macaque, '>', rhesus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41637487  0.58362513]
input: (proboscidean, '>', mastodon)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49960519  0.50039481]
input: (rodent, '>', capybara)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47704544  0.52295456]
input: (gerbil, '>', jird)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48948586  0.51051414]
input: (oryx, '>', gemsbok)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48519094  0.51480906]
input: (brahman, '>', zebu)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.37406326  0.62593674]
input: (bovine, '>', cattle)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40332432  0.59667568]
input: (ox, '>', aurochs)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45717548  0.54282452]
input: (goat, '>', kid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46881962  0.53118038]
input: (wether, '>', bellwether)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4344865  0.5655135]
input: (racehorse, '>', pacer)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45230984  0.54769016]
input: (workhorse, '>', shire)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49616444  0.50383556]
input: (chelonian, '>', tortoise)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49595122  0.50404878]
input: (reptile, '>', diapsid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.36923807  0.63076193]
input: (ceratosaur, '>', coelophysis)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46934411  0.53065589]
input: (ichthyosaur, '>', ichthyosaurus)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41308282  0.58691718]
input: (therapsid, '>', dicynodont)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.40173763  0.59826237]
input: (cur, '>', feist)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49818598  0.50181402]
input: (spitz, '>', samoyed)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43441657  0.56558343]
input: (game, '>', phasianid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39053849  0.60946151]
input: (mite, '>', trombiculid)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48621024  0.51378976]
input: (acarine, '>', tick)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43371063  0.56628937]
input: (spider, '>', tarantula)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41319031  0.58680969]
input: (isopod, '>', woodlouse)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.4494484  0.5505516]
input: (beetle, '>', weevil)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41220759  0.58779241]
input: (mosquito, '>', gnat)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.48197905  0.51802095]
input: (flea, '>', chigoe)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.43641939  0.56358061]
input: (lycaenid, '>', blue)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49210732  0.50789268]
input: (lycaenid, '>', copper)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41620462  0.58379538]
input: (lycaenid, '>', hairstreak)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44026771  0.55973229]
input: (worker, '>', soldier)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.41270075  0.58729925]
input: (anthozoan, '>', coral)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49340941  0.50659059]
input: (anthozoan, '>', gorgonian)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.45870817  0.54129183]
input: (coral, '>', gorgonian)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.26735843  0.73264157]
input: (hydrozoan, '>', planula)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.46208677  0.53791323]
input: (hydrozoan, '>', siphonophore)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.42400844  0.57599156]
input: (clam, '>', teredo)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49258118  0.50741882]
input: (worm, '>', acanthocephalan)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.39321995  0.60678005]
input: (annelid, '>', earthworm)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49417104  0.50582896]
input: (oligochaete, '>', earthworm)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.3971831  0.6028169]
input: (caterpillar, '>', cankerworm)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.44874347  0.55125653]
input: (pest, '>', vermin)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.47882595  0.52117405]
input: (predator, '>', carnivore)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.1096551  0.8903449]
input: (racer, '>', greyhound)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.32721279  0.67278721]
input: (racer, '>', steeplechaser)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.49039011  0.50960989]
input: (pup, '>', puppy)  /  predicted: notParent  /  true: isParent  /  proba:[ 0.33611467  0.66388533]
input: (vertebrate, '>', frog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.92052145  0.07947855]
input: (trombiculid, '>', salamander)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5523919  0.4476081]
input: (chelonian, '>', ganoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54314269  0.45685731]
input: (adelie, '>', pipefish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50777785  0.49222215]
input: (invertebrate, '>', topminnow)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.98200622  0.01799378]
input: (silkworm, '>', eel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56740679  0.43259321]
input: (bird, '>', eel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.89661101  0.10338899]
input: (snake, '>', eel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60571311  0.39428689]
input: (schistosome, '>', cod)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51546078  0.48453922]
input: (wireworm, '>', cod)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70684097  0.29315903]
input: (sole, '>', hake)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50681198  0.49318802]
input: (chow, '>', ribbonfish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62715704  0.37284296]
input: (oscine, '>', salmon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.94145273  0.05854727]
input: (puppy, '>', whitefish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66029049  0.33970951]
input: (architeuthis, '>', butterfish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60933  0.39067]
input: (plover, '>', jack)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51276598  0.48723402]
input: (octopod, '>', cichlid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53757073  0.46242927]
input: (sauropod, '>', grunt)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59455304  0.40544696]
input: (bull, '>', grunt)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5682822  0.4317178]
input: (passerine, '>', mullet)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.86100401  0.13899599]
input: (skate, '>', mullet)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55216063  0.44783937]
input: (corbina, '>', robalo)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51496314  0.48503686]
input: (hominid, '>', scombroid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59088643  0.40911357]
input: (cardigan, '>', mackerel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54655822  0.45344178]
input: (lugworm, '>', tuna)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58965317  0.41034683]
input: (massasauga, '>', porgy)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55375746  0.44624254]
input: (botfly, '>', porgy)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53091963  0.46908037]
input: (synapsid, '>', grouper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55771909  0.44228091]
input: (canine, '>', snapper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.81393682  0.18606318]
input: (sinanthropus, '>', wrasse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50987877  0.49012123]
input: (swiftlet, '>', filefish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63252727  0.36747273]
input: (insectivore, '>', remora)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52175669  0.47824331]
input: (peacock, '>', poacher)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57266588  0.42733412]
input: (antelope, '>', rockfish)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.74187623  0.25812377]
input: (mackerel, '>', silversides)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53977606  0.46022394]
input: (millipede, '>', ray)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58253066  0.41746934]
input: (acanthocephalan, '>', shark)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56788596  0.43211404]
input: (termite, '>', hammerhead)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58969663  0.41030337]
input: (herbivore, '>', salmon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68620342  0.31379658]
input: (housefly, '>', snapper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52003686  0.47996314]
input: (buteonine, '>', tuna)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58212093  0.41787907]
input: (bluetick, '>', hummingbird)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55962357  0.44037643]
input: (steed, '>', swift)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6676797  0.3323203]
input: (newt, '>', gallinule)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61786909  0.38213091]
input: (metazoan, '>', auk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53729305  0.46270695]
input: (frog, '>', jaeger)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60442621  0.39557379]
input: (eelworm, '>', larid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61870123  0.38129877]
input: (brachyuran, '>', gull)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6117311  0.3882689]
input: (mew, '>', gull)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56378566  0.43621434]
input: (leopardess, '>', petrel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54566194  0.45433806]
input: (kea, '>', grebe)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55346976  0.44653024]
input: (carnivore, '>', swan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.95904888  0.04095112]
input: (andrena, '>', swan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54065008  0.45934992]
input: (stud, '>', swan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60745477  0.39254523]
input: (bonito, '>', swan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.77175546  0.22824454]
input: (python, '>', heron)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50377425  0.49622575]
input: (hydrozoan, '>', heron)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59594992  0.40405008]
input: (fetus, '>', rail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56714355  0.43285645]
input: (pachyderm, '>', plover)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.7121002  0.2878998]
input: (mockingbird, '>', plover)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.76926678  0.23073322]
input: (cabbageworm, '>', plover)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56759688  0.43240312]
input: (mealworm, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56574229  0.43425771]
input: (bovine, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.74171137  0.25828863]
input: (billfish, '>', sandpiper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51280407  0.48719593]
input: (conch, '>', tattler)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64397904  0.35602096]
input: (auk, '>', snipe)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.73410342  0.26589658]
input: (toucanet, '>', dowitcher)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54725719  0.45274281]
input: (armyworm, '>', stilt)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56428517  0.43571483]
input: (cow, '>', shorebird)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.7542971  0.2457029]
input: (monarch, '>', stork)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63893403  0.36106597]
input: (redtail, '>', stork)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55167784  0.44832216]
input: (chipmunk, '>', merganser)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63360265  0.36639735]
input: (redstart, '>', sheldrake)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50824504  0.49175496]
input: (creeper, '>', teal)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57548788  0.42451212]
input: (marsupial, '>', duck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62627182  0.37372818]
input: (pheasant, '>', goose)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59324037  0.40675963]
input: (greylag, '>', eagle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59755216  0.40244784]
input: (rodent, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.84087939  0.15912061]
input: (chinook, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57154953  0.42845047]
input: (tanager, '>', falcon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51388577  0.48611423]
input: (diapsida, '>', falcon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53832881  0.46167119]
input: (primate, '>', falcon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.91088346  0.08911654]
input: (hawkmoth, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52431466  0.47568534]
input: (weimaraner, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51436891  0.48563109]
input: (beagle, '>', hawk)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52050998  0.47949002]
input: (chordate, '>', owl)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.98622275  0.01377725]
input: (owlet, '>', turtledove)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59966195  0.40033805]
input: (boar, '>', cock)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56450903  0.43549097]
input: (archosaur, '>', chicken)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64844865  0.35155135]
input: (woodcock, '>', hen)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56812105  0.43187895]
input: (tuna, '>', hen)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.71908005  0.28091995]
input: (velociraptor, '>', hen)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5753255  0.4246745]
input: (swordfish, '>', pheasant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6649428  0.3350572]
input: (diplodocus, '>', pheasant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54155167  0.45844833]
input: (bluebottle, '>', peafowl)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51226966  0.48773034]
input: (gemsbok, '>', quail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55614087  0.44385913]
input: (hellbender, '>', parrot)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54787908  0.45212092]
input: (wildcat, '>', parrot)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5119527  0.4880473]
input: (blackbuck, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54239391  0.45760609]
input: (deinonychus, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53749911  0.46250089]
input: (tuatara, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52532286  0.47467714]
input: (turtle, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63146759  0.36853241]
input: (ox, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51268157  0.48731843]
input: (abalone, '>', grosbeak)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58567176  0.41432824]
input: (annelid, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5298278  0.4701722]
input: (chigoe, '>', finch)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50512684  0.49487316]
input: (saturniid, '>', starling)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61710692  0.38289308]
input: (waxwing, '>', swallow)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54925559  0.45074441]
input: (pollard, '>', nightingale)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61784005  0.38215995]
input: (leporid, '>', thrush)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60805634  0.39194366]
input: (ratite, '>', titmouse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70638046  0.29361954]
input: (sawfish, '>', titmouse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60208318  0.39791682]
input: (ling, '>', kinglet)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59753091  0.40246909]
input: (amphibian, '>', oscine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58540129  0.41459871]
input: (kob, '>', weaver)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54150419  0.45849581]
input: (tetrapod, '>', weaver)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54315111  0.45684889]
input: (lagomorph, '>', weaver)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56538054  0.43461946]
input: (gastropod, '>', passerine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52180972  0.47819028]
input: (geoduck, '>', cotinga)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5235378  0.4764622]
input: (tetra, '>', woodpecker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62532849  0.37467151]
input: (protoavis, '>', moa)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52714512  0.47285488]
input: (filaria, '>', fetus)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58204342  0.41795658]
input: (tapeworm, '>', phalanger)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6285145  0.3714855]
input: (bumblebee, '>', rorqual)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.64712086  0.35287914]
input: (crappie, '>', buck)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51950314  0.48049686]
input: (damselfly, '>', stag)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50887401  0.49112599]
input: (skipjack, '>', corgi)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6854293  0.3145707]
input: (sheep, '>', corgi)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.74690968  0.25309032]
input: (placental, '>', cur)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.98228339  0.01771661]
input: (tyrannosaur, '>', coonhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62284037  0.37715963]
input: (saurischian, '>', hound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.85027505  0.14972495]
input: (duckling, '>', greyhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50094602  0.49905398]
input: (weka, '>', wolfhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58243478  0.41756522]
input: (bonito, '>', spaniel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53431257  0.46568743]
input: (schoolmaster, '>', spitz)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51781077  0.48218923]
input: (partridge, '>', pinscher)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58743709  0.41256291]
input: (lizard, '>', fox)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.82082243  0.17917757]
input: (ani, '>', coyote)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50882456  0.49117544]
input: (tamarin, '>', lion)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55881295  0.44118705]
input: (rudderfish, '>', tom)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58758332  0.41241668]
input: (stilt, '>', cat)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52394774  0.47605226]
input: (deer, '>', lynx)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61207875  0.38792125]
input: (greenshank, '>', marten)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53138961  0.46861039]
input: (cow, '>', marten)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57888696  0.42111304]
input: (robin, '>', procyonid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56757224  0.43242776]
input: (goat, '>', procyonid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51004747  0.48995253]
input: (coscoroba, '>', armadillo)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54476857  0.45523143]
input: (pheasant, '>', armadillo)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63424869  0.36575131]
input: (odonate, '>', leporid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55758359  0.44241641]
input: (mollusk, '>', rabbit)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.69047921  0.30952079]
input: (keeshond, '>', rabbit)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61208138  0.38791862]
input: (echinoderm, '>', elephant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66923419  0.33076581]
input: (theropod, '>', australopithecine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67003601  0.32996399]
input: (edmontosaurus, '>', hominoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6320154  0.3679846]
input: (migrator, '>', marmoset)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53324581  0.46675419]
input: (tarsier, '>', colobus)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.81886898  0.18113102]
input: (cisco, '>', guenon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50597625  0.49402375]
input: (pierid, '>', guenon)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55296404  0.44703596]
input: (ruff, '>', langur)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55271161  0.44728839]
input: (minnow, '>', macaque)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55953033  0.44046967]
input: (weimaraner, '>', cavy)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6747386  0.3252614]
input: (coelenterate, '>', rodent)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51734822  0.48265178]
input: (insect, '>', rodent)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.73987315  0.26012685]
input: (peafowl, '>', dormouse)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51167685  0.48832315]
input: (sidewinder, '>', gerbil)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53759198  0.46240802]
input: (mutant, '>', marmot)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.8113754  0.1886246]
input: (metatherian, '>', squirrel)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.74943069  0.25056931]
input: (cattalo, '>', antelope)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59725148  0.40274852]
input: (vizsla, '>', oryx)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50712169  0.49287831]
input: (lynx, '>', brahman)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55777811  0.44222189]
input: (mammal, '>', bovine)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.90255373  0.09744627]
input: (porcupine, '>', beef)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50224896  0.49775104]
input: (squid, '>', beef)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59958116  0.40041884]
input: (worm, '>', beef)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.65625855  0.34374145]
input: (centipede, '>', cattle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61140345  0.38859655]
input: (bitch, '>', ox)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.519529  0.480471]
input: (eurypterid, '>', ox)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51319536  0.48680464]
input: (holometabola, '>', chevrotain)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50885468  0.49114532]
input: (beef, '>', chevrotain)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52249077  0.47750923]
input: (cephalopod, '>', deer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58012172  0.41987828]
input: (arthropod, '>', deer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.80594212  0.19405788]
input: (bovid, '>', ass)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.77362946  0.22637054]
input: (harvestman, '>', mare)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50613702  0.49386298]
input: (basset, '>', pony)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5221386  0.4778614]
input: (steeplechaser, '>', pony)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.66664278  0.33335722]
input: (nester, '>', pony)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50757653  0.49242347]
input: (ascidian, '>', chelonian)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55888558  0.44111442]
input: (doliolum, '>', turtle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56522113  0.43477887]
input: (whale, '>', turtle)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.6021362  0.3978638]
input: (reptile, '>', ceratopsian)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.77180023  0.22819977]
input: (merostomata, '>', hadrosaur)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52729897  0.47270103]
input: (predator, '>', maniraptor)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.8215636  0.1784364]
input: (shearwater, '>', maniraptor)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53562588  0.46437412]
input: (hexapod, '>', maniraptor)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51469185  0.48530815]
input: (dog, '>', ichthyosaur)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70884151  0.29115849]
input: (africander, '>', iguanid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52603886  0.47396114]
input: (kite, '>', boa)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54774838  0.45225162]
input: (leatherjacket, '>', boa)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53721433  0.46278567]
input: (spitz, '>', elapid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53498687  0.46501313]
input: (saurian, '>', elapid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.71497053  0.28502947]
input: (bobcat, '>', viper)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56530856  0.43469144]
input: (aegyptopithecus, '>', rattlesnake)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61769219  0.38230781]
input: (tardigrade, '>', rattlesnake)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54498394  0.45501606]
input: (goldfinch, '>', rattlesnake)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57903689  0.42096311]
input: (diamondback, '>', corgi)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.70031781  0.29968219]
input: (twitterer, '>', coonhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54391756  0.45608244]
input: (earthworm, '>', greyhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57595957  0.42404043]
input: (crustacean, '>', terrier)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.75563765  0.24436235]
input: (edaphosaurus, '>', terrier)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56702819  0.43297181]
input: (cattle, '>', spitz)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.77175191  0.22824809]
input: (elapid, '>', watchdog)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.60537823  0.39462177]
input: (fox, '>', tom)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.75180363  0.24819637]
input: (cygnet, '>', blastula)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50839897  0.49160103]
input: (planula, '>', pheasant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59931413  0.40068587]
input: (cinnabar, '>', pheasant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54768798  0.45231202]
input: (entoproct, '>', pheasant)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5652202  0.4347798]
input: (murine, '>', peafowl)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61347223  0.38652777]
input: (bison, '>', peafowl)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.63620677  0.36379323]
input: (young, '>', quail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.75391268  0.24608732]
input: (giraffe, '>', mite)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55316096  0.44683904]
input: (elephant, '>', isopod)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50389259  0.49610741]
input: (livestock, '>', calosoma)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61272288  0.38727712]
input: (mite, '>', blowfly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.67627156  0.32372844]
input: (cherrystone, '>', gadfly)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52700776  0.47299224]
input: (ocelot, '>', gnat)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58748882  0.41251118]
input: (sharksucker, '>', gnat)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55491141  0.44508859]
input: (feline, '>', mosquito)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.59598974  0.40401026]
input: (doodlebug, '>', vespid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.62832361  0.37167639]
input: (larva, '>', fritillary)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.73521032  0.26478968]
input: (carnivore, '>', nymphalid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.95662369  0.04337631]
input: (parrot, '>', tineoid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54432227  0.45567773]
input: (therapsid, '>', worker)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51291158  0.48708842]
input: (heifer, '>', coral)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52193697  0.47806303]
input: (cephalochordate, '>', hydrozoan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.51848005  0.48151995]
input: (gadfly, '>', hydrozoan)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.54123624  0.45876376]
input: (proboscidean, '>', clam)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.73716351  0.26283649]
input: (diapsid, '>', shipworm)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.83476595  0.16523405]
input: (mule, '>', oyster)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53072215  0.46927785]
input: (jellyfish, '>', seasnail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.55836263  0.44163737]
input: (cassowary, '>', seasnail)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50630393  0.49369607]
input: (ungulate, '>', annelid)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.88886438  0.11113562]
input: (pacer, '>', fluke)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.61858384  0.38141616]
input: (mapinguari, '>', nematode)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.52489901  0.47510099]
input: (bat, '>', caterpillar)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50531171  0.49468829]
input: (hedgehog, '>', grub)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56304973  0.43695027]
input: (viscacha, '>', male)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.56158715  0.43841285]
input: (trachodon, '>', stallion)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.58859021  0.41140979]
input: (maleo, '>', racer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.5109037  0.4890963]
input: (biped, '>', racer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.68257378  0.31742622]
input: (rabbitfish, '>', racer)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.57559223  0.42440777]
input: (bullterrier, '>', lamb)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50620563  0.49379437]
input: (nymphalid, '>', lamb)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.53608437  0.46391563]

--  REPORT  --
             precision    recall  f1-score   support

   isParent       0.98      0.99      0.98     10462
  notParent       0.92      0.80      0.86      1276

avg / total       0.97      0.97      0.97     11738
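
As a quick sanity check, the f1-score column can be recomputed from the precision and recall columns of the report. The sketch below assumes that the `avg / total` row is a support-weighted average of the per-class scores; the helper name `f1` and the `report` dictionary are ours, not part of the project code.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, support) copied from the report above
report = {
    'isParent': (0.98, 0.99, 10462),
    'notParent': (0.92, 0.80, 1276),
}

# Per-class F1, recomputed from the rounded precision/recall values
f1_by_class = {label: f1(p, r) for label, (p, r, _) in report.items()}

# Support-weighted average (assumed meaning of the "avg / total" row)
total_support = sum(s for _, _, s in report.values())
weighted_f1 = sum(
    f1_by_class[label] * s for label, (_, _, s) in report.items()
) / float(total_support)

for label in sorted(f1_by_class):
    print(label, round(f1_by_class[label], 2))
print('weighted', round(weighted_f1, 2))
```

Under that assumption the recomputed values agree with the report. The weighted average is dominated by isParent, which accounts for 10462 of the 11738 samples, so the 0.80 recall on notParent is easy to overlook when reading only the last row.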

We can identify three sources of incorrect prediction here:

  • Wrong prediction
  • Words with several meanings (not supported by word2vec, which learns a single vector per word)
    • hedgehog, '>', grub
    • ...
  • Challenges to wordnet taxonomy
    • proboscidean, '>', clam (proboscideans are defined as having a trunk, which can arguably be said of the clam as well)
    • ...

In the third case, the taxonomy given as ground truth by wordnet differs from the one learned by the classifier.
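
One way to start sorting errors into these cases is to rank them by the confidence of the classifier: errors close to 0.5 are plausibly plain mistakes, while confident errors are better candidates for a review against the ground truth. This is a minimal sketch, assuming the log lines follow the format shown above; the function name `parse_error_line` and the ranking heuristic are our assumptions, not part of the project scripts.

```python
import re

# One line of the error log: input pair, predicted label, true label, probabilities
LINE_RE = re.compile(
    r"input: \((?P<left>[^,]+), '>', (?P<right>[^)]+)\)\s*/\s*"
    r"predicted: (?P<predicted>\w+)\s*/\s*"
    r"true: (?P<true>\w+)\s*/\s*"
    r"proba:\[\s*(?P<p0>[\d.]+)\s+(?P<p1>[\d.]+)\s*\]"
)

def parse_error_line(line):
    """Return a dict describing one logged error, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return {
        'pair': (match.group('left'), match.group('right')),
        'predicted': match.group('predicted'),
        'true': match.group('true'),
        # Confidence of the prediction = largest class probability
        'confidence': max(float(match.group('p0')), float(match.group('p1'))),
    }

# Three lines copied from the log above
sample = [
    "input: (duckling, '>', greyhound)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.50094602  0.49905398]",
    "input: (chordate, '>', owl)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.98622275  0.01377725]",
    "input: (proboscidean, '>', clam)  /  predicted: isParent  /  true: notParent  /  proba:[ 0.73716351  0.26283649]",
]

errors = [e for e in (parse_error_line(l) for l in sample) if e is not None]
errors.sort(key=lambda e: e['confidence'], reverse=True)

for e in errors:
    print(e['pair'], e['confidence'])
```

On these three lines, (chordate, owl) comes first and (duckling, greyhound) last. Whether a confident error is a real mistake or a disagreement with wordnet still has to be judged by hand.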

Conclusion

The recognition rate is quite satisfactory here, considering how basic the model is. More advanced techniques could improve the results.

It is interesting to notice that some false predictions also challenge the ground truth.
Unlike for antonyms, a domain-specific human expert would probably be able to decide.

As a side note, words with several meanings introduce noise, which shows some limitations of the word2vec model.


Further exploration could include:

  • Is the whole chain covered by the belonging model? If so, can we rebuild the full chain based on prediction confidence?
  • Apply the animal trained model to a different domain.
  • Use a hierarchical clustering algorithm.
  • Use a graph similarity measure to quantify the global error with the wordnet taxonomy.
  • Find a feature that allows using subgraphs as a dataset.
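
To illustrate the graph similarity idea, one simple option is the Jaccard index over directed (parent, child) edges. This measure is our assumption and was not used in this experiment; the function name `edge_jaccard` is hypothetical and the edges below are toy data, not actual wordnet edges.

```python
def edge_jaccard(reference_edges, predicted_edges):
    """Jaccard index between two sets of directed (parent, child) edges."""
    ref, pred = set(reference_edges), set(predicted_edges)
    if not ref and not pred:
        return 1.0  # two empty graphs are considered identical
    return len(ref & pred) / float(len(ref | pred))

# Toy edges for illustration only
reference = {('animal', 'mammal'), ('mammal', 'feline'), ('feline', 'cat')}
predicted = {('animal', 'mammal'), ('feline', 'cat'), ('proboscidean', 'clam')}

score = edge_jaccard(reference, predicted)
print(score)  # 2 shared edges out of 4 distinct edges
```

A score of 1.0 means both taxonomies contain exactly the same edges. This measure treats every edge equally, so it would not distinguish an error near the root from an error on a leaf.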