Comparison of the different conversion routines

First, load RSpec and the tcf2nif library, then mix the Tcf2Nif module into the top-level scope.



In [5]:

    
require 'rspec'
require 'tcf2nif'
include Tcf2Nif









    Out[5]:





Object

Vanilla NIF approach

Next, open the sample TCF file, parse it into a TcfDocument, and create a Transformer for it.



In [6]:

    
@testfile = File.open(File.join('.', 'spec', 'assets', 'phantom.xml'), 'r')
@tcf_doc = TcfDocument.new(@testfile)
@trans = Transformer.new(@tcf_doc, {})

For the vanilla approach, run a basic conversion in :noprov mode and print the size of the resulting graph.



In [ ]:

    
graph = @trans.transform(:noprov)
puts graph.size

We can also run the RSpec examples tagged :noprov.



In [ ]:

    
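RSpec.configure do |config|
  # Run only the examples tagged with :noprov => true ...
  config.filter_run_including :noprov => true
  # ... and fall back to the full suite if no example carries that tag.
  config.run_all_when_everything_filtered = true
end
RSpec::Core::Runner.run(['spec'])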

Plain NIF approach

This approach reifies some of the RDF statements so that individual annotations can be identified.
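
To illustrate what reification means in general, here is a minimal sketch in plain Ruby that does not depend on tcf2nif. It rewrites one triple into the four triples of the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object), so that the statement gets an identifier of its own that other triples can refer to. The reify helper and the ex: identifiers are assumptions made for illustration; the triples actually produced by the transformer may look different.

```ruby
# Hypothetical helper: reify one triple using the standard RDF
# reification vocabulary. Triples are plain arrays of strings here.
def reify(statement_id, triple)
  rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  subject, predicate, object = triple
  [
    [statement_id, "#{rdf}type",      "#{rdf}Statement"],
    [statement_id, "#{rdf}subject",   subject],
    [statement_id, "#{rdf}predicate", predicate],
    [statement_id, "#{rdf}object",    object]
  ]
end

# Illustrative identifiers, not taken from the converter output.
annotation = ['ex:token1', 'nif:oliaLink', 'penn:NN']
reified = reify('ex:annotation1', annotation)

# Print the four reification triples, one per line.
reified.each { |t| puts t.join(' ') }
```

Because every reified triple shares the subject ex:annotation1, further statements about the annotation itself can be attached to that identifier.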

Finally, read the TCF document, transform it to NIF, and print the number of triples.



In [ ]:



POS Tagger Comparison

This section uses SPARQL queries to compare the output of POS taggers. With plain NIF, a query could look like this:

prefix xsd:  <http://www.w3.org/2001/XMLSchema#>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix owl:  <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix nif:  <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
prefix penn: <http://purl.org/olia/penn.owl#>
prefix ex:   <http://example.org/tcf2nif/example.txt#>

SELECT ?begin ?end ?anchor ?pos WHERE {
  ?nif
    nif:beginIndex ?begin ;
    nif:endIndex ?end ;
    nif:beginIndex "4991"^^xsd:nonNegativeInteger ;
    nif:anchorOf ?anchor ;
    nif:oliaLink ?pos .
}

The query returns two answers for the token that begins at offset 4991, and they carry conflicting POS tags:

4991, 4996, judge, http://purl.org/olia/penn.owl#NN
4991, 4996, judge, http://purl.org/olia/penn.owl#VB
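
Once such result rows are available, disagreements can be found by grouping the rows by text span and keeping the spans that carry more than one distinct POS tag. The sketch below does this in plain Ruby on the two rows shown above. The Row struct and the conflicting_spans helper are assumptions made for illustration and are not part of tcf2nif.

```ruby
# Hypothetical row type mirroring the SELECT variables of the query.
Row = Struct.new(:begin_index, :end_index, :anchor, :pos)

rows = [
  Row.new(4991, 4996, 'judge', 'http://purl.org/olia/penn.owl#NN'),
  Row.new(4991, 4996, 'judge', 'http://purl.org/olia/penn.owl#VB')
]

# Group rows by span and keep the spans with more than one distinct tag.
def conflicting_spans(rows)
  rows.group_by { |r| [r.begin_index, r.end_index, r.anchor] }
      .transform_values { |group| group.map(&:pos).uniq }
      .select { |_span, tags| tags.size > 1 }
end

conflicts = conflicting_spans(rows)
conflicts.each do |(first, last, anchor), tags|
  # Show only the local name of each tag instead of the full IRI.
  names = tags.map { |t| t.split('#').last }
  puts "#{first}-#{last} #{anchor}: #{names.join(' vs ')}"
end
# prints "4991-4996 judge: NN vs VB"
```

Grouping on the full span (begin, end, anchor) rather than on the begin offset alone keeps tokens with different boundaries apart.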