Comparison of the different conversion routines

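First, load RSpec and the tcf2nif library, then include the Tcf2Nif module so that its classes can be used without a namespace prefix.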


In [5]:
require 'rspec'
require 'tcf2nif'
include Tcf2Nif


Out[5]:
Object

Vanilla NIF approach

Then open the sample TCF document and create a transformer object for it.


In [6]:
@testfile = File.open(File.join('.', 'spec', 'assets', 'phantom.xml'), 'r')
@tcf_doc = TcfDocument.new(@testfile)
@trans = Transformer.new(@tcf_doc, {})

Now, for the vanilla approach, we do a basic conversion.


In [ ]:
graph = @trans.transform(:noprov)
puts graph.size

We can also run the RSpec examples tagged with :noprov.


In [ ]:
RSpec.configure do |config|
  config.filter_run_including :noprov => true
  config.run_all_when_everything_filtered = true
end
RSpec::Core::Runner.run(['spec'])

Plain NIF approach

This approach reifies some of the RDF in order to make annotations identifiable.
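
To illustrate what reification means here, the following is a minimal plain-Ruby sketch, not the tcf2nif implementation. It assumes the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object); the library may model annotations differently. The `reify` helper, the `ex:` identifiers, and the `ex:producedBy` property are hypothetical names chosen for this example.

```ruby
# Plain-Ruby sketch of RDF reification.
# Triples are modelled as [subject, predicate, object] arrays of strings.
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

# Turn one triple into four triples that describe it through a statement node.
# `statement_id` is a hypothetical identifier chosen by the caller.
def reify(triple, statement_id)
  subject, predicate, object = triple
  [
    [statement_id, "#{RDF_NS}type",      "#{RDF_NS}Statement"],
    [statement_id, "#{RDF_NS}subject",   subject],
    [statement_id, "#{RDF_NS}predicate", predicate],
    [statement_id, "#{RDF_NS}object",    object]
  ]
end

# Hypothetical annotation: a token linked to a Penn tag.
annotation = ['ex:token1', 'nif:oliaLink', 'penn:NN']
reified = reify(annotation, 'ex:annotation1')

# The statement node is now identifiable and can carry extra information,
# for example which tool produced the annotation (hypothetical property).
reified << ['ex:annotation1', 'ex:producedBy', 'ex:taggerA']

reified.each { |t| puts t.join(' ') }
```

Once the annotation has its own identifier, further statements can be attached to it, which a bare token-to-tag triple does not allow.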

Finally, read TCF, transform to NIF, and show us the number of triples.


In [ ]:


In [ ]:

POS Tagger Comparison

This section contains SPARQL queries for comparing the output of POS taggers. With plain NIF, a query could look like this:

prefix xsd:  <http://www.w3.org/2001/XMLSchema#>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix owl:  <http://www.w3.org/2002/07/owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix nif:  <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
prefix penn: <http://purl.org/olia/penn.owl#>
prefix ex:   <http://example.org/tcf2nif/example.txt#>

SELECT ?begin ?end ?anchor ?pos WHERE {
  ?nif
    nif:beginIndex ?begin ;
    nif:endIndex ?end ;
    nif:beginIndex "4991"^^xsd:nonNegativeInteger ;
    nif:anchorOf ?anchor ;
    nif:oliaLink ?pos .
}

This query results in two answers: