opennlp-dev mailing list archives

From Mark G
Subject TokenNameFinder and Span probs
Date Wed, 07 May 2014 00:18:28 GMT
I am currently working on a project in which we use NER to pass toponyms
into the GeoEntityLinker addon for geotagging, and I pass the locations,
entities, and other info on to SOLR for indexing. Over the years I have
noticed that the TokenNameFinder interface does not include the probs()
methods that NameFinderME has, and furthermore the Span object does not
have a double field for storing its own prob. Also, SentenceDetectorME
has a method called getSentenceProbabilities rather than probs().
When I pass the Spans into the GeoEntityLinker/EntityLinker I can no longer
get the probs, because they are not stored in the Span objects. I can always
extend Span and add the field, or keep a 2D array of the probs for each
sentence, but I wanted to see what everyone thinks about:
1. adding the probs() methods to the TokenNameFinder interface
2. adding a prob field (a double) to Span
3. having the NameFinder return the prob with each Span, so it doesn't have
to be set after the call to find() using the double[] of probs
4. having SentenceDetectorME return its Spans with a prob, adding a probs()
method to the SentenceDetector interface, and deprecating the
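
To make points 2 and 3 concrete, here is a rough sketch of the shape I have
in mind. It does not use the OpenNLP classes: ProbSpan is a hypothetical
stand-in for Span with an added prob field, and ProbSpans.attach() and
ProbSpans.atLeast() are made-up helpers showing how a finder could hand back
spans that already carry their probs, instead of the caller pairing the
Span[] from find() with the double[] from probs() by index. The names and
signatures are illustrative only, not a proposed API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Span with an extra prob field (point 2).
final class ProbSpan {
    private final int start;   // index of the first token (inclusive)
    private final int end;     // index after the last token (exclusive)
    private final String type; // e.g. "location"
    private final double prob; // probability assigned by the model

    ProbSpan(int start, int end, String type, double prob) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        if (prob < 0.0 || prob > 1.0) {
            throw new IllegalArgumentException("prob must be in [0, 1]: " + prob);
        }
        this.start = start;
        this.end = end;
        this.type = type;
        this.prob = prob;
    }

    int getStart() { return start; }
    int getEnd() { return end; }
    String getType() { return type; }
    double getProb() { return prob; }

    @Override
    public String toString() {
        return "[" + start + ".." + end + ") " + type + " prob=" + prob;
    }
}

// Hypothetical helpers; not part of OpenNLP.
final class ProbSpans {
    private ProbSpans() { }

    // Pairs each span with the prob at the same index, so the prob travels
    // with the span instead of living in a separate array (point 3).
    static List<ProbSpan> attach(int[] starts, int[] ends, String[] types, double[] probs) {
        if (starts.length != ends.length || starts.length != types.length
                || starts.length != probs.length) {
            throw new IllegalArgumentException("all arrays must have the same length");
        }
        List<ProbSpan> result = new ArrayList<>(starts.length);
        for (int i = 0; i < starts.length; i++) {
            result.add(new ProbSpan(starts[i], ends[i], types[i], probs[i]));
        }
        return result;
    }

    // Keeps only the spans whose prob is at least minProb, e.g. before
    // passing them on to an entity linker or an indexer.
    static List<ProbSpan> atLeast(List<ProbSpan> spans, double minProb) {
        List<ProbSpan> kept = new ArrayList<>();
        for (ProbSpan span : spans) {
            if (span.getProb() >= minProb) {
                kept.add(span);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up values standing in for the output of find() and probs().
        List<ProbSpan> spans = attach(
                new int[] {0, 5}, new int[] {2, 6},
                new String[] {"location", "location"},
                new double[] {0.91, 0.42});
        for (ProbSpan span : atLeast(spans, 0.5)) {
            System.out.println(span);
        }
    }
}
```

With something like this, downstream code such as an EntityLinker would only
need the span itself to see how confident the model was.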