opennlp-dev mailing list archives

From "Giaconia, Mark [USA]" <Giaconia_M...@bah.com>
Subject DocumentNameFinder (LinkableDocumentNameFinder)
Date Sun, 02 Jun 2013 20:40:22 GMT
As part of working on the EntityLinker (issue OPENNLP-579, https://issues.apache.org/jira/browse/OPENNLP-579),
I created a new interface, LinkableDocumentNameFinder, and a default implementation,
DefaultLinkableDocumentNameFinderImpl.
Here are the method signatures for the interface:

public interface LinkableDocumentNameFinder {
  Document find(String[] sentences, Tokenizer tokenizer,
      List<TokenNameFinder> nameFinders, boolean linkable);
  Document find(String documentText, SentenceDetector sentenceDetector, Tokenizer tokenizer,
      List<TokenNameFinder> nameFinders, boolean linkable);
  Document find(List<Sentence> sentences, Tokenizer tokenizer,
      List<TokenNameFinder> nameFinders, boolean linkable);
  Document find(Document document, SentenceDetector sentenceDetector, Tokenizer tokenizer,
      List<TokenNameFinder> nameFinders, boolean linkable);
  List<Document> find(List<Document> documents, SentenceDetector sentenceDetector,
      Tokenizer tokenizer, List<TokenNameFinder> nameFinders, boolean linkable);
}

Note the Document return type. Here is what a Document object looks like:

public class Document {
  private List<Sentence> sentences = new ArrayList<>();

  public List<Sentence> getSentences() {
    return sentences;
  }

  public void setSentences(List<Sentence> sentences) {
    this.sentences = sentences;
  }
}

Note the Sentence object that Document holds. Here it is:

public class Sentence{
  private String sentenceText;
  private Integer sentenceNumber;
  private List<String> tokens = new ArrayList<>();
  private List<Span> spans = new ArrayList<>();

  public Sentence(String sentenceText, Integer sentenceNumber) {
    this.sentenceNumber = sentenceNumber;
    this.sentenceText = sentenceText;
  }
//setters...getters....
}
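
To make the shape of these objects concrete, here is a minimal, self-contained sketch of how a find(String[] ...) style call might fill in a Document. It is not the code from the patch, and every stand-in is an assumption: SimpleSpan is a hypothetical replacement for opennlp.tools.util.Span so the example compiles without OpenNLP, the whitespace split stands in for a Tokenizer, the capitalized-token rule stands in for a TokenNameFinder, and the linkable flag is left out because its handling is not shown in this message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for opennlp.tools.util.Span (token offsets, end exclusive).
final class SimpleSpan {
  final int start;
  final int end;
  final String type;

  SimpleSpan(int start, int end, String type) {
    this.start = start;
    this.end = end;
    this.type = type;
  }
}

// Same fields as the Sentence class above, with SimpleSpan in place of Span.
class Sentence {
  private final String sentenceText;
  private final Integer sentenceNumber;
  private final List<String> tokens = new ArrayList<>();
  private final List<SimpleSpan> spans = new ArrayList<>();

  Sentence(String sentenceText, Integer sentenceNumber) {
    this.sentenceNumber = sentenceNumber;
    this.sentenceText = sentenceText;
  }

  String getSentenceText() { return sentenceText; }
  Integer getSentenceNumber() { return sentenceNumber; }
  List<String> getTokens() { return tokens; }
  List<SimpleSpan> getSpans() { return spans; }
}

class Document {
  private final List<Sentence> sentences = new ArrayList<>();

  List<Sentence> getSentences() { return sentences; }
}

class DocumentSketch {
  // Assumed stand-in for find(String[] sentences, ...): one Sentence per input string.
  static Document find(String[] sentences) {
    Document document = new Document();
    for (int i = 0; i < sentences.length; i++) {
      Sentence sentence = new Sentence(sentences[i], i);
      String trimmed = sentences[i].trim();
      if (!trimmed.isEmpty()) {
        // Whitespace split is a placeholder for a real Tokenizer.
        String[] tokens = trimmed.split("\\s+");
        for (int t = 0; t < tokens.length; t++) {
          sentence.getTokens().add(tokens[t]);
          // Placeholder rule, not a real name finder: flag capitalized tokens.
          if (Character.isUpperCase(tokens[t].charAt(0))) {
            sentence.getSpans().add(new SimpleSpan(t, t + 1, "candidate"));
          }
        }
      }
      document.getSentences().add(sentence);
    }
    return document;
  }

  public static void main(String[] args) {
    Document doc = find(new String[] {"Mark works on OpenNLP", "the linker resolves names"});
    for (Sentence s : doc.getSentences()) {
      System.out.println(s.getSentenceNumber() + ": " + s.getTokens().size()
          + " tokens, " + s.getSpans().size() + " spans");
    }
  }
}
```

The point of the sketch is only the nesting: a Document holds Sentences in order, and each Sentence carries its own text, number, tokens, and spans, so span offsets are always relative to that sentence's token list.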


Mark Giaconia

