lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: fastest way to gather simple terms that match documents?
Date Mon, 05 Apr 2010 18:24:19 GMT

:     After I've run a query I need to know which terms matched each
: result document (ie doc termfrequency>0).
	...
: I don't care how many were found or what position or anything else.
: just which ones matched.

if all you care about is simply "which terms does it have" you can take 
your list of terms and your list of docids, sort both lists, and then use 
termDocs to loop over the terms and over the docs.  (the sorting is key 
for performance, because it allows you to always skip forward, w/o 
needing to restart the termDocs)

something like...

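TermDocs iter = indexReader.termDocs();
for (Term t : myTerms) {            // myTerms sorted
  iter.seek(t);
  if (!iter.next()) continue;       // no docs for this term
  for (int docid : myDocs) {        // myDocs sorted ascending
    // skipTo() always moves past the current entry, so only call it
    // when the iterator is still behind the docid we want
    if (iter.doc() < docid && !iter.skipTo(docid)) break;
    if (iter.doc() == docid) {
      doSomethingWith(t, docid);
    }
  }
}
iter.close();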
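
to see why the sorting matters, here is a rough sketch of the same forward-only walk with no Lucene in it at all.  the class name, the method name and the plain int[] standing in for one term's postings are all made up for illustration; the only assumption is that both arrays are sorted ascending.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one term's postings list; not a Lucene class.
class ForwardOnlyMatch {

  /**
   * Returns the wanted doc ids that also occur in postings.
   * Both arrays are assumed to be sorted ascending.
   */
  static List<Integer> matches(int[] postings, int[] wanted) {
    List<Integer> hits = new ArrayList<>();
    int p = 0; // cursor into postings; it only ever moves forward
    for (int docid : wanted) {
      // advance the cursor until it reaches or passes the wanted doc id
      while (p < postings.length && postings[p] < docid) {
        p++;
      }
      if (p == postings.length) {
        break; // postings exhausted, no later doc id can match
      }
      if (postings[p] == docid) {
        hits.add(docid);
      }
    }
    return hits;
  }
}
```

because the cursor never moves backwards, each postings entry is looked at no more than once per term.  with unsorted docids you would have to rewind (re-seek) every time the next docid was smaller than the current position.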



-Hoss

