lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject RE: fastest way to gather simple terms that match documents?
Date Mon, 05 Apr 2010 19:33:10 GMT

: Alternatively index your documents with term vectors for the field enabled:
	...
: And then use IndexReader.getTermFreqVector() with the matching doc ID:

Uwe: this is an area i'm not particularly strong on, so i'm curious: do 
you expect that the TermFreqVector approach would be faster than the 
TermDocs approach for the type of use case where docs tend to be "large" 
but the list of specific terms you are interested in testing for is 
"small" (ie: just the terms used in the original query)?

I ask because off the top of my head i'm not seeing how it 
would really give you much of a time savings -- instead of 
seeking over the handful of terms you care about, the TermVectorMapper 
will have to scan over every Term in each of the documents.  (writing your 
own TermVectorMapper that ignores the terms you don't care about will 
help, but that still doesn't sound any faster)
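
To make the comparison concrete, here is a rough toy model in plain Java. It 
does not use the Lucene API at all: the class name MatchedTermsSketch, the 
method names, and the in-memory maps are assumptions made up for illustration. 
The postings map stands in for what TermDocs walks, and the per-document term 
list stands in for what a TermVectorMapper is handed. Both paths return the 
same answer; the difference is that one loops over the query terms and the 
other loops over every term in the document.

```java
import java.util.*;

// Toy model only: plain collections stand in for a Lucene index.
// None of these names are Lucene APIs; they are hypothetical.
final class MatchedTermsSketch {

    // Build term -> sorted set of doc IDs (a stand-in for postings lists).
    static Map<String, SortedSet<Integer>> postings(Map<Integer, List<String>> docs) {
        Map<String, SortedSet<Integer>> byTerm = new HashMap<>();
        for (Map.Entry<Integer, List<String>> doc : docs.entrySet()) {
            for (String term : doc.getValue()) {
                SortedSet<Integer> ids = byTerm.get(term);
                if (ids == null) {
                    ids = new TreeSet<>();
                    byTerm.put(term, ids);
                }
                ids.add(doc.getKey());
            }
        }
        return byTerm;
    }

    // TermDocs-style: one lookup per query term, so the work grows with
    // the number of query terms, not with the size of the document.
    static Set<String> viaPostings(Map<String, SortedSet<Integer>> postings,
                                   Collection<String> queryTerms, int docId) {
        Set<String> matched = new TreeSet<>();
        for (String term : queryTerms) {
            SortedSet<Integer> ids = postings.get(term);
            if (ids != null && ids.contains(docId)) {
                matched.add(term);
            }
        }
        return matched;
    }

    // Term-vector-style: every term of the document is visited, and a
    // filter keeps only the ones that appear in the query.
    static Set<String> viaTermVector(Collection<String> docTerms,
                                     Set<String> queryTerms) {
        Set<String> matched = new TreeSet<>();
        for (String term : docTerms) {
            if (queryTerms.contains(term)) {
                matched.add(term);
            }
        }
        return matched;
    }
}
```

In this model the filtering in viaTermVector only saves the cost of collecting 
terms; the loop still touches every term in the doc, which is the reason for 
the doubt above. It says nothing about real I/O costs in an actual index.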

: > :     After I've run a query I need to know which terms matched each
: > : result document (ie doc termfrequency>0).
: > 	...
: > : I don't care how many were found or what position or anything else.
: > : just which ones matched.



-Hoss

