lucene-java-user mailing list archives

From David Spencer
Subject Re: TFIDF Implementation
Date Tue, 14 Dec 2004 22:36:49 GMT
Bruce Ritchie wrote:

>>>You can also see 'Books like this' example from here 
>>Well done, uses a term vector, instead of reparsing the orig 
>>doc, to form the similarity query. Also I like the way you 
>>exclude the source doc in the query, I didn't think of doing 
>>that in my code.
> I agree, it's a good way to exclude the source doc.
>>I don't trust calling vector.size() and vector.getTerms() 
>>within the loop but I haven't looked at the code to see if it 
>>calculates  the results each time or caches them...
> From the code I looked at, those calls don't recalculate on every call. 

I was referring to the fragment below from BooksLikeThis.docsLike(). I
mentioned it because the javadoc does not say that the values returned
by size() and getTerms() are cached. The implementation may cache them
(I haven't checked), but that is not guaranteed, so it is safer to call
size() and getTerms() once, outside the loop.

  for (int j = 0; j < vector.size(); j++) {
       TermQuery tq = new TermQuery(
           new Term("subject", vector.getTerms()[j]));
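For illustration, here is a minimal sketch of the hoisted form. It is not the BooksLikeThis code: TermVector below is a hypothetical stand-in that models only size() and getTerms(), and CountingVector is an assumed worst case that copies its array on every call, so the number of calls can be observed without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

class HoistDemo {
    // Hypothetical stand-in for the term vector type; only the two
    // methods discussed in this thread are modelled.
    interface TermVector {
        int size();
        String[] getTerms();
    }

    // Assumed worst case: nothing is cached, and getTerms() copies the
    // array on every call. Call counters make the cost observable.
    static class CountingVector implements TermVector {
        private final String[] terms;
        int sizeCalls = 0;
        int getTermsCalls = 0;

        CountingVector(String... terms) {
            this.terms = terms;
        }

        public int size() {
            sizeCalls++;
            return terms.length;
        }

        public String[] getTerms() {
            getTermsCalls++;
            return terms.clone();
        }
    }

    // Hoisted form: getTerms() is called once before the loop, and the
    // array length serves as the loop bound, so size() is not needed.
    static List<String> collectTerms(TermVector vector) {
        String[] terms = vector.getTerms();
        List<String> out = new ArrayList<>(terms.length);
        for (int j = 0; j < terms.length; j++) {
            // In docsLike() this is where the TermQuery would be built.
            out.add(terms[j]);
        }
        return out;
    }

    public static void main(String[] args) {
        CountingVector v = new CountingVector("lucene", "search", "index");
        List<String> terms = collectTerms(v);
        System.out.println(terms + ", getTerms() calls: " + v.getTermsCalls);
    }
}
```

Written this way, the loop does not depend on whether the implementation caches anything.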
> Regards,
> Bruce Ritchie
