lucene-java-user mailing list archives

From Rainer Dollinger <>
Subject Classification / Change Scoring during search
Date Tue, 07 Mar 2006 16:07:34 GMT

I want to use Lucene to find similar documents based on a BooleanQuery
(similar metadata combined with OR clauses) and on the user's ratings of
documents returned by earlier searches.

I intend to implement a Naive Bayes classifier that sorts documents into
liked/disliked classes, and would apply it from a HitCollector subclass:

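// HitCollector is an abstract class, so it is extended, not implemented
class ClassifyingHitCollector extends HitCollector {

  public void collect(int doc, float score) {
    // classify document

    // if document is liked -> add to hit collection
  }
}

ClassifyingHitCollector c = new ClassifyingHitCollector();
// searcher and query are placeholder names for the Searcher and the BooleanQuery
searcher.search(query, c);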
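
For illustration, the "classify document" step could be backed by something like the following. This is only a minimal sketch of a two-class multinomial Naive Bayes with add-one smoothing; the class LikedDislikedClassifier, its methods, and the idea of representing a document as an array of metadata terms are assumptions made for the example and are not part of Lucene.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: two-class multinomial
// Naive Bayes over metadata terms with add-one (Laplace) smoothing.
class LikedDislikedClassifier {
  private final Map<String, Integer> likedCounts = new HashMap<>();
  private final Map<String, Integer> dislikedCounts = new HashMap<>();
  private final Set<String> vocabulary = new HashSet<>();
  private int likedDocs, dislikedDocs, likedTerms, dislikedTerms;

  // Record one rated document, given its metadata terms.
  void train(String[] terms, boolean liked) {
    if (liked) likedDocs++; else dislikedDocs++;
    Map<String, Integer> counts = liked ? likedCounts : dislikedCounts;
    for (String t : terms) {
      counts.merge(t, 1, Integer::sum);
      vocabulary.add(t);
      if (liked) likedTerms++; else dislikedTerms++;
    }
  }

  // log P(class) + sum over terms of log P(term | class).
  private double logScore(String[] terms, Map<String, Integer> counts,
                          int classDocs, int classTerms) {
    // Smoothed class prior, so an unseen class never yields log(0).
    double score = Math.log((classDocs + 1.0) / (likedDocs + dislikedDocs + 2.0));
    for (String t : terms) {
      int c = counts.getOrDefault(t, 0);
      // The extra +1 in the denominator reserves mass for unseen terms.
      score += Math.log((c + 1.0) / (classTerms + vocabulary.size() + 1.0));
    }
    return score;
  }

  // True if the "liked" class has the strictly higher log score.
  boolean isLiked(String[] terms) {
    return logScore(terms, likedCounts, likedDocs, likedTerms)
        > logScore(terms, dislikedCounts, dislikedDocs, dislikedTerms);
  }
}
```

Inside collect(..), the terms for a document id would still have to be loaded from the index, which is the per-document cost the question below is about.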

This means that the Bayes classification has to be computed for every
matching document. Is it possible to do this (during search) for only
the top n matching documents, or do I have to use the search overload
that returns Hits and run the classification on the top n documents
after the Lucene search?
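
One way to picture the "top n only" variant is a collector that keeps just the n best-scoring hits in a bounded min-heap, so the expensive classification can run on at most n documents once the search has finished. The sketch below is deliberately Lucene-free: TopNCollector and ScoredDoc are hypothetical names, and collect(int, float) only mirrors the HitCollector callback signature so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical, Lucene-free stand-in: collect(int, float) mirrors the
// HitCollector callback, but the class extends nothing.
class TopNCollector {
  static final class ScoredDoc {
    final int doc;
    final float score;
    ScoredDoc(int doc, float score) { this.doc = doc; this.score = score; }
  }

  private final int n;
  // Min-heap on score: the head is the weakest of the current top n.
  private final PriorityQueue<ScoredDoc> heap =
      new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

  TopNCollector(int n) {
    if (n <= 0) throw new IllegalArgumentException("n must be positive");
    this.n = n;
  }

  // Called once per matching document; cheap, no classification here.
  public void collect(int doc, float score) {
    if (heap.size() < n) {
      heap.add(new ScoredDoc(doc, score));
    } else if (score > heap.peek().score) {
      heap.poll();                        // evict the current weakest hit
      heap.add(new ScoredDoc(doc, score));
    }
  }

  // Retained hits, best score first.
  List<ScoredDoc> topDocs() {
    List<ScoredDoc> out = new ArrayList<>(heap);
    out.sort((a, b) -> Float.compare(b.score, a.score));
    return out;
  }
}
```

After the search, the classifier would be applied only to the entries of topDocs(). The trade-off is that a liked document ranked below position n is never seen by the classifier.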

Is there another, more efficient way to change the scoring done by the
search(..) method?

