lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ramy Hardan <j...@hardan.de>
Subject Search Refinement Approaches
Date Sat, 07 Feb 2004 22:32:35 GMT
Hi,

Reviewing javadocs and previous posts, search refinement or 'search
within search' is best done with a Filter. To fill the Filter's BitSet
with the results of a search, a HitCollector is the obvious solution.
Unfortunately when using HitCollector I have to implement all the
functionality the Hits class usually provides myself.

Is there an efficient way to search refinement preferably without
losing the Hits class? I can think of the following approaches:

- Don't use Hits: collect all scores and document numbers with a
  HitCollector and sort them by score after the search. Retrieve the
  needed documents from IndexReader via document number.
- Use Hits: Briefly examining the source reveals this possiblilty:
  subclass BitSet and override the boolean get(int bitIndex) method to
  additionally set the bit at bitIndex in another BitSet. Use this
  subclass in a Filter and initialize it with all ones (in the first
  search). This way I can tell which documents are tested by the
  IndexSearcher against the Filter by examining the second BitSet and
  use it as a Filter for the refining search. Here's a scetch of this
  for clarification:

  public class FilterBitSet extends BitSet {
    private BitSet bitsForRefiningFilter;

    public boolean get( int bitIndex ) {
      boolean result = super.get( bitIndex );
      if (result) bitsForRefiningFilter.set( bitIndex );
      return result;
    }
  }

  Is this really possible? (might be more of a question for dev)

Last question about document numbers:
When and how exactly do they change? The javadoc states they change
upon addition and deletion. May I assume that a particular document
number is stable as long as it is not changed (deleted and added)
although other documents are added/deleted and optimize() is NOT
called? If yes, is this about to change in the foreseeable future?

Thanks in advance

Ramy

  



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message