lucene-dev mailing list archives

From "Yonik Seeley" <ysee...@gmail.com>
Subject Re: FilteredQuery within BooleanQuery issue
Date Sat, 04 Mar 2006 05:59:19 GMT
This is the first time I've looked at FilteredQuery, but the scorer is
indeed flawed IMO.
next() and skipTo() simply iterate over the documents that match the
query, and only score() is changed, to return 0 when the current
document doesn't match the filter.

          public boolean next() throws IOException { return scorer.next(); }
          public boolean skipTo(int i) throws IOException { return scorer.skipTo(i); }
          // if the document has been filtered out, set score to 0.0
          public float score() throws IOException {
            return (bitset.get(scorer.doc())) ? scorer.score() : 0.0f;
          }

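For a top-level FilteredQuery, the higher level search functions would
return the correct results since they filter out any documents with a
score <= 0.  Nested inside a BooleanQuery, though, a filtered-out
document still counts as a match for its clause, and its score of 0 is
masked as soon as another clause contributes a positive score.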
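
To see why the test case quoted below reports two hits, here is a rough
plain-Java model of the scorer above. It uses no Lucene classes. The
class name, the constant score of 1.0 for the wrapped match-all query,
and the "sum of clause scores" combination are all assumptions made for
illustration; the real BooleanQuery also applies weights and
normalization that are ignored here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of the flawed scoring path; not Lucene code.
final class FlawedFilteredScoring {

    // What the flawed scorer reports: every document is visited, and only
    // the score reflects whether the filter accepted it.
    static float flawedScore(BitSet filterBits, int doc) {
        float wrappedScore = 1.0f; // assumed score of the wrapped match-all query
        return filterBits.get(doc) ? wrappedScore : 0.0f;
    }

    // Conjunction of required clauses. Each clause "matches" every document,
    // so the clause scores are summed and only docs with sum > 0 are kept.
    static List<Integer> hits(int maxDoc, BitSet... requiredFilters) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            float sum = 0.0f;
            for (BitSet bits : requiredFilters) {
                sum += flawedScore(bits, doc);
            }
            if (sum > 0.0f) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Filter bits that accept exactly one document, like DummyFilter below.
    static BitSet only(int doc) {
        BitSet bits = new BitSet();
        bits.set(doc);
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(hits(3, only(0)));          // prints [0]
        System.out.println(hits(3, only(0), only(1))); // prints [0, 1]
    }
}
```

With a single clause the score check hides the problem. With two
required clauses, docs 0 and 1 each get a positive score from one clause
and so pass the check, while doc 2 scores 0 in both and is dropped.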

Check out LUCENE-330 for possible fixes.  (sorry, firefox is refusing
to paste the URL for me again...)
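
One possible direction, sketched in plain Java: have next() and skipTo()
themselves step past documents the filter rejects, so that an enclosing
conjunction never sees them. This is not taken from LUCENE-330 and does
not use Lucene's Scorer class. DocIterator, AllDocs, FilteringIterator
and FilteredConjunctionDemo are hypothetical names for this example
only.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical minimal iterator, loosely modelled on next()/skipTo()/doc().
interface DocIterator {
    boolean next();             // advance to the next matching doc
    boolean skipTo(int target); // advance to the first matching doc >= target
    int doc();                  // current doc, valid after a call returned true
}

// Stand-in for a match-all scorer over docs 0..maxDoc-1.
final class AllDocs implements DocIterator {
    private final int maxDoc;
    private int doc = -1;
    AllDocs(int maxDoc) { this.maxDoc = maxDoc; }
    public boolean next() { return ++doc < maxDoc; }
    public boolean skipTo(int target) {
        doc = Math.max(doc + 1, target); // always moves forward
        return doc < maxDoc;
    }
    public int doc() { return doc; }
}

// Wraps an iterator and exposes only the docs whose filter bit is set.
final class FilteringIterator implements DocIterator {
    private final DocIterator in;
    private final BitSet bits;
    FilteringIterator(DocIterator in, BitSet bits) { this.in = in; this.bits = bits; }

    // Keep advancing the wrapped iterator until the filter accepts its doc.
    private boolean advanceToAccepted(boolean positioned) {
        while (positioned && !bits.get(in.doc())) {
            positioned = in.next();
        }
        return positioned;
    }
    public boolean next() { return advanceToAccepted(in.next()); }
    public boolean skipTo(int target) { return advanceToAccepted(in.skipTo(target)); }
    public int doc() { return in.doc(); }
}

final class FilteredConjunctionDemo {
    static BitSet bits(int... docs) {
        BitSet b = new BitSet();
        for (int d : docs) b.set(d);
        return b;
    }

    // Docs on which both iterators are positioned (simple leapfrog intersection).
    static List<Integer> intersect(DocIterator a, DocIterator b) {
        List<Integer> out = new ArrayList<>();
        boolean more = a.next() && b.next();
        while (more) {
            if (a.doc() == b.doc()) {
                out.add(a.doc());
                more = a.next() && b.next();
            } else if (a.doc() < b.doc()) {
                more = a.skipTo(b.doc());
            } else {
                more = b.skipTo(a.doc());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Disjoint filters, as in the failing test: no document survives.
        System.out.println(intersect(
            new FilteringIterator(new AllDocs(3), bits(0)),
            new FilteringIterator(new AllDocs(3), bits(1)))); // prints []
    }
}
```

Because rejected documents are never returned as matches, the result no
longer depends on a later score > 0 check.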

-Yonik


On 3/3/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> I've run into what I feel is an issue with FilteredQuery.  The best
> description is an example.  First I've indexed three documents:
>
>    public void setUp() throws IOException {
>      RAMDirectory directory = new RAMDirectory();
>      IndexWriter writer = new IndexWriter(directory,
>          new WhitespaceAnalyzer(), true);
>      Document doc = new Document();
>      doc.add(new Field("field", "zero", Field.Store.YES,
>          Field.Index.TOKENIZED));
>      writer.addDocument(doc);
>
>      doc = new Document();
>      doc.add(new Field("field", "one", Field.Store.YES,
>          Field.Index.TOKENIZED));
>      writer.addDocument(doc);
>
>      doc = new Document();
>      doc.add(new Field("field", "two", Field.Store.YES,
>          Field.Index.TOKENIZED));
>      writer.addDocument(doc);
>      writer.close();
>
>      searcher = new IndexSearcher(directory);
>    }
>
> Now for a mock filter to keep things simple:
>
> public class DummyFilter extends Filter {
>    private int doc;
>
>    public DummyFilter(int doc) {
>      this.doc = doc;
>    }
>
>
>    public BitSet bits(IndexReader reader) throws IOException {
>      BitSet bits = new BitSet(reader.maxDoc());
>      bits.set(doc);
>      return bits;
>    }
> }
>
> And finally a test case that fails:
>
>    public void testBoolean() throws Exception {
>      BooleanQuery bq = new BooleanQuery();
>      Query query = new FilteredQuery(new MatchAllDocsQuery(),
>          new DummyFilter(0));
>      bq.add(query, BooleanClause.Occur.MUST);
>      query = new FilteredQuery(new MatchAllDocsQuery(),
>          new DummyFilter(1));
>      bq.add(query, BooleanClause.Occur.MUST);
>      Hits hits = searcher.search(bq);
>      assertEquals(0, hits.length());  // fails: hits.length() == 2
>    }
>
> I expect no documents to match this BooleanQuery, yet two
> documents match (ids 0 and 1).  Am I right in thinking that no
> documents should match, since each required clause selects a different
> document and so there is no intersection?  If so, what's the flaw in
> FilteredQuery that causes this?  If I'm wrong in my assertion, how so?
>
> For comparison, a ChainedFilter does do what I expect:
>
>    public void testChainedFilter() throws Exception {
>      ChainedFilter filter = new ChainedFilter(
>          new Filter[] {new DummyFilter(0), new DummyFilter(1)},
>          ChainedFilter.AND);
>      Hits hits = searcher.search(new MatchAllDocsQuery(), filter);
>      assertEquals(0, hits.length());  // passes
>    }
>
> Thanks,
>         Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>