lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Lukas Michelbacher <mmmasterl...@gmail.com>
Subject Re: Boolean retrieval
Date Tue, 07 Jul 2009 08:53:24 GMT
To test my Boolean queries, I have a small test collection where each document
contains one of 1024 possible combinations of the strings "aaa", "bbb",
... "jjj".  I tried wrapping a Boolean query like this (it's based on an
older post to this list [1])


private static TermsFilter getTermsFilter(String field, String text) {
  TermsFilter tf = new TermsFilter();
  tf.addTerm(new Term(field, text));
  return tf;
}

Query q = new QueryParser("f1", new StandardAnalyzer()).parse("(aaa
AND bbb) OR ccc");
IndexSearcher searcher = new IndexSearcher(indexDir);
TopDocCollector collector = new TopDocCollector(1024);

BooleanQuery bc = (BooleanQuery) q;
BooleanFilter finalFilter = new BooleanFilter();
BooleanFilter boolFilt = new BooleanFilter();

// add each clause of the original query to the filter
for (BooleanClause clause : bc.getClauses()) {
  boolFilt.add(new FilterClause(getTermsFilter("f1",
clause.getQuery().toString()), clause.getOccur()));
  System.out.println(clause.getQuery().toString());
}

finalFilter.add(new FilterClause(boolFilt, BooleanClause.Occur.MUST));

ConstantScoreQuery csq = new ConstantScoreQuery(finalFilter);
searcher.search(csq, finalFilter, collector);

ScoreDoc[] hits = collector.topDocs().scoreDocs;
System.out.println("Found " + collector.getTotalHits() + " hits");

The result is 0 hits (should be 640).

[1] tinyurl.com/ml52ye

2009/7/4 Mark Harwood <markharw00d@yahoo.co.uk>:
>
> Check out booleanfilter in contrib/queries. It can be wrapped in a constantScoreQuery
>
>
>
> On 4 Jul 2009, at 17:37, Lukas Michelbacher <michells@ims.uni-stuttgart.de> wrote:
>
>
> This is about an experiment comparing plain Boolean retrieval with
> vector-space-based retrieval.
>
> I would like to disable all of Lucene's scoring mechanisms and just
> run a true Boolean query that returns exactly the documents that match a
> query specified in Boolean syntax (OR, AND, NOT). No scoring or sorting
> required.
>
> As far as I can see, this is not supported out of the box.  Which classes
> would I have to modify?
>
> Would it be enough to create a subclass of Similarity and to ignore all terms but one
(coord, say) and make this term return 1 if the query matches the document and 0 otherwise?
>
> Lukas
>
> --
> Lukas Michelbacher
> Institute for Natural Language Processing
> Universität Stuttgart
> email: michells@ims.uni-stuttgart.de

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message