lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Boolean retrieval
Date Tue, 07 Jul 2009 09:39:54 GMT

Seems a long-winded way of producing a BooleanFilter but I guess you are trying to work with
user input in the form of query strings.

The bug in your code is that clause.getQuery().getString() is not producing terms that are
in your index - the first call to getTermsFilter passes the string  "+f1:aaa +f1:bbb" which
is not a term in the index.

Given the requirement is to ignore scoring I would recommend (as someone else suggested) looking
at the IndexSearch.search method that takes a HitCollector and simply accumulate all results,
regardless of score.





----- Original Message ----
From: Lukas Michelbacher <mmmasterluke@gmail.com>
To: java-user@lucene.apache.org
Sent: Tuesday, 7 July, 2009 9:53:24
Subject: Re: Boolean retrieval

To test my Boolean queries, I have a small test collection where each document
contains one of 1024 possible combinations of the strings "aaa", "bbb",
... "jjj".  I tried wrapping a Boolean query like this (it's based on an
older post to this list [1])


private static TermsFilter getTermsFilter(String field, String text) {
  TermsFilter tf = new TermsFilter();
  tf.addTerm(new Term(field, text));
  return tf;
}

Query q = new QueryParser("f1", new StandardAnalyzer()).parse("(aaa
AND bbb) OR ccc");
IndexSearcher searcher = new IndexSearcher(indexDir);
TopDocCollector collector = new TopDocCollector(1024);

BooleanQuery bc = (BooleanQuery) q;
BooleanFilter finalFilter = new BooleanFilter();
BooleanFilter boolFilt = new BooleanFilter();

// add each clause of the original query to the filter
for (BooleanClause clause : bc.getClauses()) {
  boolFilt.add(new FilterClause(getTermsFilter("f1",
clause.getQuery().toString()), clause.getOccur()));
  System.out.println(clause.getQuery().toString());
}

finalFilter.add(new FilterClause(boolFilt, BooleanClause.Occur.MUST));

ConstantScoreQuery csq = new ConstantScoreQuery(finalFilter);
searcher.search(csq, finalFilter, collector);

ScoreDoc[] hits = collector.topDocs().scoreDocs;
System.out.println("Found " + collector.getTotalHits() + " hits");

The result is 0 hits (should be 640).

[1] tinyurl.com/ml52ye

2009/7/4 Mark Harwood <markharw00d@yahoo.co.uk>:
>
> Check out booleanfilter in contrib/queries. It can be wrapped in a constantScoreQuery
>
>
>
> On 4 Jul 2009, at 17:37, Lukas Michelbacher <michells@ims.uni-stuttgart.de> wrote:
>
>
> This is about an experiment comparing plain Boolean retrieval with
> vector-space-based retrieval.
>
> I would like to disable all of Lucene's scoring mechanisms and just
> run a true Boolean query that returns exactly the documents that match a
> query specified in Boolean syntax (OR, AND, NOT). No scoring or sorting
> required.
>
> As far as I can see, this is not supported out of the box.  Which classes
> would I have to modify?
>
> Would it be enough to create a subclass of Similarity and to ignore all terms but one
(coord, say) and make this term return 1 if the query matches the document and 0 otherwise?
>
> Lukas
>
> --
> Lukas Michelbacher
> Institute for Natural Language Processing
> Universit├Ąt Stuttgart
> email: michells@ims.uni-stuttgart.de

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


      

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message