lucene-general mailing list archives

From Claudio Deluca <decl...@gmail.com>
Subject Re: Combine WildcardQuerys
Date Wed, 23 Dec 2009 08:16:49 GMT
Hi Simon,
Thank you very much for your help. Search is working now.

Here is a code snippet of the working search implementation (with Lucene
2.9.1).

BooleanQuery theQuery = new BooleanQuery();

WildcardQuery wcQuerySurname = new WildcardQuery(new Term("surname", "foo"));
wcQuerySurname.setRewriteMethod(MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE);
theQuery.add(wcQuerySurname.rewrite(indexReader), Occur.MUST);

WildcardQuery wcQueryForename = new WildcardQuery(new Term("forename", "b*"));
wcQueryForename.setRewriteMethod(MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE);
theQuery.add(wcQueryForename.rewrite(indexReader), Occur.MUST);

theRewritten = indexSearcher.rewrite(theQuery);

Thanks,
Claudio

2009/12/22 Simon Willnauer <simon.willnauer@googlemail.com>

> Claudio,
> I might have confused you a bit with my last reply so I will try to
> elaborate a little bit.
> Internally, Lucene uses a method called Query#rewrite() that usually
> rewrites a query into a "primitive" query such as a BooleanQuery
> consisting of TermQuery instances. In most cases this is completely
> hidden from the API user and is only called directly to solve very
> specific and likely advanced problems.
> In order to score hits, the Lucene 2.4 WildcardQuery rewrites to a
> BooleanQuery that in turn consists of TermQueries, one for each term
> in the index that matches the wildcard pattern passed to it. If you
> have hundreds or thousands of terms matching a wildcard ("b*" is
> likely to match a lot of terms in the index), Lucene will internally
> raise an exception because the BooleanQuery has too many clauses (see
>
> http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/search/BooleanQuery.TooManyClauses.html
> for details).
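
To make the clause explosion described above concrete, here is a minimal toy model in plain Java. It is not Lucene code: the class, the exception, and the expandPrefix method are hypothetical names invented for this sketch, and the sorted set merely stands in for the term dictionary of one field. The only figure taken from Lucene is the default clause limit of 1024.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: none of these names are Lucene classes.
class WildcardExpansionSketch {

    // Stand-in for BooleanQuery's clause limit (Lucene's default is 1024).
    static final int MAX_CLAUSE_COUNT = 1024;

    // Hypothetical counterpart of BooleanQuery.TooManyClauses.
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(String message) {
            super(message);
        }
    }

    // Expands a prefix pattern such as "b*" into one clause per matching term,
    // failing as soon as the clause limit would be exceeded.
    static List<String> expandPrefix(SortedSet<String> termDictionary, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, starting at the prefix itself.
        for (String term : termDictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException("more than " + maxClauses + " clauses for prefix " + prefix);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> forenames = new TreeSet<>();
        forenames.add("anna");
        for (int i = 0; i < 2000; i++) {
            forenames.add("b" + i); // 2000 made-up forenames starting with "b"
        }
        System.out.println(expandPrefix(forenames, "a", MAX_CLAUSE_COUNT));
        try {
            expandPrefix(forenames, "b", MAX_CLAUSE_COUNT);
        } catch (TooManyClausesException e) {
            System.out.println("rewrite failed: " + e.getMessage());
        }
    }
}
```

In this model a selective pattern expands to a handful of clauses, while a short prefix over a large term dictionary hits the limit during expansion, before any search has run.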
>
> In Lucene 2.9 this can be handled differently with a specific rewrite
> method that does not rewrite to a BooleanQuery but, for instance, to a
> ConstantScoreQuery wrapping a Filter. To use this feature you need to
> upgrade to at least Lucene 2.9. See
>
> http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/search/MultiTermQuery.html
> and friends for details.
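
As a rough illustration of why a filter-style rewrite avoids the clause limit, the following toy model (again plain Java, not Lucene code) collects matching document ids into a bit set instead of building one clause per term. The class and method names and the postings maps are assumptions made up for this sketch; real Lucene filters work on the index reader rather than on in-memory maps.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: none of these names are Lucene classes.
class ConstantScoreFilterSketch {

    // Marks every document that contains a term starting with the prefix.
    // The bit set has a fixed size (maxDoc) however many terms match,
    // and a hit is simply "in" or "out", so all hits score the same.
    static BitSet matchPrefix(SortedMap<String, int[]> postings, String prefix, int maxDoc) {
        BitSet hits = new BitSet(maxDoc);
        for (Map.Entry<String, int[]> entry : postings.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // sorted order: no further term can share the prefix
            }
            for (int docId : entry.getValue()) {
                hits.set(docId);
            }
        }
        return hits;
    }

    // Two required conditions (Occur.MUST on both) amount to intersecting the bit sets.
    static BitSet mustBoth(BitSet first, BitSet second) {
        BitSet result = (BitSet) first.clone();
        result.and(second);
        return result;
    }

    public static void main(String[] args) {
        // term -> ids of the documents containing it, one map per field (made-up data)
        SortedMap<String, int[]> surnames = new TreeMap<>();
        surnames.put("bar", new int[] {1, 2});
        surnames.put("foo", new int[] {0});

        SortedMap<String, int[]> forenames = new TreeMap<>();
        forenames.put("bea", new int[] {0});
        forenames.put("ben", new int[] {1});
        forenames.put("carl", new int[] {2});

        BitSet hits = mustBoth(matchPrefix(surnames, "foo", 3), matchPrefix(forenames, "b", 3));
        System.out.println("matching doc ids: " + hits);
    }
}
```

The trade-off matches the one described in the thread: memory use depends on the number of documents rather than on the number of matching terms, and per-term scoring information is lost.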
>
> Simon
>
> On Tue, Dec 22, 2009 at 2:22 PM, Simon Willnauer
> <simon.willnauer@googlemail.com> wrote:
> > Hi Claudio,
> > your query setup is fine for what you are trying to do, but because you
> > are using wildcards, Lucene internally rewrites your WildcardQuery
> > into another BooleanQuery containing every term that starts with your
> > prefix "b". If you use such a short prefix, Lucene will likely create
> > tons of boolean clauses, one for each term starting with your prefix.
> > However, BooleanQuery has a limit (maxClauseCount) that triggers
> > the exception you are facing once you hit the clause count limit. You
> > can try to raise the limit in BooleanQuery, but you will very likely
> > end up with bad search performance.
> > Lucene 2.9 provides alternative rewrite methods for MultiTermQuery
> > subclasses like WildcardQuery that perform far better than the plain
> > boolean rewrite method. The trade-off for faster wildcard queries is
> > a constant score instead of the normal Lucene score a BooleanQuery
> > would produce, so each hit will have the same constant score assigned
> > for your WildcardQuery.
> >
> > Look at org.apache.lucene.search.MultiTermQuery.RewriteMethod for
> > details.
> >
> > hope that helps
> >
> > simon
> >
> > On 12/22/09, Claudio Deluca <decla86@gmail.com> wrote:
> >> Hello,
> >>
> >> We have currently implemented a search for persons by surname and
> >> forename with Lucene 2.4.1.
> >> If both search fields are filled, then we combine the WildcardQuerys in a
> >> BooleanQuery.
> >>
> >> BooleanQuery theQuery = new BooleanQuery();
> >> theQuery.add(new WildcardQuery(new Term("surname", "foo")), Occur.MUST);
> >> theQuery.add(new WildcardQuery(new Term("forename", "b*")), Occur.MUST);
> >> LuceneSearcherFactory theSearcherFactory =
> >> LuceneSearcherFactory.getInstance();
> >> Searcher theSearcher = theSearcherFactory.getSearcher();
> >> theRewritten = theSearcher.rewrite(theQuery);
> >>
> >> In the database there is exactly one person with surname "foo". When I
> >> comment out the second term (forename), the search works fine.
> >> If I run the search including the term "forename" "b*", the Searcher
> >> throws a TooManyClauses exception while trying to rewrite the query.
> >> While rewriting, the searcher seems to find too many possibilities for
> >> forenames beginning with "b".
> >>
> >> How do I have to combine the terms so that the Lucene search works properly?
> >>
> >> Thanks,
> >> Claudio
> >>
> >
>
