lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: PrefixQuery.rewrite
Date Fri, 17 Apr 2009 21:07:56 GMT
Hi Dave,

The code is correct; here are my comments:

> This
> code, as I understand it, is designed to expand a prefix wildcard and
> rewrite the query as a long boolean series of ANDs.
> 
> To improve performance, the code has a break statement designed to exit
> the loop when the TermEnum starts enumerating another field.
> 
>   //FROM /src/java/org/apache/lucene/search/PrefixQuery.java
>   public Query rewrite(IndexReader reader) throws IOException {
>     BooleanQuery query = new BooleanQuery(true);

Here a new TermEnum is created, starting at the term prefix = new
Term(field, prefixText). A TermEnum is ordered by (field, term text).
reader.terms(term) returns a TermEnum positioned exactly at the given term
or, if that term does not exist, at the next term after it in that order:

>     TermEnum enumerator = reader.terms(prefix);
>     try {
>       String prefixText = prefix.text();
>       String prefixField = prefix.field();
>       do {
>         Term term = enumerator.term();

This check does exactly what you think; it is the exit condition:
If the term is from another field, exit.
If the term is null, the enumeration is exhausted, so exit.
If the term does not start with the prefix, also exit. This condition is
sufficient: because the terms are sorted, all terms sharing the prefix are
adjacent, so the first non-matching term ends the scan. If the enum was
initially positioned exactly on the prefix term itself, that term really is
the first match, and no term was skipped. If the initial term is not the
prefix term but a larger one, there are two cases:
a) it starts with the prefix -> keep iterating
b) it does not start with the prefix -> no term with that prefix exists in
this field.

>         if (term != null &&
>             term.text().startsWith(prefixText) &&
>             term.field() == prefixField) // interned comparison
>         {
>           TermQuery tq = new TermQuery(term);         // found a match
>           tq.setBoost(getBoost());                    // set the boost
>           query.add(tq, BooleanClause.Occur.SHOULD);  // add to query
>           //System.out.println("added " + term);
>         } else {
>           break;
>         }
>       } while (enumerator.next());
>     } finally {
>       enumerator.close();
>     }
>     return query;
>   }
> 
> I think that there may be a logic problem here: to me it seems that
> if I performed a prefix query on a Field that wasn't first in line
> during the TermEnum's output, my prefix would never be expanded.
> I may be misunderstanding the ordering that IndexReader.terms(Term)
> produces.
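
To make the ordering argument concrete, here is a minimal sketch that does
not use Lucene at all. It models the term dictionary as a java.util.TreeSet
sorted by (field, text), and uses tailSet() as a stand-in for the seek that
reader.terms(prefix) performs. The class PrefixScanSketch, its nested class
T, the helper sampleDictionary(), and the sample terms are all hypothetical
names invented for this illustration; only the scan-and-break logic mirrors
the quoted rewrite() method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PrefixScanSketch {

    /** Hypothetical stand-in for a Lucene Term: a (field, text) pair. */
    static final class T implements Comparable<T> {
        final String field;
        final String text;

        T(String field, String text) {
            this.field = field;
            this.text = text;
        }

        /** Order by field first, then by text, like the term dictionary. */
        @Override
        public int compareTo(T other) {
            int c = field.compareTo(other.field);
            return c != 0 ? c : text.compareTo(other.text);
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /** Collects every term in the given field whose text starts with prefixText. */
    static List<String> expandPrefix(TreeSet<T> dict, String field, String prefixText) {
        List<String> matches = new ArrayList<>();
        // tailSet(x, true) starts at x or, if x is absent, at the next larger
        // element. This is the assumed analogue of reader.terms(prefix).
        for (T t : dict.tailSet(new T(field, prefixText), true)) {
            if (!t.field.equals(field) || !t.text.startsWith(prefixText)) {
                break; // left the field or the prefix range: nothing more can match
            }
            matches.add(t.toString());
        }
        return matches;
    }

    /** Small sample dictionary; "title" is deliberately not the first field. */
    static TreeSet<T> sampleDictionary() {
        TreeSet<T> dict = new TreeSet<>();
        dict.add(new T("author", "smith"));
        dict.add(new T("body", "lucene"));
        dict.add(new T("title", "lucene"));
        dict.add(new T("title", "luke"));
        dict.add(new T("title", "solr"));
        dict.add(new T("zip", "lu1"));
        return dict;
    }

    public static void main(String[] args) {
        // prints [title:lucene, title:luke]
        System.out.println(expandPrefix(sampleDictionary(), "title", "lu"));
    }
}
```

The scan never visits the "author" or "body" terms, because the seek lands
directly inside the "title" field. That is why a field that is not first in
the enumeration order is still expanded correctly.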


Uwe



