lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: PrefixQuery with short prefix does not match documents
Date Sat, 25 May 2013 15:44:59 GMT
I suspect this is because you set the TopTermsScoringBooleanQueryRewrite
rewrite method on the PrefixQuery: it keeps "only" the top 10K terms, so
if g* matches more than 10K terms, the remaining terms are dropped and
documents that contain only dropped terms will not match.
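
As a rough illustration of that effect, here is a minimal sketch in plain Java. It does not use Lucene's API: the class TopTermsSketch, the expand method, and the tiny term list are all hypothetical, and which terms survive is assumed to be "the first N in sorted order", which is a simplification of whatever ordering Lucene applies internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class TopTermsSketch {
    // Expand a prefix against a sorted term dictionary, keeping at most
    // maxTerms matches. Terms beyond the cap are silently dropped.
    // (Assumption: "first N in sorted order" stands in for Lucene's ordering.)
    static List<String> expand(TreeSet<String> terms, String prefix, int maxTerms) {
        List<String> kept = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the range of terms sharing this prefix
            }
            if (kept.size() == maxTerms) {
                break; // cap reached: every later matching term is lost
            }
            kept.add(term);
        }
        return kept;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(
                Arrays.asList("gable", "game", "garden", "german", "gold"));
        // Short prefix: five terms match but only three are kept,
        // so "german" is not part of the rewritten query.
        System.out.println(expand(terms, "g", 3));
        // Longer prefix: one term matches, well under the cap.
        System.out.println(expand(terms, "ge", 3));
    }
}
```

With a cap of 3 standing in for 10K, the "g" expansion loses "german" while the "ge" expansion keeps it, which mirrors the behaviour described in the original report.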

You may want to index short prefixes into the index instead, e.g.
using EdgeNGramTokenFilter, and then cut over to PrefixQuery when the
prefix is "long enough".
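
The idea can be sketched without Lucene as two small in-memory structures. Everything here is an assumption made for illustration: the class EdgePrefixSketch, the MAX_GRAM cutover length of 2, and the use of integer document ids. In a real index the short prefixes would come from an edge n-gram analysis chain and the long-prefix path would be a PrefixQuery.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class EdgePrefixSketch {
    // Hypothetical cutover point: prefixes up to this length are indexed directly.
    static final int MAX_GRAM = 2;

    // Short prefix -> ids of documents whose term starts with it.
    final Map<String, Set<Integer>> gramIndex = new HashMap<>();
    // Full term -> document ids, kept sorted so prefixes can be range-scanned.
    final TreeMap<String, Set<Integer>> termIndex = new TreeMap<>();

    void add(int docId, String term) {
        String t = term.toLowerCase(Locale.ROOT);
        termIndex.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        // Index every leading substring up to MAX_GRAM characters.
        for (int len = 1; len <= Math.min(MAX_GRAM, t.length()); len++) {
            gramIndex.computeIfAbsent(t.substring(0, len), k -> new TreeSet<>()).add(docId);
        }
    }

    Set<Integer> search(String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        if (p.length() <= MAX_GRAM) {
            // Short prefix: a single exact lookup, so no term expansion and no cap.
            return gramIndex.getOrDefault(p, Collections.<Integer>emptySet());
        }
        // Long prefix: scan the (now small) range of terms that share it.
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<String, Set<Integer>> e : termIndex.tailMap(p).entrySet()) {
            if (!e.getKey().startsWith(p)) {
                break;
            }
            hits.addAll(e.getValue());
        }
        return hits;
    }
}
```

The design choice is that a one- or two-character prefix becomes an ordinary single-term lookup, so the number of distinct terms starting with "g" no longer matters; only longer prefixes, which match far fewer terms, go through expansion.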

This is the approach I took with the index-based suggester on
https://issues.apache.org/jira/browse/LUCENE-4845 ...

Mike McCandless

http://blog.mikemccandless.com


On Fri, May 24, 2013 at 7:06 PM, Steven Schlansker <steven@likeness.com> wrote:
> Hi everyone,
>
> I am building an autocomplete index.  The index contains both the names and a small set
> of fixed types.
> The intention is that type matches will always come first, followed by name matches.
>
> I am using a PrefixQuery to do substring matching.  Confusingly, I am finding that very
> short prefix matches sometimes will return no results when combined with an additional
> filter.
>
> For example, I have a document "body:german type:TYPE".  The query "+(type:TYPE) +body:ge*"
> matches this document.
> The query "+(type:TYPE) +body:g*" does not.  Double confusingly, it works fine in Luke
> -- just not when I build the query by hand.
>
> Here is how I create the document:
>
> Document doc = new Document();
> doc.add(new Field("body", "German", TextField.TYPE_STORED));
> doc.add(new Field("type", "TYPE", StringField.TYPE_STORED));
>
> Here is how I build the query:
>
> // Concrete types are declared because add() and setRewriteMethod() are not defined on Query.
> BooleanQuery allowedTypes = new BooleanQuery();
> allowedTypes.add(new TermQuery(new Term("type", "TYPE")), Occur.SHOULD);
>
> PrefixQuery prefixQuery = new PrefixQuery(new Term("body", "ge"));
> prefixQuery.setRewriteMethod(new MultiTermQuery.TopTermsScoringBooleanQueryRewrite(10000));
>
> BooleanQuery mainQuery = new BooleanQuery();
> mainQuery.add(allowedTypes, Occur.MUST);
> mainQuery.add(prefixQuery, Occur.MUST);
>
> Am I missing something obvious?
>
> Thanks,
> Steven Schlansker
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


