lucene-java-user mailing list archives

From Christopher Condit <con...@sdsc.edu>
Subject Re: Best practice for stemming and exact matching
Date Fri, 01 Apr 2011 16:09:14 GMT
>> Ideally I'd like to have the parser use the
>> custom analyzer for everything unless it's going to parse a clause into
>> a PhraseQuery or a MultiPhraseQuery, in which case it uses the
>> SimpleAnalyzer and looks in the _exact field - but I can't figure out
>> the best way to accomplish this.
> 
> You might be interested in the example i added to the QP tests in
> https://issues.apache.org/jira/browse/LUCENE-2892,
> it does something similar to what you are doing.
> 
> your situation is probably even easier than my example actually:
> 
> @Override
> protected Query getFieldQuery(String field, String queryText, boolean quoted)
>     throws ParseException {
>   if (quoted) {
>     // lie about the field to superclass if the user quoted it.
>     field = field + "_exact";
>   }
>   return super.getFieldQuery(field, queryText, quoted);
> }
> 
> if you are using perfieldanalyzerwrapper so that these "_exact" fields
> don't use stemming, then it should just work.
> you'll need branch_3.x or trunk for this, but looks like 3.1 is close

Using the 3.1 jars from Maven Central, the following test case shows
different behavior:
http://www.pastie.org/1744131

The output is:
getFieldQuerySlop: foo bar
a:"foo bar" b:"foo bar"
getFieldQueryQuoted: nacho
getFieldQuerySlop: foo bar
getFieldQuerySlop: chip
getFieldQueryQuoted: baz
+(a:nacho b:nacho) +(a:"foo bar" b:"foo bar") +(a:chip b:chip) +(a:baz b:baz)

So it would seem that all of the quoted phrases are going through the
slop method (with slop of 0). Would it be safe to alter the field in
that method instead?
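
To make the question concrete, here is a rough sketch of what redirecting the field in both hooks could look like. It deliberately avoids the Lucene classes: BaseParser, ExactParser, and the exact() helper are hypothetical stand-ins that build strings instead of Query objects. It also assumes the base slop hook delegates to the boolean hook, and that the field may arrive as null. It only illustrates the idea, including a guard against appending "_exact" twice; it does not show whether the change is safe in 3.1.

```java
// Hypothetical stand-ins that mimic the shape of the two hooks; NOT Lucene classes.
class ExactFieldSketch {

    /** Stand-in base parser: returns a query string instead of a Query. */
    static class BaseParser {
        // Hook for a single term or quoted text.
        protected String getFieldQuery(String field, String text, boolean quoted) {
            return quoted ? field + ":\"" + text + "\"" : field + ":" + text;
        }

        // Hook for quoted text with slop. Assumption: it delegates to the
        // boolean hook (a virtual call) and then applies the slop.
        protected String getFieldQuery(String field, String text, int slop) {
            String q = getFieldQuery(field, text, true);
            return slop > 0 ? q + "~" + slop : q;
        }
    }

    /** Redirects quoted text to "<field>_exact" in both hooks. */
    static class ExactParser extends BaseParser {
        private static final String SUFFIX = "_exact";

        // Leaves null fields alone (assumption: a parser may pass null for the
        // default field) and avoids a double suffix when one hook calls the other.
        private static String exact(String field) {
            if (field == null || field.endsWith(SUFFIX)) {
                return field;
            }
            return field + SUFFIX;
        }

        @Override
        protected String getFieldQuery(String field, String text, boolean quoted) {
            return super.getFieldQuery(quoted ? exact(field) : field, text, quoted);
        }

        @Override
        protected String getFieldQuery(String field, String text, int slop) {
            // The slop hook only receives quoted text, so always redirect.
            return super.getFieldQuery(exact(field), text, slop);
        }
    }

    public static void main(String[] args) {
        ExactParser p = new ExactParser();
        System.out.println(p.getFieldQuery("a", "nacho", false));
        System.out.println(p.getFieldQuery("a", "foo bar", 0));
    }
}
```

If the slop hook in the real parser does not route through the boolean hook, the guard in exact() is harmless; if it does, the guard keeps the field from becoming "a_exact_exact".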

Thanks,
-Chris

