lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Strange results returned from suggester
Date Sat, 28 Jan 2017 10:31:12 GMT
Hi Greg,

OK, StandardAnalyzer does indeed use StopFilter, with the English stop
words by default, and that set includes "will", so this explains what
you are seeing.

I suggest making your own analyzer just like StandardAnalyzer, except
instead of StopFilter use the SuggestStopFilter class.

That class was created for exactly the situation you're in: "will" is
not filtered out as a stop word, since the user may still be typing a
longer word, but "will " is filtered (because it ends with a token
separator).
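
To make the rule concrete, here is a minimal plain-Java sketch of the
behavior described above. It is not the Lucene implementation: the class
name, the tiny stop set, and the regex tokenizer are all made up for
illustration, and it only assumes that a trailing stop word is kept when
the input does not end with a separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SuggestStopSketch {

    // Hypothetical stop set for illustration; only "will" matters here.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "the", "will"));

    /** Lowercases the input and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /** Plain stop filtering: every stop word is dropped, wherever it occurs. */
    static List<String> plainStopFilter(String input) {
        List<String> kept = new ArrayList<>();
        for (String token : tokenize(input)) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    /**
     * Suggest-style stop filtering: a stop word survives only when it is the
     * last token and the input ends inside that token, because it may be the
     * prefix of a longer word the user has not finished typing.
     */
    static List<String> suggestStopFilter(String input) {
        List<String> tokens = tokenize(input);
        boolean endsInsideToken = !input.isEmpty()
                && Character.isLetterOrDigit(input.charAt(input.length() - 1));
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            boolean maybePrefix = (i == tokens.size() - 1) && endsInsideToken;
            if (!STOP_WORDS.contains(token) || maybePrefix) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(plainStopFilter("will"));    // prints []
        System.out.println(suggestStopFilter("will"));  // prints [will]
        System.out.println(suggestStopFilter("will ")); // prints []
    }
}
```

With plain filtering, "will" leaves no tokens at all, so there is
nothing to match or highlight. In a real analyzer the equivalent step
would be wrapping the token stream in SuggestStopFilter (which takes the
input TokenStream and a CharArraySet of stop words) in place of
StopFilter.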

Either that or pass an empty stop word set to StandardAnalyzer, but
then you have no stop word filtering.

This short blog post explains SuggestStopFilter:
http://blog.mikemccandless.com/2013/08/suggeststopfilter-carefully-removes.html

Mike McCandless

http://blog.mikemccandless.com


On Sat, Jan 28, 2017 at 3:39 AM, Greg Huber <gregh3269@gmail.com> wrote:
> Michael,
>
> I am using the standard analyzer with no stop words, and it is built from an
> existing lucene index.
>
> org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester
>
> I am overriding the addContextToQuery to make it an AND rather than an OR
>
> public void addContextToQuery(Builder query, BytesRef context, Occur clause) {
>     // Always require the context (AND), ignoring the clause passed in
>     query.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context)),
>             BooleanClause.Occur.MUST);
> }
>
> Cheers Greg
>
> On 27 January 2017 at 18:20, Michael McCandless <lucene@mikemccandless.com>
> wrote:
>>
>> Which suggester are you using?
>>
>> Maybe you are using a suggester with an analyzer, and your analysis
>> chain includes a StopFilter and "will" is a stop word?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Fri, Jan 27, 2017 at 10:42 AM, Greg Huber <gregh3269@gmail.com> wrote:
>> > Hello,
>> >
>> > Is there any way to see why items are returned from the suggester,
>> > similar to the search?
>> >
>> > I have a really strange case where if I enter 'will' (without the quotes)
>> > it seems to return all the search results.
>> >
>> > example:
>> >
>> > there should be two entries beginning with will*, i.e. william and Willoughby
>> >
>> > wil >  two entries with correct highlight
>> > will > all entries with NO highlight
>> > willi > single entry
>> > willo > single entry
>> >
>> > I have checked and I do not have will on all the entries!
>> >
>> > Cheers Greg
>
>


