lucene-java-user mailing list archives

From Greg Huber <gregh3...@gmail.com>
Subject Re: Strange results returned from suggester
Date Sun, 29 Jan 2017 11:47:34 GMT
Uwe,

>...or use CustomAnalyzer then you don't need to
> subclass. Just declare the components.

If I need the StandardAnalyzer behavior (the class is marked final, and it
extends StopwordAnalyzerBase), how would I do this?

Cheers Greg

On 29 January 2017 at 11:32, Uwe Schindler <uwe@thetaphi.de> wrote:

> ...or use CustomAnalyzer then you don't need to subclass. Just declare the
> components.
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> > -----Original Message-----
> > From: Michael McCandless [mailto:lucene@mikemccandless.com]
> > Sent: Sunday, January 29, 2017 12:28 PM
> > To: Greg Huber <gregh3269@gmail.com>; Lucene Users <java-user@lucene.apache.org>
> > Subject: Re: Strange results returned from suggester
> >
> > That's right, just make your own analyzer, forked from
> > StandardAnalyzer, and change out the StopFilter.  The analyzer is a
> > tiny class and this (creating your own components in an analyzer) is
> > normal practice...
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> >
> > On Sat, Jan 28, 2017 at 6:09 AM, Greg Huber <gregh3269@gmail.com> wrote:
> > > Michael,
> > >
> > > Thanks for the update, so I just duplicate StandardAnalyzer and replace:
> > >
> > >
> > > //tok = new StopFilter(tok, stopwords);
> > >   tok = new SuggestStopFilter(tok, stopwords);
> > >
> > > in createComponents(..)
> > >
> > > Is there a way I can just override the method as in AnalyzingInfixSuggester
> > > rather than duplicating classes?
> > >
> > >
> > > Cheers Greg
> > >
> > > On 28 January 2017 at 10:31, Michael McCandless <lucene@mikemccandless.com> wrote:
> > >>
> > >> Hi Greg,
> > >>
> > >> OK StandardAnalyzer does indeed use StopFilter, with English stop
> > >> words by default, which includes "will", so this explains what you are
> > >> seeing.
> > >>
> > >> I suggest making your own analyzer just like StandardAnalyzer, except
> > >> instead of StopFilter use the SuggestStopFilter class.
> > >>
> > >> That class was created for exactly the situation you're in, so that
> > >> "will" would not be filtered out as a stop word, but "will " is
> > >> (because it ends with a token separator).
> > >>
> > >> Either that or pass an empty stop word set to StandardAnalyzer, but
> > >> then you have no stop word filtering.
> > >>
> > >> This short blog post explains SuggestStopFilter:
> > >>
> > >> http://blog.mikemccandless.com/2013/08/suggeststopfilter-carefully-removes.html
> > >>
> > >> Mike McCandless
> > >>
> > >> http://blog.mikemccandless.com
> > >>
> > >>
> > >> On Sat, Jan 28, 2017 at 3:39 AM, Greg Huber <gregh3269@gmail.com> wrote:
> > >> > Michael,
> > >> >
> > >> > I am using the StandardAnalyzer with no stop words, and the suggester is
> > >> > built from an existing Lucene index.
> > >> >
> > >> > org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester
> > >> >
> > >> > I am overriding addContextToQuery to make it an AND rather than an OR
> > >> > public void addContextToQuery(Builder query, BytesRef context, Occur clause) {
> > >> >     // Ignore the passed-in clause and always require the context (AND).
> > >> >     query.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context)),
> > >> >             BooleanClause.Occur.MUST);
> > >> > }
> > >> >
> > >> > Cheers Greg
> > >> >
> > >> > On 27 January 2017 at 18:20, Michael McCandless <lucene@mikemccandless.com> wrote:
> > >> >>
> > >> >> Which suggester are you using?
> > >> >>
> > >> >> Maybe you are using a suggester with an analyzer, and your analysis
> > >> >> chain includes a StopFilter and "will" is a stop word?
> > >> >>
> > >> >> Mike McCandless
> > >> >>
> > >> >> http://blog.mikemccandless.com
> > >> >>
> > >> >>
> > >> >> On Fri, Jan 27, 2017 at 10:42 AM, Greg Huber <gregh3269@gmail.com> wrote:
> > >> >> > Hello,
> > >> >> >
> > >> >> > Is there any way to see why items are returned from the suggester,
> > >> >> > similar to the search?
> > >> >> >
> > >> >> > I have a really strange case where if I enter 'will' (without the
> > >> >> > quotes) it seems to return all the search results.
> > >> >> >
> > >> >> > example:
> > >> >> >
> > >> >> > There should be two entries beginning with will*, i.e. william and
> > >> >> > Willoughby
> > >> >> >
> > >> >> > wil >  two entries with correct highlight
> > >> >> > will > all entries with NO highlight
> > >> >> > willi > single entry
> > >> >> > willo > single entry
> > >> >> >
> > >> >> > I have checked and I do not have will on all the entries!
> > >> >> >
> > >> >> > Cheers Greg
> > >> >
> > >> >
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
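
A note on the SuggestStopFilter rule described in the quoted messages: the sketch below is a toy model in plain Java, not Lucene's implementation. It assumes whitespace-only tokenization and a tiny made-up stop word set, and the names SuggestStopSketch, analyze and STOP_WORDS are hypothetical. It only illustrates the rule as Mike states it: a stop word is dropped unless it is the last token and the input does not end with a token separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of the "keep a trailing stop word while the user is still typing" rule.
// Not Lucene code: tokenization is whitespace-only and the stop set is illustrative.
final class SuggestStopSketch {

    // Hypothetical stop word set for the example; includes "will" as in the thread.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "the", "of", "will"));

    // Returns the tokens that survive filtering for the given raw input.
    static List<String> analyze(String input, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return kept;
        }
        // If the raw input ends with whitespace, the last token is complete.
        boolean stillTyping = !Character.isWhitespace(input.charAt(input.length() - 1));
        String[] tokens = trimmed.toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            boolean isLast = (i == tokens.length - 1);
            boolean isStop = stopWords.contains(tokens[i]);
            // Drop stop words, except a final token that may still be a prefix.
            if (!isStop || (isLast && stillTyping)) {
                kept.add(tokens[i]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(analyze("will", STOP_WORDS));   // [will]
        System.out.println(analyze("will ", STOP_WORDS));  // []
        System.out.println(analyze("the will of wil", STOP_WORDS)); // [wil]
    }
}
```

Under a plain stop filter, "will" would be removed in the first case as well, leaving no query terms. That would be consistent with the "all entries, no highlight" result reported at the start of the thread.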
