lucene-java-user mailing list archives

From Pedro Lacerda <pslace...@gmail.com>
Subject Re: How to avoid filtering stop words like "IS" in StandardAnalyzer
Date Mon, 30 Jan 2012 10:22:44 GMT
I didn't know about CharArraySet.EMPTY_SET, thanks.

Pedro Lacerda



2012/1/29 Uwe Schindler <uwe@thetaphi.de>

> Hi,
>
> If you want to disable *all* stop words, then CharArraySet.EMPTY_SET is the
> right choice. For performance reasons, you should also use a CharArraySet
> instead of a plain HashSet<String> when the stop word set is non-empty.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Cheng [mailto:zhoucheng2008@gmail.com]
> > Sent: Sunday, January 29, 2012 3:33 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: How to avoid filtering stop words like "IS" in StandardAnalyzer
> >
> > Pedro's suggestion seems to work fine. Not sure where I should use
> > CharArraySet.EMPTY_SET.
> >
> > On Sat, Jan 28, 2012 at 6:56 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
> >
> > > Or even better: CharArraySet.EMPTY_SET - sorry for noise.
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > > > Sent: Saturday, January 28, 2012 12:52 PM
> > > > To: java-user@lucene.apache.org
> > > > Subject: RE: How to avoid filtering stop words like "IS" in StandardAnalyzer
> > > >
> > > > Right, but Collections.emptySet() should be used :-)
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Pedro Lacerda [mailto:pslacerda@gmail.com]
> > > > > Sent: Saturday, January 28, 2012 12:49 PM
> > > > > To: java-user@lucene.apache.org
> > > > > Subject: Re: How to avoid filtering stop words like "IS" in
> > > > > StandardAnalyzer
> > > > >
> > > > > Hi Cheng,
> > > > >
> > > > > > You can provide your own set of stop words as the second argument of
> > > > > > the StandardAnalyzer constructor.
> > > > >
> > > > > new StandardAnalyzer(version, new HashSet());
> > > > >
> > > > >
> > > > >
> > > > > Pedro Lacerda
> > > > >
> > > > >
> > > > >
> > > > > 2012/1/28 Cheng <zhoucheng2008@gmail.com>
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I don't want certain stop words to be filtered out by the
> > > > > > > StandardAnalyzer. Can I do so?
> > > > > >
> > > > > > Ideally, I would like to have a customized StandardAnalyzer.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
>
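
To make the effect discussed in the thread concrete, here is a minimal plain-Java sketch of what a stop-word set does during analysis. It is not Lucene code: the class StopWordSketch and the method tokenize are hypothetical names, and the lowercase-and-split tokenization is only a rough stand-in for what StandardAnalyzer does. The point it shows is that an empty stop-word set lets every token through, including "is". In the Lucene version discussed above, the corresponding step would be passing CharArraySet.EMPTY_SET as the second argument of the StandardAnalyzer constructor, as Uwe suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for an analyzer's stop-word filtering step.
// NOT Lucene code: StopWordSketch and tokenize are hypothetical names.
final class StopWordSketch {

    // Lowercases the text, splits it on runs of characters that are neither
    // letters nor digits, and drops every token contained in stopWords.
    static List<String> tokenize(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !stopWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "This is a test";

        // A non-empty stop-word set removes "is" and "a".
        Set<String> someStopWords = new HashSet<>(Arrays.asList("is", "a"));
        System.out.println(tokenize(text, someStopWords)); // prints [this, test]

        // An empty stop-word set disables filtering, so "is" is kept.
        System.out.println(tokenize(text, Collections.<String>emptySet())); // prints [this, is, a, test]
    }
}
```

The stop-word set is passed in as a parameter rather than hard-coded, which mirrors the constructor-argument approach Pedro describes: the caller decides which words, if any, are dropped.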
