lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Polites" <jason.poli...@gmail.com>
Subject Re: Stop words in index
Date Sat, 02 Sep 2006 16:56:21 GMT
Roger that.  I'll double check my code.

Thanks.

On 9/3/06, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
>
> They shouldn't be in the index.  You must be using StandardAnalyzer
> incorrectly, or maybe you think you are using it, but are really using
> something else.
>
> Otis
>
> ----- Original Message ----
> From: Jason Polites <jason.polites@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Saturday, September 2, 2006 9:05:27 AM
> Subject: Stop words in index
>
> Hey all,
>
> I am using the StandardAnalyzer with my own list of stop words (which is
> more comprehensive than the default list), and my expectation was that
> this
> would omit these stop words from the index when data is indexed using this
> analyzer.  However, I am seeing stop words in the term vector for
> documents
> indexed with this analyzer.
>
> Is this expected behaviour?  Is there any way I can force these stop words
> to be omitted from the index?  Having them in the index is wreaking havoc
> with term vector analysis to determine document similarity.
>
> Thanks.
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message