lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From manjula wijewickrema <manjul...@gmail.com>
Subject Re: Editing StopWordList
Date Tue, 21 Dec 2010 10:24:47 GMT
Hi Gupta,

Thanx a lot for your reply. But I could not understand whether I could
modify (adding more words) to the default stop word list or should I have to
make a new list as an array as follows.
public string[] NEW_STOP_WORDS = { "a", "and", "are", "as", "at", "be",
"but", "by", "for", "if", "in", "into", "is", "no", "not", "of", "on", "or",

"s", "such", "t", "that", "the", "their", "then", "there", "these", "they",
"this", "to", "was", "will", "with",
"inc","incorporated","co.","ltd","ltd.", "we", "you", "your", "us",
etc...};

then call it as follows,

SnowballAnalyzer analyzer = *new* SnowballAnalyzer("English",
StopAnalyzer.NEW_STOP_WORDS );
Am I correct?
Or if not could you explain me how can I do this?

Thanx in advance.
Manjula.

On Tue, Dec 21, 2010 at 10:36 AM, Anshum <anshumg@gmail.com> wrote:

> Hi Manjula,
> You could initialize the Analyzer using a modified stop word set. Use
> the *StopAnalyzer.ENGLISH_STOP_WORDS_SET
> *to get the default stopset and then add your own words to it. You could
> then initialize the analyzer using this new stop set instead of the default
> stop set.
> Hope that helps.
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
> <manjula53@gmail.com>wrote:
>
> > Hi,
> >
> > 1) In my application, I need to add more words to the stop word list.
> > Therefore, is it possible to add more words into the default lucene stop
> > word list?
> >
> > 2) If is it possible, then how can I do this?
> >
> > Appreciate any comment from you.
> >
> > Thanks,
> > Manjula.
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message