lucene-dev mailing list archives

From Steven A Rowe <sar...@syr.edu>
Subject RE: Looking for a code pattern to pass stop words as an attribute
Date Tue, 21 Aug 2012 20:45:54 GMT
Hi Dawid,

Maybe you could use KeywordMarkerFilter, either directly or as a recipe for a StopwordMarkerFilter?
 

Note that KeywordAttribute is checked by most (all?) Lucene stemmers, which leave keyword-marked
tokens unstemmed, so I wouldn't use KeywordMarkerFilter itself if your analysis chain already
includes a stemmer.

Steve
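
[Editorial sketch, not part of the original message.] The "mark instead of remove" idea suggested above can be illustrated without any Lucene dependency. The plain-Java model below is only an assumption about the shape such a StopwordMarkerFilter might take: the class StopwordMarkerSketch, the Token type, and the markStopwords method are hypothetical names, and the stop set is assumed to hold lowercase entries. In a real analysis chain the flag would presumably live in a dedicated attribute set by a token filter, along the lines of how KeywordMarkerFilter sets KeywordAttribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model of "mark, don't remove"; no Lucene classes are used here.
public class StopwordMarkerSketch {

    // Hypothetical stand-in for a token carrying a stop word flag,
    // playing the role an attribute would play on a Lucene token stream.
    static final class Token {
        final String term;
        final boolean stopword;

        Token(String term, boolean stopword) {
            this.term = term;
            this.stopword = stopword;
        }
    }

    // Keeps every term, in order, and flags the ones found in the stop set.
    // Assumes the stop set contains lowercase entries.
    static List<Token> markStopwords(List<String> terms, Set<String> stopwords) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            boolean isStop = stopwords.contains(term.toLowerCase(Locale.ROOT));
            out.add(new Token(term, isStop));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<>(Arrays.asList("the", "a", "of"));
        List<String> terms = Arrays.asList("The", "history", "of", "Lucene");
        for (Token t : markStopwords(terms, stopwords)) {
            System.out.println(t.term + (t.stopword ? " [stop]" : ""));
        }
    }
}
```

Because the flag here is separate from any keyword marking, a downstream consumer can tell stop words apart while every token stays in the stream, which is the behaviour the original question asks for.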

-----Original Message-----
From: Dawid Weiss [mailto:dawid.weiss@gmail.com] 
Sent: Tuesday, August 21, 2012 4:34 PM
To: dev@lucene.apache.org
Subject: Looking for a code pattern to pass stop words as an attribute

Seeking advice.

I have an application where I need to know which tokens are stop
words. Most analyzers construct the token stream in a way that those
tokens are filtered out -- this isn't what I need, I want them in, but
marked somehow. The question is how to do it nicely and in a simple
way, possibly reusing existing token filters? I had a few ideas but
they all seem awkward -- let me know if I'm missing something obvious.

Dawid

