lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Quotes dependent StopWords removal
Date Wed, 16 Aug 2006 00:31:37 GMT
This appears tricky to me. I may be completely wrong, but I would start
by looking at the StandardAnalyzer. I would try to create a new token
that matches an opening quote. I would then change the next() method
generated from StandardTokenizer.jj (the grammar behind StandardAnalyzer)
to note when it recognizes an opening quote. Now you are in a quote.
Somehow mark each token (there might not be an obvious way to do this)
until you find the closing quote. Now note that you are no longer in a
quote, and stop marking the tokens coming out of next(). Then, in the
StopFilter, check each token for your marker and do not remove it if it
is marked.
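
To make the idea concrete, here is a rough sketch in plain Java. It does
not use the Lucene API or the JavaCC grammar at all: the class
QuoteAwareStopSketch, the MarkedToken type and both methods are made-up
names for illustration, and the whitespace tokenizer is only a stand-in
for StandardTokenizer. It just shows the two halves of the approach:
mark tokens while inside a quote, then have the stop word step skip
anything that is marked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical names throughout; plain Java, not the Lucene API.
class QuoteAwareStopSketch {

    // A token plus the "was inside a quote" marker.
    static final class MarkedToken {
        final String text;
        final boolean inQuote;

        MarkedToken(String text, boolean inQuote) {
            this.text = text;
            this.inQuote = inQuote;
        }
    }

    // Splits on whitespace and double quotes, lowercases each token, and
    // records whether it was emitted while a quote was open.
    static List<MarkedToken> tokenize(String input) {
        List<MarkedToken> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"' || Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    tokens.add(new MarkedToken(current.toString(), inQuote));
                    current.setLength(0);
                }
                if (c == '"') {
                    inQuote = !inQuote; // opening or closing quote
                }
            } else {
                current.append(Character.toLowerCase(c));
            }
        }
        if (current.length() > 0) {
            tokens.add(new MarkedToken(current.toString(), inQuote));
        }
        return tokens;
    }

    // The stop word step: drop stop words unless the token is marked.
    static List<String> removeStopWords(List<MarkedToken> tokens,
                                        Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (MarkedToken t : tokens) {
            if (t.inQuote || !stopWords.contains(t.text)) {
                kept.add(t.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "to", "or", "is"));
        String query = "the question is \"to be or not to be\"";
        // prints [question, to, be, or, not, to, be]
        System.out.println(removeStopWords(tokenize(query), stop));
    }
}
```

In a real analyzer the marker would have to travel on the Lucene token
itself, which is the part I am not sure how to do cleanly.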

Bear in mind...this may all be worthless speculation...

- Mark
