lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: StandardAnalyzer exclude numbers
Date Mon, 22 Sep 2008 12:36:13 GMT
jim@tera.gr wrote:
> Hello
>
> Is it possible to exclude numbers using StandardAnalyzer just like 
> SimpleAnalyzer?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
It's possible, but it's tricky. You would copy the StandardAnalyzer 
into your own Analyzer and then modify the grammar. 
StandardTokenizerImpl.jflex is where to look, but you will have to learn 
how to use and compile JFlex (look at the build file) to regenerate the 
parser classes. Start by trying to remove the digit from the Alphanum 
regex in StandardTokenizerImpl.jflex. You might want to rename Alphanum 
after such a change, since it would no longer match digits. That may be 
as far as you need to go.
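
To get a feel for what dropping the digit from that rule does before 
touching the grammar, here is a rough sketch in plain Java. It does not 
use Lucene or JFlex at all: the class name, the two patterns and the 
tokenize helper are made up for illustration, and the patterns are only 
a loose stand-in for the real rule, not a copy of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene.
public class TokenPatternSketch {
    // Loose stand-in for an alphanumeric rule: runs of letters or digits.
    static final Pattern LETTERS_AND_DIGITS = Pattern.compile("[\\p{L}\\p{Nd}]+");
    // The same rule with the digit class removed: runs of letters only.
    static final Pattern LETTERS_ONLY = Pattern.compile("\\p{L}+");

    // Collects every match of the rule in the text, lowercased.
    static List<String> tokenize(Pattern rule, String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = rule.matcher(text);
        while (m.find()) {
            tokens.add(m.group().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Lucene 2 released in 2008 with abc123";
        // prints [lucene, 2, released, in, 2008, with, abc123]
        System.out.println(tokenize(LETTERS_AND_DIGITS, text));
        // prints [lucene, released, in, with, abc]
        System.out.println(tokenize(LETTERS_ONLY, text));
    }
}
```

Note that in this sketch a mixed token such as abc123 comes out as abc. 
In the real grammar, other rules may still match digits after the 
change, so check the output of the rebuilt tokenizer on your own data.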


- Mark
