lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bill Taylor <watay...@as-st.com>
Subject Re: Installing a custom tokenizer
Date Tue, 29 Aug 2006 19:30:30 GMT

On Aug 29, 2006, at 2:47 PM, Chris Hostetter wrote:

>
> : Have a look at PerFieldAnalyzerWrapper:
>
> :  
> http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/ 
> PerFieldAnalyzerWrapper.html
>
> ...which can be specified in the constructors for IndexWriter and
> QueryParser.

As I understand it, this allows me to specify a different analyzer for  
each field name.  My problem is that the standard analyzer will not  
work for my content field and I need to define a new one.  I need to  
make a modification to the StandardTokenizer so that a number does not  
need to have a digit in every other segment of a part number.

For example, the StandardTokenizer breaks aa-bb-2 on the - between aa  
and bb because it demands that every other string between a - have a  
digit.

I need to modify the .jj file for the Standard Tokenizer and get a new  
one, but I am confused by the javaCC documentation and do not know how  
to run it to get what I need.

Thanks for the help.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message