lucene-java-user mailing list archives

From "Herbert Wu" <...@welpine.com>
Subject RE: Special character & ; : % index/search question
Date Mon, 24 Jul 2006 15:34:13 GMT
Hi, Martin,
This may work if I can determine which fields contain the special chars. I
will look over the data and see whether that is possible.
Thanks.
-Herbert

-----Original Message-----
From: Martin Braun [mailto:mbraun@uni-hd.de] 
Sent: Monday, July 24, 2006 2:43 AM
To: java-user@lucene.apache.org
Subject: Re: Special character & ; : % index/search question

hi herbert,
>> WhitespaceAnalyzer looks brutal. Is it possible to keep
>> StandardAnalyzer and at the same time tell the parser to preserve a
>> list of chars during indexing?

Perhaps it would be sufficient to use the WhitespaceAnalyzer only for the
fields that contain the special characters, and keep StandardAnalyzer for
all other fields, by using a PerFieldAnalyzerWrapper?
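
To illustrate the per-field idea, here is a minimal plain-Java sketch. It
does not use Lucene at all: the class PerFieldDemo, the field name "code",
and both tokenizer functions are hypothetical stand-ins, and the "standard"
tokenizer is only a crude approximation of what StandardAnalyzer does. In
Lucene releases of this era the equivalent wiring would presumably be
new PerFieldAnalyzerWrapper(new StandardAnalyzer()) followed by
addAnalyzer("code", new WhitespaceAnalyzer()).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java stand-in for per-field analysis; NOT Lucene's API.
class PerFieldDemo {

    // Mimics a whitespace-only tokenizer: & ; : % survive inside tokens.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // Crude approximation of a "standard" tokenizer (assumption):
    // split on anything that is not a letter or digit, then lowercase.
    static List<String> simpleStandardTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) out.add(t.toLowerCase());
        }
        return out;
    }

    // Field-specific overrides; every other field uses the default.
    static final Map<String, Function<String, List<String>>> PER_FIELD = new HashMap<>();
    static {
        // "code" is a hypothetical field that holds the special characters.
        PER_FIELD.put("code", PerFieldDemo::whitespaceTokens);
    }

    // Dispatches on the field name, as a per-field wrapper would.
    static List<String> tokenize(String field, String text) {
        return PER_FIELD
                .getOrDefault(field, PerFieldDemo::simpleStandardTokens)
                .apply(text);
    }

    public static void main(String[] args) {
        String text = "&amp; 50% key:value";
        System.out.println(tokenize("code", text)); // [&amp;, 50%, key:value]
        System.out.println(tokenize("body", text)); // [amp, 50, key, value]
    }
}
```

The same per-field setup would have to be used at search time as well,
otherwise the query terms will not match what was indexed. Note also that
the query parser treats ":" as syntax, so such characters typically need
escaping in query strings.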

> 
> Add something like:
> 
> | < #MYCHARACTERS:
>       ("&" | ":" | "%" | ";")
>   >
> 
> to the StandardTokenizer.jj and rebuild it.
> 
> Might cause some lexical nondeterminism errors, so look out for those.

... and you have to remember to do this again on each Lucene update.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org





