lucene-java-user mailing list archives

From "Mohammad Norouzi" <mnr...@gmail.com>
Subject Re: WhitespaceAnalyzer [was: Re: regaridng Reader.terms()]
Date Thu, 24 May 2007 05:34:43 GMT
Sorry Steven,
the change is in WhitespaceTokenizer, not in WhitespaceAnalyzer. In my
Analyzer I only had to call the customized tokenizer.
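
To make the difference between the two isTokenChar variants quoted below
concrete, here is a small standalone sketch. It does not use Lucene at all:
the class TokenCharDemo and its split helper are made up for illustration
and only imitate how a character-predicate tokenizer cuts text, so treat it
as an approximation rather than what WhitespaceTokenizer does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Lucene.
class TokenCharDemo {

    // Same test as the original method quoted in this thread.
    static boolean isTokenCharOriginal(char c) {
        return !Character.isWhitespace(c);
    }

    // Same test as the override: only U+0020 (decimal 32) ends a token.
    static boolean isTokenCharSpaceOnly(char c) {
        return c != ' ';
    }

    // Collects maximal runs of "token characters" (illustrative helper).
    static List<String> split(String text, boolean spaceOnly) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean tokenChar = spaceOnly
                    ? isTokenCharSpaceOnly(c)
                    : isTokenCharOriginal(c);
            if (tokenChar) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Contains a tab, a plain space and U+3000 (ideographic space).
        String text = "foo\tbar baz\u3000qux";
        System.out.println(split(text, false).size() + " tokens with Character.isWhitespace");
        System.out.println(split(text, true).size() + " tokens with the space-only test");
    }
}
```

With the sample string, the Character.isWhitespace variant should split on
the tab, the space and U+3000, while the space-only variant splits on the
plain space alone. Note that Character.isWhitespace does not treat the
non-breaking space U+00A0 as whitespace, so neither variant splits there.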



On 5/24/07, Mohammad Norouzi <mnrz57@gmail.com> wrote:
>
> Hi Steven
> Thank you so much for your thorough comments about Analyzer
>
> I wrote that class a couple of months ago, and I have just taken another
> look at my customized Analyzer.
>
> The only change I made is as follows.
>
> The original class has this method:
> protected boolean isTokenChar(char c) {
>     return !Character.isWhitespace(c);
> }
>
> And my class overrides that method like this:
>
> protected boolean isTokenChar(char c) {
>     return c != ' ';  // only U+0020 (decimal 32) separates tokens
> }
>
>
> I think Character.isWhitespace treats some Unicode characters as
> whitespace :)) so everything gets messed up.
>
> what do you think?
>
> --
> Regards,
> Mohammad
> --------------------------
> see my blog: http://brainable.blogspot.com/




-- 
Regards,
Mohammad
--------------------------
see my blog: http://brainable.blogspot.com/
