lucene-java-user mailing list archives

From "Mohammad Norouzi" <>
Subject Re: WhitespaceAnalyzer [was: Re: regaridng Reader.terms()]
Date Thu, 24 May 2007 05:32:43 GMT
Hi Steven,
Thank you so much for your thorough comments about the Analyzer.

I wrote that class a couple of months ago, and I have now taken another
look at my customized Analyzer.

The only change I made is as follows.

The original class has this method:
protected boolean isTokenChar(char c) {
    // any character Java treats as whitespace ends a token
    return !Character.isWhitespace(c);
}

And my class overrides that method like this:

protected boolean isTokenChar(char c) {
    // only the plain space character (code 32) ends a token
    return !((int) c == 32);
}

I think Character.isWhitespace treats some other Unicode characters as
spaces too :)) so the tokens get messed up.
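
To make the difference between the two checks concrete, here is a minimal
standalone sketch. The class name TokenCharDemo and the two helper method
names are made up for illustration and are not part of Lucene; only
Character.isWhitespace is a real JDK call. It prints how each check
classifies a few sample characters.

```java
// Hypothetical demo class, not part of Lucene: compares the two isTokenChar variants.
class TokenCharDemo {

    // Mirrors the method quoted above as the original:
    // every character Java considers whitespace separates tokens.
    static boolean isTokenCharDefault(char c) {
        return !Character.isWhitespace(c);
    }

    // Mirrors the override from this message:
    // only the plain space character (code 32, U+0020) separates tokens.
    static boolean isTokenCharSpaceOnly(char c) {
        return !((int) c == 32);
    }

    public static void main(String[] args) {
        // space, tab, newline, ideographic space, no-break space, a letter
        char[] samples = {' ', '\t', '\n', '\u3000', '\u00A0', 'a'};
        for (char c : samples) {
            System.out.printf("U+%04X default=%b spaceOnly=%b%n",
                    (int) c, isTokenCharDefault(c), isTokenCharSpaceOnly(c));
        }
    }
}
```

With the override, a tab, a newline, or the ideographic space U+3000 stays
inside a token, while the default check splits on all of them. Note that
Character.isWhitespace does not treat the no-break space U+00A0 as
whitespace, so both checks keep it inside a token.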

What do you think?

see my blog:
