lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: CharArraySet makes excessive use of Character.toLowerCase
Date Sun, 28 Dec 2008 15:55:26 GMT
I think all of these changes make sense!
I would just remove that if (false &&...) dead code.

Mike

Shai Erera <serera@gmail.com> wrote:

> Hi
>
> I noticed that CharArraySet uses Character.toLowerCase in many places in
> the code. I think it uses it too much unnecessarily.
> For example, the equals(char[], int, int, char[]) method converts the
> characters to lower case if ignoreCase is true, although it could have been
> converted in one of the appropriate public APIs (like add or contains).
> Moreover, the equals() method may be called several times during the same
> add/contains call, and the value will still be lowercased every time.
>
> The same logic cannot be applied on the CharSequence contains API though,
> for efficiency reasons. The rest (including all the add() methods) can be
> changed to first convert the value to lowercase form (if ignoreCase is
> false) and then simplify getSlot() and equals() by removing the ignoreCase
> check from them.
>
> What do you think? I can prepare a patch if you agree to make this change.
>
> Also, few minor changes we can push through in this patch are:
> One
> ------
> add(Object) looks like this:
>     if (o instanceof char[]) {
>       return add((char[])o);
>     } else if (o instanceof String) {
>       return add((String)o);
>     } else if (o instanceof CharSequence) {
>       return add((CharSequence)o);
>     } else {
>       return add(o.toString());
>     }
> I don't think there's a need to check for instanceof String. Instead it
> could be written simply like this:
>     if (o instanceof char[]) {
>       return add((char[])o);
>     }
>     return (add(o.toString());
> The reason is that add(String) and add(CharSequence) eventually do the
> same. If the identification of CharSequence is still important, it can be
> written like this:
>     if (o instanceof char[]) {
>       return add((char[])o);
>     } else if (o instanceof CharSequence) {
>       return add((CharSequence)o);
>     } else {
>       return add(o.toString());
>     }
>
> Two
> ------
> getHashCode(CharSequence) has this check:
>       if (false && text instanceof String) {
>         code = text.hashCode();
> Why is it there? It will always fail.
>
> Shai
>

Mime
View raw message