lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Sorting case-insensitively
Date Fri, 18 Jul 2008 17:31:42 GMT

: > if you could submit a test case that
	...
: See my e-mail dated July 3, 2008.

Sorry: I meant open a bug (in Jira) and submit a JUnit test case.  I also 
meant something even simpler, so the lowercasing doesn't confuse the issue, 
ie:
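        class IdentitySortComparator extends SortComparator {
            // hands the term text back unchanged, so the custom sort
            // should order docs exactly like a plain String sort
            public Comparable getComparable( String termText ) {
                return termText;
            }
        }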

: > Assuming i'm right, I don't really have any good workaround suggestion
: > for you beyond overriding newComparator() in your SortComparator subclass
: > to explicitly test for null yourself.
: 
: And what do I do if it is null?

whatever you want ... treat null as less than everything else, or treat 
null as greater than everything else .... hmmmmmm ... except (still 
assuming my hunch is right) no matter what you do to make 
yourComparable.compareTo(null) behave well, there's still nothing you can 
do to prevent the case where yourComparable itself might be null.  We're 
back to the same problem of FieldCache not giving people any way to 
specify what the "default" value in the cache should be for docs that have 
no Terms indexed for that field.
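
To make the "test for null yourself" idea concrete, here is a rough sketch 
in plain Java (no Lucene classes; NullSafeSort and its method names are 
made up for illustration).  It shows why the check has to live in the code 
doing the comparing: a null value can't have compareTo() called on it, so 
both sides get tested before compareTo() is ever reached.  Whether you can 
actually hook this in at the right spot in a custom sort is exactly the 
part of my hunch that's unverified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullSafeSort {
    // Treats null as less than everything else; two nulls compare as equal.
    // Both arguments are checked first, so compareTo() is only ever
    // called on (and with) a non-null value.
    static int compareNullsFirst(String a, String b) {
        if (a == null) {
            return (b == null) ? 0 : -1;
        }
        if (b == null) {
            return 1;
        }
        return a.compareTo(b);
    }

    // Returns a sorted copy of the values, nulls first.
    static List<String> sortNullsFirst(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(NullSafeSort::compareNullsFirst);
        return copy;
    }

    public static void main(String[] args) {
        // The null stands in for a doc with no term indexed for the field.
        List<String> values = Arrays.asList("banana", null, "apple");
        System.out.println(sortNullsFirst(values)); // [null, apple, banana]
    }
}
```

Flipping the -1 and 1 gives you "null is greater than everything else" 
instead.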

Bottom line: the fact that the built-in int/float/string/etc sorting works 
even when a doc has no value for a field is really kind of a fluke -- it's 
pretty much impossible to get the same behavior using "Custom" sorting 
(and I can't think of an easy way to fix it without gutting the API).

Even if someone else has a good idea to make it work, I would still 
*strongly* recommend you just index the lowercase form in an alternate 
field -- it's going to be faster, both because the lowercasing is all done 
at index time not at search time, and assuming a moderate overlap of 
field values the StringIndex FieldCache type will probably use less RAM 
than the simple Comparable FieldCache type containing lowercase strings.
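
For a feel of where the RAM difference comes from, here is a toy sketch in 
plain Java.  It is only loosely modeled on the StringIndex idea and is not 
Lucene's actual code; OrdinalSketch and its fields are hypothetical names.  
Each unique value is stored once in sorted order, each doc holds just an 
int pointing at its value, and slot 0 is set aside (my choice for the 
sketch) for docs with no value.  The lowercasing in main() stands in for 
what would happen once at index time.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.TreeSet;

public class OrdinalSketch {
    final String[] lookup; // slot 0 = "no value", then unique values in sorted order
    final int[] order;     // one entry per doc: index into lookup

    OrdinalSketch(String[] perDocValues) {
        // Collect each distinct non-null value once, in sorted order.
        TreeSet<String> unique = new TreeSet<>();
        for (String v : perDocValues) {
            if (v != null) {
                unique.add(v);
            }
        }
        lookup = new String[unique.size() + 1];
        int next = 1; // lookup[0] stays null for docs without a value
        for (String v : unique) {
            lookup[next++] = v;
        }
        // Map every doc to the ordinal of its value (0 if it has none).
        order = new int[perDocValues.length];
        for (int doc = 0; doc < perDocValues.length; doc++) {
            String v = perDocValues[doc];
            order[doc] = (v == null) ? 0 : Arrays.binarySearch(lookup, 1, lookup.length, v);
        }
    }

    // Sorting two docs is an int comparison; no String work at search time.
    int compareDocs(int docA, int docB) {
        return Integer.compare(order[docA], order[docB]);
    }

    public static void main(String[] args) {
        String[] raw = {"Zebra", "apple", null, "zebra"};
        String[] lowered = new String[raw.length];
        for (int i = 0; i < raw.length; i++) {
            // done once up front, the way it would be at index time
            lowered[i] = (raw[i] == null) ? null : raw[i].toLowerCase(Locale.ROOT);
        }
        OrdinalSketch idx = new OrdinalSketch(lowered);
        System.out.println(Arrays.toString(idx.order)); // [2, 1, 0, 2]
    }
}
```

With four docs there are only two stored strings here; the more docs share 
a value, the bigger the saving over keeping one lowercase String per doc.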

-Hoss



