lucene-dev mailing list archives

From Steven Rowe <>
Subject Re: Invalid Sort results when column contains link names of like "*.html,*.xls,*.doc etc."
Date Mon, 02 Apr 2007 14:24:03 GMT
Hi devanadan,

deva_java wrote:
> In my application, I want to sort the index by different column names in
> ascending or descending order, but it doesn't sort correctly.
> For example:
> it works well when the column contains a single word or phrase, but when
> the phrase contains different links like .html or .doc or .xls it
> sorts based on the extension, so I get invalid sorting
> results like "A series in middle part".

This question, one of Lucene usage rather than of development, really
belongs on the java-user list instead of java-dev.

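I'm guessing that the sort field you're using is tokenized. If so, that
is the problem: an analyzer can split a tokenized value into several
terms, while sorting needs a single term per document in the sort field.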

From the API documentation[1]:

    The fields used to determine sort order must be carefully chosen.
    Documents must contain a single term in such a field .... The field
    must be indexed, but should not be tokenized, and does not need to
    be stored (unless you happen to want it back with the rest of your
    document data). In other words:

      document.add(new Field("byNumber", Integer.toString(x),
                   Field.Store.NO, Field.Index.UN_TOKENIZED));

Your options (in order of likely quickest results):

1. Switch the field to be sorted so that it is not tokenized.  Things
will likely "just work" if you do this.

2. If the field you're using must remain tokenized, add a new field that
is a duplicate of the current sort field, but make it untokenized, and
then make this new field the one you sort on.

3. There is an ongoing discussion at issue #252 in JIRA[2] that may be
of interest to you if neither of the above options is feasible.
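
To illustrate why options 1 and 2 help, here is a small plain-Java
sketch. It is a simplified model only, not Lucene code, and it does not
reproduce Lucene's internal sort implementation. The class SortKeyDemo,
its methods, and the sample values are all hypothetical. It assumes that
a tokenized value ends up being ordered by just one of its terms (here,
the last one), and compares that with ordering by the whole,
untokenized value:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Simplified model for illustration; NOT Lucene's implementation.
class SortKeyDemo {

    // Hypothetical stand-in for an analyzer: lower-cases the value and
    // splits it on anything that is not a letter or digit.
    static List<String> tokenize(String value) {
        List<String> terms = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Assumption: with a tokenized field, only one term per value acts
    // as the sort key. This model picks the last term.
    static String lastTerm(String value) {
        List<String> terms = tokenize(value);
        return terms.isEmpty() ? "" : terms.get(terms.size() - 1);
    }

    // Models the untokenized case: the whole value is the single sort term.
    static List<String> sortByWholeValue(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted, String.CASE_INSENSITIVE_ORDER);
        return sorted;
    }

    // Models the tokenized case: values are ordered by one of their terms.
    static List<String> sortByLastTerm(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparing(SortKeyDemo::lastTerm));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
                "C index.html", "A series.xls", "D plan.zip", "B notes.doc");
        System.out.println("whole value: " + sortByWholeValue(values));
        System.out.println("last term:   " + sortByLastTerm(values));
    }
}
```

In this model, sorting by the whole value gives the A, B, C, D order,
while sorting by the last term orders the entries by extension (doc,
html, xls, zip), which moves the "A series.xls" entry into the middle of
the list. An untokenized sort field, whether it replaces the original
field or duplicates it, corresponds to the first case.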



[2] <>
