lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Status: Sorting on tokenized fields
Date Sat, 23 Sep 2006 01:09:34 GMT

: for years there is the discussion to make lucene able to sort on TOKENIZED
: fields.

really? .. i've only been on the list since 1.4.3 but i don't remember it
being much of a recurring topic.

: (e.g. if more then one term is available concatenate the tokens OR use the
: stored value for sorting).

using the stored value doesn't help: there can be multiple stored values
just as easily as there can be multiple tokens.

concatenating the tokens is a vague concept that would be very hard to get
right in a way that would work generically: for starters, how do you deal
with tokens at the same position? (ie: synonyms)

In my experience, the best way to deal with this is for the application
using Lucene to decide which fields it wants to sort on, and make a
"sortable" version of each such field that is indexed but not tokenized --
the application is, after all, in the best position to decide exactly how
it wants to "sort" on the data (ie: should the values be lowercased so the
sort is case-insensitive?  should certain punctuation characters be
stripped out? etc...)
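
As a rough sketch of what building such a "sortable" value might look like
on the application side: the class and method names below (SortKey,
normalize) are hypothetical, and the particular rules (lowercase, drop
ASCII punctuation, collapse whitespace) are just one possible set of
choices, not anything Lucene prescribes.

```java
import java.util.Locale;

// Hypothetical helper: builds the value for a separate "sortable" field.
// The normalization rules here are assumptions; each application should
// pick the rules that match how it wants its results ordered.
final class SortKey {
    private SortKey() {}

    static String normalize(String raw) {
        // Lowercase with a fixed locale so the sort is case-insensitive
        // and does not depend on the JVM's default locale.
        String lowered = raw.toLowerCase(Locale.ROOT);
        // Strip ASCII punctuation characters.
        String stripped = lowered.replaceAll("\\p{Punct}", "");
        // Trim and collapse runs of whitespace left behind by the stripping.
        return stripped.trim().replaceAll("\\s+", " ");
    }
}
```

The application would then add the result of SortKey.normalize(title) to
each document as one extra field (say "title_sort") that is indexed
without being tokenized, keep exactly one such value per document, and
sort on that field instead of the tokenized one.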

