lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Sorting
Date Sun, 30 Jul 2006 01:21:44 GMT

: thanks a lot for you helpfull answers :-)) I think I will try i like karl
: suggest, cause i have to update the index every day :-))

All of the suggestions so far assume that you have some way of mapping
each document to a number that indicates it's relative position in the
total space of ordered documents -- which means that if you are updating
the index on a daily basis, you are going to need some serious juggling to
make sure that when a new document comes along, you figure out the "right"
integer to put it into the correct order -- not to mention you have no
solution for the day when you discover that "banana" is in your index with
a sort value of 12345 and "bandana" is in your index with sort value 12346
-- and now you want to add "banco" and you don't have any room for it.

A bigger problem is the fact that sorting by an int field takes
just as much time as sorting by a String field -- because Lucene's sorting
code is already doing the String->int mapping for you and putting it into
a FieldCache.  The only real difference between sorting on an int field
and a String field is how much RAM that FieldCache uses (typically more
for Strings)
(NOTE: there are some caveats to this when dealing with MultiSearcher, but
you didn't mention using a MultiSearcher, so i'm glossing over that)

What *does* take more time when dealing with String fields is building the
FieldCache (because sorting a bunch of strings tends to take longer then
sorting a bunch of ints) ... but this Cache will only be built up the
first time you sort on that field for a given IndexReader/IndexSearcher
... as long as you keep reusing the same IndexSearcher, things should be
"fast".

Without confirmation to the contrary, i'm guessing you aren't reusing the
same IndexSearcher, and that's why it seems like sorting on Strings is
slow.





-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message