lucene-dev mailing list archives

From Chris Hostetter
Subject Re: Large scale sorting
Date Wed, 11 Apr 2007 07:12:23 GMT
: I'm wondering then if the Sorting infrastructure could be refactored
: to allow  with some sort of policy/strategy where one can choose a
: point where one is not willing to use memory for sorting, but willing


: To accomplish this would require a substantial change to the
: FieldSortHitQueue et al, and I realize that the use of NIO

I don't follow ... why couldn't this be implemented entirely via a new
SortComparatorSource?  (You would also need something to create your file,
but that could probably be done as a decorator or subclass of IndexWriter,
couldn't it?)
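
To make the file-backed idea concrete, here is a minimal sketch of the
comparison side only. It assumes one int sort key per document has been
written to a side file at index time, and it memory-maps that file with
java.nio instead of holding the keys on the heap. FileBackedSortKeys and
its methods are hypothetical names, not Lucene classes; the assumption is
that a custom SortComparatorSource would hand out a comparator that
delegates to something like compare() below. The syntax is modern Java for
brevity, although java.nio itself dates from 1.4.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: one int sort key per document, kept in a mapped file. */
final class FileBackedSortKeys {
    private final IntBuffer keys;

    private FileBackedSortKeys(IntBuffer keys) {
        this.keys = keys;
    }

    /** Stand-in for the index-time step: stores keysByDoc[docId] as 4-byte ints. */
    static void write(Path file, int[] keysByDoc) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.allocate(keysByDoc.length * 4);
            // The int view has its own position, so buf still spans every byte.
            buf.asIntBuffer().put(keysByDoc);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps the key file read-only; the OS pages it in on demand. */
    static FileBackedSortKeys open(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return new FileBackedSortKeys(
                    ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()).asIntBuffer());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Orders two documents by their stored keys, ascending. */
    int compare(int docA, int docB) {
        return Integer.compare(keys.get(docA), keys.get(docB));
    }
}
```

With keys {30, 10, 20} written for documents 0, 1 and 2, compare(0, 1) is
positive and compare(1, 2) is negative, so the hits would come back in the
order 1, 2, 0. String or locale-sensitive keys would need a different file
layout (for example, precomputed ordinals), which this sketch does not
attempt.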

: immediately pins Lucene to Java 1.4, so I'm sure this is
: controversial.  But, if we wish Lucene to go beyond where it is now,

Java 1.5 is controversial; Lucene already has 1.4 dependencies.

: I think we need to start thinking about this particular problem
: sooner rather than later.

It depends on your timeline; Lucene's gotten pretty far with what it's
got.  Personally I'm banking on RAM getting cheaper fast enough that I
won't ever need to worry about this.

If I needed to support sorting on lots of fields with lots of different
locales, and my index was big enough that I couldn't feasibly keep all of
the FieldCaches in memory on one box, I wouldn't partition the index
across multiple boxes and merge results with a MultiSearcher ... I'd clone
the index across multiple boxes and partition the traffic based on the
field/locale it's searching on.

It's a question of cache management: if I know I have two very different
use cases for a Solr index, I partition those use cases onto separate tiers
of machines to get better cache utilization.  FieldCache is
just another type of cache.
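
A rough sketch of that kind of traffic partitioning, assuming every box
holds a full clone of the index and a front end picks the tier from the
sort field and locale of each request. SortTierRouter, the host names, and
the round-robin choice within a tier are all illustrative assumptions, not
part of Lucene or Solr.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical front-end router: one tier of cloned index boxes per field/locale. */
final class SortTierRouter {
    private final Map<String, List<String>> tiers = new HashMap<>();
    private final List<String> defaultTier;

    SortTierRouter(List<String> defaultTier) {
        this.defaultTier = defaultTier;
    }

    /** Dedicates a tier of hosts to one sort field and locale. */
    void assign(String field, String locale, List<String> hosts) {
        tiers.put(field + "/" + locale, hosts);
    }

    /**
     * Picks a host for a request. Requests sorting on the same field/locale
     * always land on the same tier, so only that tier warms that FieldCache.
     */
    String route(String field, String locale, int requestId) {
        List<String> hosts = tiers.getOrDefault(field + "/" + locale, defaultTier);
        return hosts.get(Math.floorMod(requestId, hosts.size()));
    }
}
```

Each tier then only ever populates the FieldCache entries for the
field/locale combinations routed to it, rather than every box paying the
memory cost for all of them.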

