lucene-solr-dev mailing list archives

From Mark Miller
Subject Re: Sort suggestion
Date Tue, 29 Jul 2008 19:17:39 GMT
I think you'll find that adding disk seeks to the sort on every search is
slow. Something you might be able to work from, though (I doubt it still
applies cleanly), is Hoss' issue. It allows for a pluggable cache
implementation for sorting, and also allows for much faster reopening in
most cases. It hasn't seen any activity, and I think they are looking to get
the reopen gains elsewhere, but it may be worth playing with.

- Mark

Marcus Herou wrote:
> Guys.
> I've noticed many people having trouble with sorting and OOM. Eventually
> they solve it by throwing more memory at the problem.
> Shouldn't a solution which can sort on disk when necessary be implemented
> in core Lucene?
> Something like this:
> Since you obviously know the result size, you can calculate how much memory
> is needed for the sort. If the calculated value is higher than a
> configurable threshold, an external on-disk sort is performed, and perhaps a
> message is logged at WARN level.
> Just a thought, since I'm about to implement something which could sort any
> Comparable object, but on disk.
> I guess the Hadoop project has the perfect tools for this, since all the
> mapred input files are sorted, on disk, and huge.
> Kindly
> //Marcus
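
For illustration, the threshold idea in the quoted message could be sketched
roughly as below. This is not Lucene code and not Marcus's implementation.
The class ThresholdSort, its methods, the 40-byte per-item overhead, and the
runSize parameter are all hypothetical. The sketch sorts Strings rather than
arbitrary Comparable objects, assumes no item contains a line break, and
collects the merged result in a list only to keep the example short (a real
version would stream the output).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.logging.Logger;

public class ThresholdSort {
    private static final Logger LOG = Logger.getLogger(ThresholdSort.class.getName());

    /** Head of one sorted run during the merge, ordered by its current value. */
    private static final class Head implements Comparable<Head> {
        final String value;
        final BufferedReader reader;
        Head(String value, BufferedReader reader) { this.value = value; this.reader = reader; }
        @Override public int compareTo(Head other) { return value.compareTo(other.value); }
    }

    /** Crude estimate (an assumption): 2 bytes per char plus 40 bytes of overhead per item. */
    public static long estimateBytes(List<String> items) {
        long total = 0;
        for (String s : items) total += 40 + 2L * s.length();
        return total;
    }

    /** Sorts in memory when the estimate fits under the threshold, otherwise on disk. */
    public static List<String> sort(List<String> items, long thresholdBytes, int runSize)
            throws IOException {
        if (estimateBytes(items) <= thresholdBytes) {
            List<String> copy = new ArrayList<>(items);
            Collections.sort(copy);
            return copy;
        }
        LOG.warning("Estimated sort size exceeds threshold; falling back to on-disk sort");
        return externalSort(items, runSize);
    }

    /** External merge sort: write sorted runs to temp files, then k-way merge them. */
    public static List<String> externalSort(List<String> items, int runSize) throws IOException {
        if (runSize <= 0) throw new IllegalArgumentException("runSize must be positive");
        List<Path> runs = new ArrayList<>();
        List<BufferedReader> readers = new ArrayList<>();
        List<String> out = new ArrayList<>(items.size());
        try {
            // Phase 1: sort each run in memory and spill it to its own temp file.
            for (int start = 0; start < items.size(); start += runSize) {
                int end = Math.min(start + runSize, items.size());
                List<String> run = new ArrayList<>(items.subList(start, end));
                Collections.sort(run);
                Path file = Files.createTempFile("sort-run", ".txt");
                runs.add(file);
                Files.write(file, run, StandardCharsets.UTF_8);
            }
            // Phase 2: merge the runs, keeping only one line per run in the heap.
            PriorityQueue<Head> heap = new PriorityQueue<>();
            for (Path run : runs) {
                BufferedReader reader = Files.newBufferedReader(run, StandardCharsets.UTF_8);
                readers.add(reader);
                String first = reader.readLine();
                if (first != null) heap.add(new Head(first, reader));
            }
            while (!heap.isEmpty()) {
                Head smallest = heap.poll();
                out.add(smallest.value);
                String next = smallest.reader.readLine();
                if (next != null) heap.add(new Head(next, smallest.reader));
            }
        } finally {
            for (BufferedReader reader : readers) reader.close();
            for (Path run : runs) Files.deleteIfExists(run);
        }
        return out;
    }
}
```

Note that the merge phase does sequential reads only. The per-search disk
cost Mark warns about would still apply if a sort like this ran on every
query.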
