lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Sorting consumes hundreds of MBytes RAM
Date Mon, 14 Apr 2008 22:27:57 GMT

: How does this work internally? It seems as if all data for this field found in 
: the entire index is read into memory (?).

You can think of it as an "inverted-inverted index".  Lucene needs a data 
structure it can use for fast lookups where the key is the docId and the 
value is something "comparable" for sorting the documents.
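
To make the "inverted-inverted index" idea concrete, here is a minimal sketch in plain Java. It is not Lucene's FieldCache code: the class name, the method names and the toy postings map are all made up for illustration. It only shows the shape of the structure, an array with one slot per document in the index, which is why memory grows with index size rather than with the number of hits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain Java, not Lucene's actual FieldCache implementation.
class FieldCacheSketch {

    // "Un-invert" a term -> docIds postings map into an array indexed by docId.
    // The array has one slot per document in the index, whether or not the
    // document ever matches a query.
    static String[] uninvert(Map<String, int[]> postings, int maxDoc) {
        String[] valueByDoc = new String[maxDoc];
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int docId : entry.getValue()) {
                valueByDoc[docId] = entry.getKey();
            }
        }
        return valueByDoc;
    }

    // Sort the matching docIds by their cached value; every comparison is
    // just two array lookups. Assumes every hit has a value for the field.
    static List<Integer> sortHits(List<Integer> hits, String[] valueByDoc) {
        List<Integer> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparing((Integer docId) -> valueByDoc[docId]));
        return sorted;
    }

    public static void main(String[] args) {
        // Toy inverted index for one field: term -> docIds containing it.
        Map<String, int[]> postings = new TreeMap<>();
        postings.put("apple", new int[] {2});
        postings.put("banana", new int[] {0, 3});
        postings.put("cherry", new int[] {1});

        String[] cache = uninvert(postings, 4);
        System.out.println(sortHits(Arrays.asList(3, 1, 2), cache)); // prints [2, 3, 1]
    }
}
```

The lookups are fast because the whole array is built up front and kept in memory, and that same array is where the RAM goes.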

: And question #2: what am I going to do against it? Index sharding?

The only suggestion i can offer is to take a look at LUCENE-769 ... it 
takes a completely different approach of using a FieldSelector to access 
the *stored* field and sort on it ... the memory usage of FieldCache is 
eliminated at the expense of longer search times ... in cases where you 
expect queries to match on a very small subset of the total index, it 
could be worth using.
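
The trade-off can be sketched in plain Java as well. This is not the code from the LUCENE-769 patch, and it does not use FieldSelector: the class name and the loadStoredValue callback are hypothetical stand-ins for "read one stored field value for one document". The point is only that values are loaded per hit, so memory scales with the number of matches while every search pays the cost of the loads.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Illustrative only: not the LUCENE-769 patch, just the general idea of
// sorting on stored values that are loaded per hit.
class StoredFieldSortSketch {

    // loadStoredValue is a hypothetical stand-in for reading the stored
    // field of a single document (in a real index, typically a disk read).
    static List<Integer> sortHits(List<Integer> hits, IntFunction<String> loadStoredValue) {
        // Load each hit's value exactly once; nothing is kept for
        // documents that did not match, and nothing survives the search.
        Map<Integer, String> valueByHit = new HashMap<>();
        for (int docId : hits) {
            valueByHit.put(docId, loadStoredValue.apply(docId));
        }
        List<Integer> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparing((Integer docId) -> valueByHit.get(docId)));
        return sorted;
    }
}
```

With three hits out of millions of documents this does three loads and holds three values; with millions of hits it does millions of loads on every search, which is where the longer search times come from.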

If people try out the patch and like it and report back success with it, 
it's more likely to get committed at some point.  (although at this point, 
i'm starting to suspect "column stride fields" is the wave of the future 
for stuff like this ... see LUCENE-1231 for more details, but at this 
point it's totally theoretical)



-Hoss

