lucene-solr-user mailing list archives

From Lance Norskog <>
Subject Re: Memory usage
Date Thu, 30 Sep 2010 20:18:25 GMT
You can also sort on a field by using a function query instead of the
"sort=field+desc" parameter. This will not eat up memory; instead, it
will be slower. In short, it is a classic speed vs. space trade-off.

You'll have to benchmark and decide which you want, and maybe some
fields need the fast sort and some can get away with the slow one.
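A minimal sketch of the two query styles contrasted above, assuming a Solr instance at the default localhost:8983 endpoint and a hypothetical `priority_sort_for_42` integer field. The function-query form ranks by the field value as the relevance score rather than using a conventional field sort:

```python
from urllib.parse import urlencode

# Hypothetical endpoint and field name -- adjust for your own schema.
base = "http://localhost:8983/solr/select"

# Conventional field sort: populates a per-field value array in RAM.
field_sort = urlencode({"q": "*:*", "sort": "priority_sort_for_42 desc"})

# Function-query alternative: use the field value as the score and sort
# by score, trading memory for per-document work at query time.
func_sort = urlencode({"q": "{!func}field(priority_sort_for_42)",
                       "sort": "score desc"})

print(base + "?" + field_sort)
print(base + "?" + func_sort)
```

Which one wins depends on the benchmark mentioned above: the field sort pays once in heap, the function query pays on every request.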

On Thu, Sep 30, 2010 at 11:47 AM, Jeff Moss <> wrote:
> I think you've probably nailed it, Chris; thanks for that. I think I can get
> by with a different approach than this.
> Do you know if I will get the same memory consumption using the
> RandomFieldType vs the TrieInt?
> -Jeff
> On Thu, Sep 30, 2010 at 12:36 PM, Chris Hostetter
> <>wrote:
>> : There are 14,696,502 documents, we are doing a lot of funky stuff but I'm
>> : not sure which is most likely to cause an impact. We're sorting on a
>> dynamic
>> : field there are about 1000 different variants of this field that look
>> like
>> : "priority_sort_for_<client_id>", which is an integer field. I've heard
>> that
>> : sorting can have a big impact on memory consumption, could that be it?
>> sorting on a field requires that an array of the corresponding type be
>> constructed for that field - the size of the array is maxDoc
>> (i.e., the number of documents in your index, including deleted documents).
>> If you are using TrieInts, and have an index with no deletions, sorting
>> ~14.7Mil docs on 1000 diff int fields will take up about ~55GB.
>> That's a minimum just for the sorting of those int fields (SortableIntField,
>> which keeps a string version of the field value, will be significantly
>> bigger) and doesn't take into consideration any other data structures used
>> for searching.
>> I'm not a GC expert, but based on my limited understanding your graph
>> actually seems fine to me .. particularly the part where it says
>> you've configured a max heap of ~122GB of RAM, and it's
>> never spent any time doing ConcurrentMarkSweep.  My uneducated
>> understanding of those two numbers is that you've told the JVM it can use
>> an ungodly amount of RAM, so it is.  It's done some basic cleanup of
>> young gen (ParNew) but because the heap size has never gone above 50GB,
>> it hasn't found any reason to actually start a CMS GC to look for dead
>> objects in Old Gen that it can clean up.
>> (Can someone who understands GC and JVM tuning better than me please
>> sanity check me on that?)
>> -Hoss
>> --
>>  ...  October 7-8, Boston
>>      ...  Stump The Chump!
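The ~55GB estimate in the quoted message is straightforward to check: sorting needs one array entry per document in maxDoc, for every distinct sort field. A quick back-of-envelope calculation using the numbers from the thread (the 4-byte entry size assumes a plain 32-bit int cache array per TrieInt field):

```python
max_doc = 14_696_502       # documents in the index, including deletions
num_sort_fields = 1_000    # distinct priority_sort_for_<client_id> fields
bytes_per_entry = 4        # one 32-bit int per document per sorted field

total_bytes = max_doc * num_sort_fields * bytes_per_entry
print(f"{total_bytes / 2**30:.1f} GiB")  # ~54.7 GiB, matching the ~55GB figure
```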

Lance Norskog
