lucene-java-user mailing list archives

From Lance Norskog <>
Subject Re: Right memory for search application
Date Wed, 28 Apr 2010 03:42:41 GMT
Solr's timestamp representation (TrieDateField) is tuned for space and
speed. It has a compressed representation, and sorting on it uses far
less memory than sorting on Strings.
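
To illustrate why a numeric representation is cheaper to sort, here is a
minimal sketch that turns a yyyymmddHHMMSS string into a single long while
preserving its order. This is not TrieDateField's internal encoding; the
class and method names are made up for the example.

```java
// Hypothetical helper, not part of Solr or Lucene.
final class TimestampKey {

    // Converts a 14-digit yyyymmddHHMMSS string into a long.
    // Fixed-width digit strings compare the same way lexicographically
    // and numerically, so sort order is preserved.
    static long toLong(String ts) {
        if (ts == null || ts.length() != 14) {
            throw new IllegalArgumentException("expected 14 digits: " + ts);
        }
        for (int i = 0; i < ts.length(); i++) {
            char c = ts.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("expected 14 digits: " + ts);
            }
        }
        return Long.parseLong(ts);
    }

    public static void main(String[] args) {
        String a = "20100427130200";
        String b = "20100428034241";
        // Both comparisons agree on which timestamp is earlier.
        System.out.println(a.compareTo(b) < 0);     // String comparison
        System.out.println(toLong(a) < toLong(b));  // numeric comparison
    }
}
```

One long per document is 8 bytes, against a separate String object per
document for the string form.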

Also you get something called a date facet, which lets you bucketize
facet searches by time block.
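
As a rough picture of what bucketing by time block means, the sketch below
counts documents per day from their yyyymmddHHMMSS timestamps. It only
mimics the idea in plain Java and is not how Solr computes date facets;
the class name and sample values are invented for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of per-day buckets; not Solr code.
final class DayBuckets {

    // Counts timestamps per calendar day, using the leading yyyymmdd part.
    static Map<LocalDate, Integer> countByDay(List<String> timestamps) {
        Map<LocalDate, Integer> counts = new TreeMap<>();
        for (String ts : timestamps) {
            LocalDate day = LocalDate.parse(ts.substring(0, 8),
                                            DateTimeFormatter.BASIC_ISO_DATE);
            counts.merge(day, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "20100427130200", "20100427235959", "20100428034241");
        System.out.println(countByDay(sample));
    }
}
```

In Solr the block size (day, hour, month and so on) is chosen per request
rather than hard-coded as it is here.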

On Tue, Apr 27, 2010 at 1:02 PM, Toke Eskildsen <> wrote:
> Samarendra Pratap [] wrote:
>> 1. Our default option is sort by score, however almost 8% of searches use
>> sorting on a field (yyyymmddHHMMSS). This field is indexed as string (not as
>> NumericField or DateField).
> Guessing that the timestamp is practically unique for each document, sorting
> by String takes up a bit more than
> 18M * (40 bytes + 2 * "yyyymmddHHMMSS".length() bytes) ~= 1.2 GB of RAM as
> the Strings are cached. Coupled with the normal overhead of just opening an
> index of your size (500MB by your measurements?), I would have guessed that
> 3600MB would definitely be enough to open the index and do sorted searches.
> I realize that fiddling with production servers is dangerous, but connecting
> with JConsole and forcing a garbage collection might be acceptable? That
> should enable you to determine whether you're leaking memory or if it's just
> the JVM being greedy. I'd guess you're leaking though, as HotSpot does not
> normally allocate up to the limit if it does not need to.
> Anyway, changing to one of the optimized fields for sorting dates should
> shave 1 GB off the memory requirement, so I'll recommend doing that no matter
> what the main cause of your memory problems is.
> Regards,
> Toke Eskildsen
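
The estimate quoted above can be checked with a few lines of arithmetic.
The 40-byte per-String overhead and the 18M document count come from
Toke's message. The 8 bytes per document for a numeric sort cache is an
assumption (one long per document), and the class name is made up.

```java
// Hypothetical back-of-the-envelope calculator; not Lucene code.
final class SortCacheEstimate {

    // Bytes for caching one String per document: a fixed per-String
    // overhead plus 2 bytes per character.
    static long stringCacheBytes(long docs, int overheadBytes, int chars) {
        return docs * (overheadBytes + 2L * chars);
    }

    // Bytes for caching one long per document (assumed 8 bytes each).
    static long longCacheBytes(long docs) {
        return docs * 8L;
    }

    public static void main(String[] args) {
        long docs = 18_000_000L;
        int chars = "yyyymmddHHMMSS".length(); // 14 characters
        long asStrings = stringCacheBytes(docs, 40, chars);
        long asLongs = longCacheBytes(docs);
        System.out.println("String cache: " + asStrings + " bytes");
        System.out.println("long cache:   " + asLongs + " bytes");
        System.out.println("difference:   " + (asStrings - asLongs) + " bytes");
    }
}
```

Under these assumptions the String cache comes to 1,224,000,000 bytes and
the long cache to 144,000,000 bytes, a difference of about 1.08 GB, which
is in line with the "shave 1 GB off" figure.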

Lance Norskog
