lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <>
Subject Re: is this the right way to go?
Date Wed, 09 Jun 2010 22:41:41 GMT
In addition to Ian's comment, an important question is what
kind of values you're sorting on. It sounds like a time stamp,
because most languages only have a (relatively) small number
of terms.

It's not the total terms in the field, it's the total *unique* terms
in the field. So even with a very large text field in, say, English,
there are only around 300,000 terms and that will be the max
that are loaded into memory.

So, as always, details matter <G>.


On Wed, Jun 9, 2010 at 4:22 PM, fujian <> wrote:

> Hello,
> We are using lucene 2.9.0. and ran into OutOfMemory error when sorting on a
> highly unique field on a big index. After doing some research we learned
> that lucene will load the sort field value for all documents into memory to
> do sorting, and ended up with the OutOfMemory if the index is too big.
> The problem is no matter how much heap size you allocated for jvm, you will
> run into this problem eventually when your index is big enough and the sort
> field value is unique enough regardless how small amount of documents
> actually match the query ...
> This is a bit ... annoying. I only want to sort the 10 docs that match the
> query, why I need to load millions of field values into memory?
> Then we are thinking to disable the sort from lucene search and do the
> sorting ourselves. that is, after getting all matching docs from
> search(since we know the matching docs won't be too much), we do the
> sorting
> "manually". This requires us write our own sorting methods and it needs to
> support multiple field sort.
> so my question is is this the right way to go? or maybe lucene/solr already
> has the similiar thing?
> Thank you very much,
> -Fujian
> --
> View this message in context:
> Sent from the Lucene - Java Users mailing list archive at
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message