lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Craig W Conway <>
Subject Re: Out of memory exception for big indexes
Date Tue, 17 Apr 2007 22:11:54 GMT
I wanted to reply on this because it seems that a few people have had similar problems and
I am very pleased with the simple solution. 

Basically I re-created the index using OMIT NORMS where ever possible, as well as Field.Store.NO
where ever possible. The result shrunk my index from 19GB to 4GB and eliminated my OOM errors.
Performance is much better as well. My queries are still sorted and complex. I have ~4mil
docs indexed.

Thank you Lucene community!

----- Original Message ----
From: Chris Hostetter <>
Sent: Friday, April 6, 2007 11:00:53 AM
Subject: Re: Out of memory exception for big indexes

: Would it be fair to say that you can expect OutOfMemory errors if you
: run complex queries? ie sorts, boosts, weights...

not intrinsicly ... the amount of memory used has more to do with the size
of hte index and the sorting done then it does with teh number of clauses
in your query (of course, having more complex clauses - like
FunctionQueries that use FieldCaches) can use more memory.

: +(pathNodeId_2976569:1^5.0 pathNodeId_2976969:1 pathNodeId_2976255:1 pathNodeId_2976571:1)
+(pathClassId:1 pathClassId:346 pathClassId:314) -id:369

...the big thing that jumps out at me here is that you seem to be using
very dynamic field names for storing boolean values ... have your set the
OMIT_NORMS option on all of those fields? ... having norms makes your
index bigger, and contributes to the memory usage at query time 9and
doesn't add any benefit to a query like this )

: java.lang.OutOfMemoryError: Java heap space
: Dumping heap to java_pid4512.hprof ...
: Heap dump file created [71421503 bytes in 2.640 secs]
: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
:         at org.apache.lucene.index.MultiReader.norms(

...that could indicate that norms are your problem, or you could already
have norms turned off on all but one field, and it's just the straw that
breaks the camels back ... looking at a visualization of your heap is the
only thing that's really going to tell you what is taking up all your ram.


To unsubscribe, e-mail:
For additional commands, e-mail:

Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message