lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: OutofMemory in large index
Date Sat, 14 Nov 2009 02:28:35 GMT
Hello,

 
Comments inlined.


----- Original Message ----
> From: vsevel <v.sevel@lombardodier.com>
> To: java-user@lucene.apache.org
> Sent: Fri, November 13, 2009 11:32:02 AM
> Subject: Re: OutofMemory in large index
> 
> 
> Hi, I am jumping into the thread because I have got a similar issue.
> My index is 30Gb large and contains 21M docs.
> I was able to stay with 1Gb of RAM on the server for a while. Recently I

Is that 1GB heap or 1GB RAM?

> started to simulate parallel searches. Just 2 parallel searches would get
> the server to crash with out of memory errors. I upgraded the server to 3Gb
> of RAM and I was able to run happily 10 parallel full text searches on my
> documents.
> My questions:
> - is 3Gb a relatively normal amount of memory for a server doing lucene
> searches?

These days 3GB of RAM is very little even for a laptop. :)

> - when is that going to stop? I am planning to have at least 40M docs in my
> index. will I need to go from 2.5 to 5Gb of RAM? what about 60M docs? what
> about 20 concurrent searches?

The more load you put on the machine, the more resources it needs.  The more resource-intensive
the queries (e.g. sorting, fuzzy, or wildcard queries), the more resources they need.

One Lucene/Solr deployment I looked at today has a 20GB index with fewer than 5M documents,
none of them very large, but it serves a high query rate with relatively expensive queries.
Each of its 10 servers has 8 cores, and those cores were only about 30% idle.  This is just
an example; each case is different.

> - are there any safety mechanisms that would get a search to abort rather
> than make the server crash with out of memory?

I don't think so.  When an app hits OOM, I think it doesn't have much control over its destiny.
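
That said, an application can lower the odds of getting there by refusing work before the heap is exhausted. The sketch below is only an illustration under stated assumptions: SearchGate, its constructor parameters, and the 5-second wait are hypothetical names and values, not part of Lucene. It caps the number of concurrent searches with a semaphore and rejects a search when the estimated free heap is below a threshold. The heap estimate is approximate, and this cannot stop a single search that allocates too much once it is running.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical admission gate for searches; not a Lucene class. */
class SearchGate {
    private final Semaphore permits;   // caps concurrent searches
    private final long minFreeBytes;   // heap headroom required to start a search

    SearchGate(int maxConcurrent, long minFreeBytes) {
        this.permits = new Semaphore(maxConcurrent);
        this.minFreeBytes = minFreeBytes;
    }

    /** Rough estimate of the heap the JVM could still hand out. */
    static long availableHeap() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /** Runs the search if a permit and enough heap are available, else rejects it. */
    <T> T run(Callable<T> search) throws Exception {
        // Wait a bounded time for a slot instead of piling up requests.
        if (!permits.tryAcquire(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("too many concurrent searches");
        }
        try {
            if (availableHeap() < minFreeBytes) {
                throw new IllegalStateException("not enough free heap to start a search");
            }
            return search.call();
        } finally {
            permits.release();
        }
    }
}
```

A caller would wrap each search in gate.run(...) and turn the IllegalStateException into a "server busy" response. Sensible values for the permit count and the threshold depend on the index and the queries, so they would have to be found by load testing.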

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> Simon Willnauer wrote:
> > 
> > On Fri, Nov 13, 2009 at 11:17 AM, Ian Lea wrote:
> >>> I got OutOfMemoryError at
> >>> org.apache.lucene.search.Searcher.search(Searcher.java:183)
> >>> My index is 43G bytes.  Is that too big for Lucene ?
> >>> Luke can see the index has over 1800M docs, but the search is also out
> >>> of memory.
> >>> I use -Xmx1024M to specify 1G java heap space.
> >>
> >> 43Gb is not too big for lucene, but it certainly isn't small and that
> >> is a lot of docs.  Just give it more memory.
> > I would strongly recommend to give it more memory, what version of
> > lucene do you use? Depending on your setup you could run into a JVM
> > bug if you use a lucene version < 2.9. Your index is big enough
> > (document-wise) that your norms file grows > 100MB; depending on your
> > Xmx settings this could trigger a false OOM during index open. So if
> > you are using < 2.9 check out this issue
> > https://issues.apache.org/jira/browse/LUCENE-1566
> > 
> > 
> >>
> >>> One abnormal thing is that I broke a running optimize of this index.
> >>> Can that be a problem?
> >>
> >> Possibly ...
> > In general, this should not be a problem. The optimize will not
> > destroy the index you are optimizing as segments are write once.
> >>
> >>> If so, how can I fix an index after the optimize process is broken?
> >>
> >> Probably depends on what you mean by broken.  Start with running
> >> org.apache.lucene.index.CheckIndex.  That can also fix some things -
> >> but see the warning in the javadocs.
> > 100% recommended to make sure nothing is wrong! :)
> >>
> >>
> >> --
> >> Ian.
> > 
> 
> -- 
> View this message in context: 
> http://old.nabble.com/OutofMemory-in-large-index-tp26332397p26339388.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

