lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yonik Seeley <ysee...@gmail.com>
Subject Re: Any problems with a failed IndexWriter optimize call?
Date Mon, 01 Aug 2005 19:42:51 GMT
If all segments were flushed to the disk (no adds since the last time
the index writer was opened), then it seems like the index should be
fine.

The big question I have is what happens when there are in-memory
segments in the case of an OOM exception during an optimize?  Is data
loss possible?

-Yonik


On 8/1/05, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> 
> If i remember correctly, what you'll find when you remove the lock file is
> that your index is still usable, and from the perspective of new
> IndexWriter/IndexReaders it's in the same state it was prior to the call
> to optimize, but from the perspective of an external observer, the index
> directory will contain a bunch of garbage files from the aborted optimize.
> 
> At my work, we've taken the "safe" attitude that if you get an OOM
> exception, you should assume your index is corrupted, and rebuild from
> scratch for safety -- but i think it's safe to cleanup up the garbage
> files manually.
> 
> which brings up something i ment to ask a while back:  Has anyone written
> any index cleaning code?  that locks an index (using the public API) and
> then inspects the file (using the API, or using low level knowledge of hte
> file structure) to generate a list of 'garbage' files in the index
> directory that should be safely deletable?
> 
> (I considered writing this a few months ago, but then our "play it safe,
> treat it as corrupt" policy came out, and it wasn't all that neccessary
> for me)
> 
> It seems like it might be a handy addition to the sandbox.
> 
> 
> : Date: Sun, 31 Jul 2005 23:20:36 -0400
> : From: Robert Schultz <robert@cosmicrealms.com>
> : Reply-To: java-user@lucene.apache.org
> : To: java-user@lucene.apache.org
> : Subject: Any problems with a failed IndexWriter optimize call?
> :
> : Hello! I am using Lucene 1.4.3
> :
> : I'm building a Lucene index, that will have about 25 million documents
> : when it is done.
> : I'm adding 250,000 at a time.
> :
> : Currently there is about 1.2Million in there, and I ran into a problem.
> : After I had added a batch of 250,000 I go a 'java.lang.outOfMemory'
> : threw by writer.optimize(); (a standard IndexWriter)
> :
> : The exception caused my program to quit out, and it didn't call
> : 'writer.close();'
> :
> : First, with it dying in the middle of an .optimize() is there any chance
> : my index is corrupted?
> :
> : Second, I know I can remove the /tmp/lucene*.lock file to remove the
> : lock in order to add more, but is it safe to do that?
> :
> : I've since figured out that I can pass -Xmx to the 'java' program in
> : order to increase the maximum amount of RAM.
> : It was using the default of 64M, I plan on increasing that to 175M to
> : start with.
> : That should solve the memory problems (I can allocate more if necessary
> : down the line).
> :
> : Lastly, when I go back, open it again, and add another 250,000 and then
> : call optimize again, will a failed previous optimize hurt the index at all?
> :
> :
> :
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> : For additional commands, e-mail: java-user-help@lucene.apache.org
> :
> 
> 
> 
> -Hoss
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message