lucene-java-user mailing list archives

From Robert Schultz <rob...@cosmicrealms.com>
Subject Re: Any problems with a failed IndexWriter optimize call?
Date Mon, 01 Aug 2005 20:12:40 GMT
I am going to play it safe.
I'm going to wipe the index files and start over (I've only put about 2 days
of processing time into it so far).

This time I'll run with a maximum heap size of over 512MB.
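
Before kicking off another multi-day run, it may be worth confirming that the -Xmx setting actually reached the JVM. Below is a minimal sketch of such a check. The class name HeapCheck and its method names are made up for illustration; the only library call it relies on is the standard Runtime.getRuntime().maxMemory().

```java
public class HeapCheck {
    /** Returns the JVM's maximum heap size in whole megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** True if the reported heap limit is at least the requested size in MB. */
    static boolean hasAtLeastMb(long requiredMb) {
        return maxHeapMb() >= requiredMb;
    }

    public static void main(String[] args) {
        // Required size in MB; defaults to 512 if no argument is given.
        long required = args.length > 0 ? Long.parseLong(args[0]) : 512L;
        System.out.println("Max heap: " + maxHeapMb() + " MB");
        if (!hasAtLeastMb(required)) {
            System.err.println("Heap limit is below " + required
                    + " MB; restart with a larger -Xmx value");
            System.exit(1);
        }
    }
}
```

Run it with the same options as the indexer, for example `java -Xmx600m HeapCheck 512`. Some JVMs report slightly less than the -Xmx value from maxMemory(), so it is safer to leave some headroom over the threshold being checked.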

Yonik Seeley wrote:
> If all segments were flushed to the disk (no adds since the last time
> the index writer was opened), then it seems like the index should be
> fine.
> 
> The big question I have is what happens when there are in-memory
> segments in the case of an OOM exception during an optimize?  Is data
> loss possible?
> 
> -Yonik
> 
> 
> On 8/1/05, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> 
>>If i remember correctly, what you'll find when you remove the lock file is
>>that your index is still usable, and from the perspective of new
>>IndexWriter/IndexReaders it's in the same state it was prior to the call
>>to optimize, but from the perspective of an external observer, the index
>>directory will contain a bunch of garbage files from the aborted optimize.
>>
>>At my work, we've taken the "safe" attitude that if you get an OOM
>>exception, you should assume your index is corrupted, and rebuild from
>>scratch for safety -- but i think it's safe to clean up the garbage
>>files manually.
>>
>>Which brings up something I meant to ask a while back: has anyone written
>>any index cleaning code that locks an index (using the public API) and
>>then inspects the file (using the API, or using low-level knowledge of the
>>file structure) to generate a list of 'garbage' files in the index
>>directory that can safely be deleted?
>>
>>(I considered writing this a few months ago, but then our "play it safe,
>>treat it as corrupt" policy came out, and it wasn't all that necessary
>>for me)
>>
>>It seems like it might be a handy addition to the sandbox.
>>
>>
>>: Date: Sun, 31 Jul 2005 23:20:36 -0400
>>: From: Robert Schultz <robert@cosmicrealms.com>
>>: Reply-To: java-user@lucene.apache.org
>>: To: java-user@lucene.apache.org
>>: Subject: Any problems with a failed IndexWriter optimize call?
>>:
>>: Hello! I am using Lucene 1.4.3
>>:
>>: I'm building a Lucene index, that will have about 25 million documents
>>: when it is done.
>>: I'm adding 250,000 at a time.
>>:
>>: Currently there are about 1.2 million in there, and I ran into a problem.
>>: After I had added a batch of 250,000 I got a 'java.lang.OutOfMemoryError'
>>: thrown by writer.optimize(); (a standard IndexWriter)
>>:
>>: The exception caused my program to quit out, and it didn't call
>>: 'writer.close();'
>>:
>>: First, with it dying in the middle of an .optimize() is there any chance
>>: my index is corrupted?
>>:
>>: Second, I know I can remove the /tmp/lucene*.lock file to remove the
>>: lock in order to add more, but is it safe to do that?
>>:
>>: I've since figured out that I can pass -Xmx to the 'java' program in
>>: order to increase the maximum amount of RAM.
>>: It was using the default of 64M, I plan on increasing that to 175M to
>>: start with.
>>: That should solve the memory problems (I can allocate more if necessary
>>: down the line).
>>:
>>: Lastly, when I go back, open it again, and add another 250,000 and then
>>: call optimize again, will a failed previous optimize hurt the index at all?
>>:
>>:
>>:
>>: ---------------------------------------------------------------------
>>: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>: For additional commands, e-mail: java-user-help@lucene.apache.org
>>:
>>
>>
>>
>>-Hoss
>>
>>
>>
>>
> 
> 
> 


