lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Index missing documents
Date Mon, 20 Feb 2006 03:39:56 GMT
It is possible that your Documents were written to various index files, but those files were not yet
"registered" in the "segments" file.  Lucene knows only about index segments that are listed
in the segments file.  Any other files in the index directory are ignored.

Also, some Documents are kept in memory while indexing (see maxBufferedDocs in IndexWriter),
so if a power outage happened before they were written to disk, they would be lost, too.
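
To make the two failure modes above concrete, here is a minimal toy model in plain Java.  It is
not Lucene code and does not use the Lucene API: the class ToyIndex and its methods
(addDocument, registerSegments, crash, visibleDocs, docsOnDisk) are hypothetical names invented
for this sketch, and treating "write a segment" and "register it" as two separate steps is a
simplifying assumption.  It only illustrates how documents can occupy space on disk yet be
invisible to a reader, and how buffered documents vanish on a crash.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Lucene code) of buffered documents and a "segments" manifest. */
class ToyIndex {
    private final int maxBufferedDocs;                                  // flush threshold
    private final List<String> buffer = new ArrayList<>();              // held in memory only
    private final List<List<String>> filesOnDisk = new ArrayList<>();   // segment files written
    private int registeredSegments = 0;   // how many segments the manifest lists

    ToyIndex(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /** Buffers a document; writes a new segment file once the buffer is full. */
    void addDocument(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBufferedDocs) {
            filesOnDisk.add(new ArrayList<>(buffer));   // on disk, but not yet registered
            buffer.clear();
        }
    }

    /** Records every segment written so far in the manifest. */
    void registerSegments() {
        registeredSegments = filesOnDisk.size();
    }

    /** Simulates a power outage: memory is lost, disk contents survive. */
    void crash() {
        buffer.clear();
    }

    /** Documents a reader can see: only those in registered segments. */
    int visibleDocs() {
        int n = 0;
        for (int i = 0; i < registeredSegments; i++) {
            n += filesOnDisk.get(i).size();
        }
        return n;
    }

    /** Documents physically on disk, registered or not. */
    int docsOnDisk() {
        int n = 0;
        for (List<String> segment : filesOnDisk) {
            n += segment.size();
        }
        return n;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(10);
        for (int i = 0; i < 10; i++) index.addDocument("doc" + i);
        index.registerSegments();                      // first segment is registered
        for (int i = 10; i < 25; i++) index.addDocument("doc" + i);
        index.crash();                                 // outage before the next registration
        System.out.println("on disk: " + index.docsOnDisk());
        System.out.println("visible: " + index.visibleDocs());
    }
}
```

In this run 25 documents are added, 20 end up in files on disk, and only 10 are visible after
the crash: the second segment was never registered, and the last 5 documents were still in the
in-memory buffer.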

Otis

----- Original Message ----
From: Michael van Rooyen <mvanr@bigfoot.com>
To: java-user@lucene.apache.org
Sent: Sunday, February 19, 2006 5:06:42 PM
Subject: Index missing documents

While building a large index, we had a power outage.  Over 2 million 
documents had been added, each document with up to about 20 fields.  The 
size of the index on disk is ~500MB.  When I started the process up again, I 
noticed that documents that should have been in the index were missing.  In 
retrospect, I think that Lucene was seeing the index as being completely 
empty (it now says there are 385 docs in the index, but all of those have 
been added since the power outage).  The size on disk is still ~500MB.  Does 
anyone have an idea what might cause the documents to disappear, and what 
can be done to get them back?  Rebuilding takes a while at 100ms per 
document, but it's a bit more concerning if such an outage or crash could 
cause documents to mysteriously disappear from the index...


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




