From: Otis Gospodnetic
To: java-user@lucene.apache.org
Date: Sun, 19 Feb 2006 19:39:56 -0800 (PST)
Subject: Re: Index missing documents

It is possible that your Documents were added to various index files, but those files were not yet "registered" in the "segments" file. Lucene knows only about index segments that are listed in the segments file. Any other files in the index directory are ignored.

Also, some Documents are kept in memory while indexing (see maxBufferedDocs in IndexWriter), so if a power outage happened before they were written to disk, they would be lost, too.

Otis

----- Original Message ----
From: Michael van Rooyen
To: java-user@lucene.apache.org
Sent: Sunday, February 19, 2006 5:06:42 PM
Subject: Index missing documents

While building a large index, we had a power outage. Over 2 million documents had been added, each document with up to about 20 fields. The size of the index on disk is ~500MB.

When I started the process up again, I noticed that documents that should have been in the index were missing. In retrospect, I think that Lucene was seeing the index as being completely empty (it now says there are 385 docs in the index, but all of those have been added since the power outage). The size on disk is still ~500MB.

Does anyone have an idea what might cause the documents to disappear, and what can be done to get them back? Rebuilding takes a while at 100ms per document, but it's a bit more concerning if such an outage or crash could cause documents to mysteriously disappear from the index...

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
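
The two failure modes Otis describes in his reply (segment files on disk that the "segments" file never listed, and Documents still buffered in memory) can be illustrated with a toy model. This is only a sketch in plain Java and does not use Lucene at all: the class ToyIndex and its methods are hypothetical names invented for the illustration, and it deliberately separates "write a segment file" from "register it in the manifest" to make the window visible. How and when Lucene itself updates the segments file may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT Lucene's code or on-disk format.
// All names here are hypothetical and exist just for illustration.
class ToyIndex {
    // "Disk": segment files, which survive a crash.
    final Map<String, List<String>> segmentFiles = new LinkedHashMap<>();
    // "Disk": the manifest, standing in for the "segments" file.
    final List<String> manifest = new ArrayList<>();
    // Memory: documents buffered by the writer, which do not survive a crash.
    final List<String> buffer = new ArrayList<>();
    final int maxBufferedDocs;
    int nextSegment = 0;

    ToyIndex(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
    }

    // Buffer a document; write a segment file once the buffer is full.
    void addDocument(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBufferedDocs) {
            flushToSegmentFile();
        }
    }

    // Write buffered docs to a new segment file WITHOUT registering it.
    void flushToSegmentFile() {
        String name = "_" + nextSegment++;
        segmentFiles.put(name, new ArrayList<>(buffer));
        buffer.clear();
    }

    // Flush anything buffered, then list every segment file in the manifest.
    void commit() {
        if (!buffer.isEmpty()) {
            flushToSegmentFile();
        }
        for (String name : segmentFiles.keySet()) {
            if (!manifest.contains(name)) {
                manifest.add(name);
            }
        }
    }

    // Simulate a power outage: memory is lost, "disk" state stays as it was.
    void crash() {
        buffer.clear();
    }

    // What a reader sees: only docs in segments listed in the manifest.
    int visibleDocCount() {
        int n = 0;
        for (String name : manifest) {
            n += segmentFiles.get(name).size();
        }
        return n;
    }

    // What occupies disk space: docs in all segment files, listed or not.
    int docsOnDisk() {
        int n = 0;
        for (List<String> docs : segmentFiles.values()) {
            n += docs.size();
        }
        return n;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(3);
        for (int i = 0; i < 3; i++) index.addDocument("committed-" + i);
        index.commit();                      // segment _0 is now registered
        for (int i = 0; i < 5; i++) index.addDocument("later-" + i);
        index.crash();                       // _1 unregistered, 2 docs only in memory
        System.out.println("visible=" + index.visibleDocCount()
                + " onDisk=" + index.docsOnDisk()); // prints "visible=3 onDisk=6"
    }
}
```

In this model, eight documents were added but only three are visible after the crash: three sit in a segment file that takes up space yet is ignored because the manifest never listed it, and two were lost with the in-memory buffer. That matches the symptom reported in the original message, where the index stays ~500MB on disk while appearing nearly empty.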