From: robert engels
Subject: Re: detected corrupted index / performance improvement
Date: Wed, 6 Feb 2008 17:20:53 -0600
To: java-dev@lucene.apache.org

Yes, but this pruning could be more efficient.

On a background thread, get the current segment from the segments file, call the system-wide sync (e.g. System.exec("fsync")), then you can purge the transaction logs for all segments up to that one.

Since it is a background operation, you are not blocking the writing of new segments and tx logs.

On Feb 6, 2008, at 4:42 PM, Michael McCandless wrote:

>
> robert engels wrote:
>
>> Do we have any way of determining if a segment is definitely
>> OK/VALID?
>
> The only way I know is the CheckIndex tool, and it's rather slow (and
> it's not clear that it always catches all corruption).
>
>> If so, a much more efficient transactional system could be developed.
>>
>> Serialize the updates to a log file. Sync the log. Update the
>> Lucene index WITHOUT any sync. Log file writing/sync is VERY
>> efficient since it is sequential, and a single file.
>>
>> Upon open of the index, detect if the index was not shut down cleanly.
>> If so, determine the last valid segment, delete the bad segments,
>> and then perform the updates (from the log file) since the last
>> valid segment was written.
>>
>> The detection could be a VERY slow operation, but this is ok,
>> since it should be rare, and then you will only pay this price on
>> the rare occasion, not on every update.
>
> Wouldn't you still need to sync periodically, so you can prune the
> transaction log? Else your transaction log is growing as fast as the
> index? (You've doubled disk usage).
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
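
The background pruning described in the reply could be sketched roughly as below. This is only an illustration under stated assumptions, not Lucene code: the class name TxLogPruner, the one-log-per-segment naming scheme tx_<generation>.log, the currentSegment supplier (standing in for reading the segments file), and the systemSync hook (standing in for the system-wide sync) are all hypothetical. The one ordering that matters is that the current segment is read before the sync, so that every segment up to that generation is assumed durable by the time its logs are deleted.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Sketch of background transaction-log pruning. All names are hypothetical. */
final class TxLogPruner {
    private final Path logDir;                 // assumed layout: one file per segment, tx_<generation>.log
    private final LongSupplier currentSegment; // stand-in for reading the segments file
    private final Runnable systemSync;         // stand-in for a system-wide sync

    TxLogPruner(Path logDir, LongSupplier currentSegment, Runnable systemSync) {
        this.logDir = logDir;
        this.currentSegment = currentSegment;
        this.systemSync = systemSync;
    }

    /** Runs one pruning pass and returns the number of log files deleted. */
    int pruneOnce() throws IOException {
        // Read the generation first; segments written after this point keep their logs.
        long upTo = currentSegment.getAsLong();
        // Once the sync returns, segments <= upTo are assumed to be on stable storage.
        systemSync.run();

        // Collect candidates before deleting, so the directory is not modified mid-listing.
        List<Path> obsolete = new ArrayList<>();
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(logDir, "tx_*.log")) {
            for (Path log : logs) {
                String name = log.getFileName().toString();
                // Strip the "tx_" prefix and ".log" suffix to recover the generation.
                long generation = Long.parseLong(name.substring(3, name.length() - 4));
                if (generation <= upTo) {
                    obsolete.add(log);
                }
            }
        }
        for (Path log : obsolete) {
            Files.delete(log);
        }
        return obsolete.size();
    }

    /** Schedules pruneOnce() on a daemon thread so that writers are never blocked by it. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "txlog-pruner");
            t.setDaemon(true);
            return t;
        });
        pool.scheduleWithFixedDelay(() -> {
            try {
                pruneOnce();
            } catch (IOException | RuntimeException e) {
                // A failed pass only delays pruning; the next pass retries.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return pool;
    }
}
```

A caller would construct the pruner with its own way of reading the current segment and its own sync hook, then call start(...) once at startup. On a Unix-like system the hook might shell out to the sync command, but whether that gives a strong enough durability guarantee depends on the platform and is left open here.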