From: Michael McCandless
To: java-dev@lucene.apache.org
Subject: Re: CheckIndex tool
Date: Tue, 5 Aug 2008 11:00:10 -0400

Actually, those exceptions are thrown by the code detecting the
mismatch, and then caught by CheckIndex and handled as meaning that
segment is corrupt.  This is consistent, e.g., with how Lucene throws
CorruptIndexException deep down if it hits an inconsistency.

I think it's fine if you want to not use exceptions for the "local"
mismatches, and instead record the error in a data structure and then
stop processing that one segment.  But for the "deep down" exceptions
you still have to keep the catch in CheckIndex to record those.

Mike

On Aug 5, 2008, at 9:30 AM, Grant Ingersoll wrote:

> I'll look into these.  The other part I am not sure about is the
> throwing of exceptions for mismatches.  I know they mean CheckIndex
> can't go forward, but they aren't really errors in CheckIndex, so
> much as errors in the index, which CheckIndex is just reporting.
> So, I'm inclined to capture that and present it (and return
> immediately) instead of throwing an exception.  Is that reasonable?
>
> -Grant
>
>
> On Aug 4, 2008, at 5:01 PM, Michael McCandless wrote:
>
>>
>> This sounds good!  I like the idea of checking the index when Solr
>> has to force release the write.lock.
>>
>> The one caveat is, when checking a large index (which can take
>> quite some time), it'd be nice to have the equivalent of the
>> inline'd out.print/ln calls happen in realtime so that you can see
>> (on the command line output) that progress is being made, which
>> segment is being checked, etc.
>>
>> Maybe change it to an optional "infoStream" (like IndexWriter), and
>> then the current inlined prints become calls to message() which
>> checks if infoStream is non-null?
>>
>> Mike
>>
>> Grant Ingersoll wrote:
>>
>>> Hey Mike,
>>>
>>> I'm thinking about https://issues.apache.org/jira/browse/SOLR-566
>>> and was thinking about adding some more programmatic access to the
>>> CheckIndex tool and wanted to see if you had any thoughts.
>>> Basically, I am going to capture info into a simple data
>>> structure that can then be introspected and serialized into a
>>> RequestHandler, but also something that might be more generally
>>> useful in certain cases where things go bad.  I was debating
>>> keeping the inline out.printlns, but not sure if they shouldn't
>>> just be moved to the main such that the cmd line stuff still works
>>> as is, but it doesn't clog the logs for those that want
>>> programmatic access.
>>>
>>> I'll post a patch soon, but wanted to see if you had any
>>> preliminary insight.
>>>
>>> -Grant
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
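
For readers of the archive, here is a rough sketch of the two ideas discussed in this thread: an optional infoStream with a message() helper that prints only when the stream is non-null, and a per-segment status record that captures "local" mismatches without throwing while still catching exceptions raised deeper down. Every name in it (CheckSketch, SegmentStatus, FakeSegment, DeepCorruptionException) is hypothetical and chosen for illustration; it is not the CheckIndex code or the patch mentioned above, and a doc-count comparison is only assumed as an example of a "local" mismatch.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every name below is illustrative, not Lucene's CheckIndex API.
class CheckSketch {

    /** Stand-in for an exception thrown "deep down" while reading a segment. */
    static class DeepCorruptionException extends RuntimeException {
        DeepCorruptionException(String msg) { super(msg); }
    }

    /** Per-segment result that a caller can inspect or serialize. */
    static class SegmentStatus {
        final String name;
        String error; // null means no problem was recorded
        SegmentStatus(String name) { this.name = name; }
        boolean isClean() { return error == null; }
    }

    /** Fake segment for the demo: a claimed doc count versus what a read returns. */
    static class FakeSegment {
        final String name;
        final int expectedDocs;
        final int actualDocs;
        final boolean failsDeepDown;
        FakeSegment(String name, int expectedDocs, int actualDocs, boolean failsDeepDown) {
            this.name = name;
            this.expectedDocs = expectedDocs;
            this.actualDocs = actualDocs;
            this.failsDeepDown = failsDeepDown;
        }
        int readDocCount() {
            if (failsDeepDown) {
                throw new DeepCorruptionException("unreadable data in " + name);
            }
            return actualDocs;
        }
    }

    private PrintStream infoStream; // optional; null means stay silent

    void setInfoStream(PrintStream out) { this.infoStream = out; }

    /** Progress output goes through here so programmatic callers can switch it off. */
    private void message(String msg) {
        if (infoStream != null) {
            infoStream.println(msg);
        }
    }

    List<SegmentStatus> check(List<FakeSegment> segments) {
        List<SegmentStatus> results = new ArrayList<>();
        for (FakeSegment seg : segments) {
            SegmentStatus status = new SegmentStatus(seg.name);
            message("checking segment " + seg.name);
            try {
                int actual = seg.readDocCount();
                if (actual != seg.expectedDocs) {
                    // "Local" mismatch: record it and stop with this segment, no throw.
                    status.error = "doc count mismatch: expected "
                            + seg.expectedDocs + " but read " + actual;
                }
            } catch (RuntimeException e) {
                // "Deep down" failure: the catch stays so the error is still recorded.
                status.error = "exception: " + e.getMessage();
            }
            message(status.isClean() ? "  OK" : "  FAILED: " + status.error);
            results.add(status);
        }
        return results;
    }

    public static void main(String[] args) {
        CheckSketch checker = new CheckSketch();
        checker.setInfoStream(System.out); // command-line use: show progress as it happens
        List<FakeSegment> segments = new ArrayList<>();
        segments.add(new FakeSegment("_0", 10, 10, false));
        segments.add(new FakeSegment("_1", 10, 7, false));
        segments.add(new FakeSegment("_2", 5, 5, true));
        for (SegmentStatus s : checker.check(segments)) {
            System.out.println(s.name + ": " + (s.isClean() ? "clean" : s.error));
        }
    }
}
```

A caller that wants only the data structure would skip setInfoStream and inspect the returned list; a command-line main would pass System.out so progress appears while a large index is being checked.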