From: Michael McCandless
To: java-dev@lucene.apache.org
Subject: Re: detected corrupted index / performance improvement
Date: Thu, 7 Feb 2008 08:12:15 -0500
Message-Id: <917B8583-3BEB-4642-BD6C-CECE5B69C4E8@mikemccandless.com>
In-Reply-To: <47AB0035.7060708@gmail.com>

Good idea; I'll call this ("if your hardware ignores the sync() call
then you're in trouble") out in the javadocs with LUCENE-1044.

Mike

Mark Miller wrote:

> We should really probably mention it in the JavaDoc when the issue
> is done. I think both yonik and robert pointed it out, and ever
> since then I have seen issues regarding it everywhere.
>
> http://hardware.slashdot.org/article.pl?sid=05/05/13/0529252
>
> Apparently, you're just not ACID unless you have hardware you know is
> properly reporting the sync call.
>
> Here is a good snippet from the h2database faq:
> http://www.h2database.com/html/frame.html?advanced.html%23durability_problems&main
>
>
> Michael McCandless wrote:
>>
>> DM Smith wrote:
>>
>>>
>>> On Feb 6, 2008, at 6:42 PM, Mark Miller wrote:
>>>
>>>> Hey DM,
>>>>
>>>> Just to recap an earlier thread, you need the sync and you need
>>>> hardware that doesn't lie to you about the result of the sync.
>>>>
>>>> Here is an excerpt about Digg running into that issue:
>>>>
>>>> "They had problems with their storage system telling them writes
>>>> were on disk when they really weren't. Controllers do this to
>>>> improve the appearance of their performance. But what it does is
>>>> leave a giant data integrity hole in failure scenarios. This is
>>>> really a pretty common problem and can be hard to fix, depending
>>>> on your hardware setup."
>>>>
>>>> There is a lot of good stuff relating to this in the discussion
>>>> surrounding the JIRA issue.
>>>
>>> I guess I can take that dull tool out of my tool box. :(
>>>
>>> BTW, I followed the thread and the Jira discussion, but I missed
>>> that.
>>
>> I too followed the thread & Jira discussion and missed this!
>>
>>>>
>>>>
>>>> robert engels wrote:
>>>>> That doesn't help, with lazy writing/buffering by the OS, there
>>>>> is no guarantee that if the last written block is ok, that
>>>>> earlier blocks in the file are....
>>>>>
>>>>> The OS/drive is going to physically write them in the most
>>>>> efficient manner. Only after a sync would this hold true (which
>>>>> is what we are trying to avoid).
>>>>>
>>>>> On Feb 6, 2008, at 5:15 PM, DM Smith wrote:
>>>>>
>>>>>>
>>>>>> On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote:
>>>>>>
>>>>>>>
>>>>>>> robert engels wrote:
>>>>>>>
>>>>>>>> Do we have any way of determining if a segment is definitely
>>>>>>>> OK/VALID?
>>>>>>>
>>>>>>> The only way I know is the CheckIndex tool, and it's rather
>>>>>>> slow (and it's not clear that it always catches all
>>>>>>> corruption).
>>>>>>
>>>>>> Just a thought. It seems that the discussion has revolved
>>>>>> around whether a crash or similar event has left the file in
>>>>>> an inconsistent state. Without looking into how it is actually
>>>>>> done, I'm going to guess that the writing is done from the
>>>>>> start of the file to its end. That is, no "out of order" writing.
>>>>>>
>>>>>> If this is the case, how about adding a marker to the end of
>>>>>> the file of a known size and pattern. If it is present then it
>>>>>> is presumed that there were no errors in getting to that point.
>>>>>>
>>>>>> Even with out of order writing, one could write an 'INVALID'
>>>>>> marker at the beginning of the operation and then upon
>>>>>> reaching the end of the writing, replace it with the valid
>>>>>> marker.
>>>>>>
>>>>>> If neither marker is found then the index is one from before
>>>>>> the capability was added and nothing can be said about the
>>>>>> validity.
>>>>>>
>>>>>> -- DM
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
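
To make the quoted proposal concrete: DM's end-of-file marker, combined with Robert's and Mark's point that a sync is still needed before the marker means anything, might be sketched in Java roughly as below. This is only an illustration under stated assumptions and not how Lucene writes its segment files. The class name TrailerMarker, the marker bytes "SEGMENT-OK" and both method names are hypothetical. FileChannel.force(true) is the standard JDK call that asks the OS to flush the file; as discussed above, it is only as trustworthy as the hardware underneath it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class TrailerMarker {
    // Hypothetical marker of known size and pattern (not a Lucene constant).
    static final byte[] VALID = "SEGMENT-OK".getBytes(StandardCharsets.US_ASCII);

    // Write the payload, sync, then append the marker and sync again.
    // Without the first force() the OS may persist the marker before
    // earlier blocks, which is exactly Robert's objection.
    public static void writeWithTrailer(Path file, byte[] payload) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            writeFully(ch, ByteBuffer.wrap(payload));
            ch.force(true); // ask the OS to flush data and metadata
            writeFully(ch, ByteBuffer.wrap(VALID));
            ch.force(true); // useless if the hardware ignores the sync
        }
    }

    private static void writeFully(FileChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            ch.write(buf);
        }
    }

    // Returns true only if the file ends with the marker bytes.
    // A file with no marker may simply predate the scheme, so a false
    // result says nothing definite about validity.
    public static boolean hasValidTrailer(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size < VALID.length) {
                return false;
            }
            ByteBuffer tail = ByteBuffer.allocate(VALID.length);
            long start = size - VALID.length;
            while (tail.hasRemaining()) {
                int n = ch.read(tail, start + tail.position());
                if (n < 0) {
                    return false;
                }
            }
            return Arrays.equals(tail.array(), VALID);
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("segment", ".bin");
        writeWithTrailer(f, new byte[] {1, 2, 3});
        System.out.println("trailer present: " + hasValidTrailer(f));
        Files.delete(f);
    }
}
```

The two force() calls are the design point: the marker only vouches for the bytes before it if those bytes were durably written first, which is the cost the thread was hoping to avoid.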