lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: detected corrupted index / performance improvement
Date Thu, 07 Feb 2008 13:12:15 GMT

Good idea; I'll call this ("if your hardware ignores the sync() call  
then you're in trouble") out in the javadocs with LUCENE-1044.

Mike

Mark Miller wrote:

> We should really probably mention it in the JavaDoc when the issue  
> is done. I think both yonik and robert pointed it out, and ever  
> since then I have seen issues regarding it everywhere.
>
> http://hardware.slashdot.org/article.pl?sid=05/05/13/0529252
>
> Apparently, you're just not ACID unless you have hardware you know is  
> properly reporting the sync call.
>
> Here is a good snippet from the h2database faq:
> http://www.h2database.com/html/frame.html?advanced.html%23durability_problems&main
>
>
> Michael McCandless wrote:
>>
>> DM Smith wrote:
>>
>>>
>>> On Feb 6, 2008, at 6:42 PM, Mark Miller wrote:
>>>
>>>> Hey DM,
>>>>
>>>> Just to recap an earlier thread, you need the sync and you need  
>>>> hardware that doesn't lie to you about the result of the sync.
>>>>
>>>> Here is an excerpt about Digg running into that issue:
>>>>
>>>> "They had problems with their storage system telling them writes  
>>>> were on disk when they really weren't. Controllers do this to  
>>>> improve the appearance of their performance. But what it does is  
>>>> leave a giant data integrity hole in failure scenarios. This is  
>>>> really a pretty common problem and can be hard to fix, depending  
>>>> on your hardware setup."
>>>>
>>>> There is a lot of good stuff relating to this in the discussion  
>>>> surrounding the JIRA issue.
>>>
>>> I guess I can take that dull tool out of my tool box. :(
>>>
>>> BTW, I followed the thread and the Jira discussion, but I missed  
>>> that.
>>
>> I too followed the thread & Jira discussion and missed this!
>>
>>>>
>>>>
>>>> robert engels wrote:
>>>>> That doesn't help. With lazy writing/buffering by the OS, there  
>>>>> is no guarantee that if the last written block is OK, earlier  
>>>>> blocks in the file are too.
>>>>>
>>>>> The OS/drive is going to physically write them in the most  
>>>>> efficient manner. Only after a sync would this hold true (which  
>>>>> is what we are trying to avoid).
>>>>>
>>>>> On Feb 6, 2008, at 5:15 PM, DM Smith wrote:
>>>>>
>>>>>>
>>>>>> On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote:
>>>>>>
>>>>>>>
>>>>>>> robert engels wrote:
>>>>>>>
>>>>>>>> Do we have any way of determining if a segment is definitely
>>>>>>>> OK/VALID?
>>>>>>>
>>>>>>> The only way I know is the CheckIndex tool, and it's rather slow
>>>>>>> (and it's not clear that it always catches all corruption).
>>>>>>
>>>>>> Just a thought. It seems that the discussion has revolved  
>>>>>> around whether a crash or similar event has left the file in  
>>>>>> an inconsistent state. Without looking into how it is actually  
>>>>>> done, I'm going to guess that the writing is done from the  
>>>>>> start of the file to its end. That is, no "out of order" writing.
>>>>>>
>>>>>> If this is the case, how about adding a marker to the end of  
>>>>>> the file of a known size and pattern. If it is present then it  
>>>>>> is presumed that there were no errors in getting to that point.
>>>>>>
>>>>>> Even with out of order writing, one could write an 'INVALID'  
>>>>>> marker at the beginning of the operation and then upon  
>>>>>> reaching the end of the writing, replace it with the valid  
>>>>>> marker.
>>>>>>
>>>>>> If neither marker is found then the index is one from before  
>>>>>> the capability was added and nothing can be said about the  
>>>>>> validity.
>>>>>>
>>>>>> -- DM
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
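
The marker scheme DM proposes in this thread (write an 'INVALID' marker first, then replace it with a valid marker once writing is finished) might look roughly like the Java sketch below. This is only an illustration under stated assumptions: the class name MarkerSketch, the two 8-byte marker strings, and the "marker at offset 0, payload after it" layout are all hypothetical and are not part of any Lucene file format. Per robert's objection, the marker only means something if the payload is forced to disk before the marker is flipped, so the sketch calls FileChannel.force() twice. It still cannot help if the hardware ignores the sync.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class MarkerSketch {
    enum State { VALID, INVALID, UNKNOWN }

    // Hypothetical 8-byte markers, invented for this example only.
    static final byte[] INVALID_MARKER = "LUC_INVL".getBytes(StandardCharsets.US_ASCII);
    static final byte[] VALID_MARKER = "LUC_VALD".getBytes(StandardCharsets.US_ASCII);

    // Writes INVALID marker + payload, syncs, then flips the marker to VALID.
    static void writeWithMarker(Path file, byte[] payload) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            writeFully(ch, ByteBuffer.wrap(INVALID_MARKER), 0);
            writeFully(ch, ByteBuffer.wrap(payload), INVALID_MARKER.length);
            // Without this sync the OS may persist the marker before the payload.
            ch.force(true);
            writeFully(ch, ByteBuffer.wrap(VALID_MARKER), 0);
            ch.force(true);
        }
    }

    // UNKNOWN covers files with neither marker, e.g. written before the scheme existed.
    static State readState(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer head = ByteBuffer.allocate(VALID_MARKER.length);
            long pos = 0;
            while (head.hasRemaining()) {
                int n = ch.read(head, pos);
                if (n < 0) {
                    break; // end of file before a full marker was read
                }
                pos += n;
            }
            if (head.hasRemaining()) {
                return State.UNKNOWN;
            }
            byte[] got = head.array();
            if (Arrays.equals(got, VALID_MARKER)) {
                return State.VALID;
            }
            if (Arrays.equals(got, INVALID_MARKER)) {
                return State.INVALID;
            }
            return State.UNKNOWN;
        }
    }

    private static void writeFully(FileChannel ch, ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos);
        }
    }
}
```

A reader would call MarkerSketch.readState(path) before trusting a file: INVALID means the write never completed, and UNKNOWN means nothing can be said either way. The cost is exactly the sync that the thread was hoping to avoid.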



