lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: detected corrupted index / performance improvement
Date Thu, 07 Feb 2008 03:44:49 GMT
That is the problem: waiting for the full sync (of all of the segment
files) takes quite a while... syncing a single log file is much more
efficient.
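
As a rough illustration of the cost difference described above, here is a minimal Java sketch. It is not Lucene code: the class `SyncSketch`, the methods `syncAll` and `appendAndSync`, and the idea of a newline-terminated log record are all assumptions made for this example. Only `FileChannel.force` is a real JDK call. The first method pays one forced flush per segment file, the second pays one forced flush total.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
final class SyncSketch {

    // "Full sync": force every segment file to stable storage.
    // The cost grows with the number of files, one force() per file.
    static int syncAll(List<Path> segmentFiles) throws IOException {
        int synced = 0;
        for (Path file : segmentFiles) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ch.force(true); // flush file data and metadata
                synced++;
            }
        }
        return synced;
    }

    // "Single log file": append one record and force only that file.
    // Returns the log size in bytes after the append.
    static long appendAndSync(Path logFile, String record) throws IOException {
        try (FileChannel ch = FileChannel.open(logFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf); // a single write() may be partial, so loop
            }
            ch.force(true); // one forced flush, regardless of segment count
            return ch.size();
        }
    }
}
```

How much faster the second path is in practice depends on the OS, the filesystem, and the drive, so this sketch shows only the shape of the two approaches, not a measurement.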

On Feb 6, 2008, at 9:41 PM, Andrew Zhang wrote:

> On Feb 7, 2008 7:22 AM, robert engels <rengels@ix.netcom.com> wrote:
>
>> That doesn't help, with lazy writing/buffering by the OS, there is no
>> guarantee that if the last written block is ok, that earlier blocks
>> in the file are....
>>
>> The OS/drive is going to physically write them in the most efficient
>> manner. Only after a sync would this hold true (which is what we are
>> trying to avoid).
>
>
> Hi, how about asynchronous commit? i.e. use a thread to sync the data.
>
> We only need to make sure that all data are written to the storage
> before the next operation?
>
>>
>>
>> On Feb 6, 2008, at 5:15 PM, DM Smith wrote:
>>
>>>
>>> On Feb 6, 2008, at 5:42 PM, Michael McCandless wrote:
>>>
>>>>
>>>> robert engels wrote:
>>>>
>>>>> Do we have any way of determining if a segment is definitely
>>>>> OK/VALID?
>>>>
>>>> The only way I know is the CheckIndex tool, and it's rather slow
>>>> (and it's not clear that it always catches all corruption).
>>>
>>> Just a thought. It seems that the discussion has revolved around
>>> whether a crash or similar event has left the file in an
>>> inconsistent state. Without looking into how it is actually done,
>>> I'm going to guess that the writing is done from the start of the
>>> file to its end. That is, no "out of order" writing.
>>>
>>> If this is the case, how about adding a marker to the end of the
>>> file of a known size and pattern. If it is present then it is
>>> presumed that there were no errors in getting to that point.
>>>
>>> Even with out of order writing, one could write an 'INVALID' marker
>>> at the beginning of the operation and then upon reaching the end of
>>> the writing, replace it with the valid marker.
>>>
>>> If neither marker is found then the index is one from before the
>>> capability was added and nothing can be said about the validity.
>>>
>>> -- DM
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>
>
>
> -- 
> Best regards,
> Andrew Zhang
>
> db4o - database for Android: www.db4o.com
> http://zhanghuangzhu.blogspot.com/
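
For reference, the end-of-file marker idea quoted above from DM Smith can be sketched roughly as follows. This is a minimal Java illustration under stated assumptions, not anything from Lucene: the class `EndMarkerSketch`, the marker bytes, and both method names are hypothetical. As noted earlier in the thread, finding the marker only says the tail of the file was written; without a sync, the OS and drive may still have written earlier blocks out of order, so this check alone does not prove the whole file is intact.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration only; not part of any Lucene file format.
final class EndMarkerSketch {

    // Assumed fixed-size marker pattern, chosen for this example.
    static final byte[] MARKER = "SEGMENT-END-OK".getBytes(StandardCharsets.US_ASCII);

    // Write the payload from start to end, then append the marker last.
    static void writeWithMarker(Path file, byte[] payload) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0);   // start from an empty file
            raf.write(payload);
            raf.write(MARKER);  // written only after all payload bytes
        }
    }

    // Return true if the last MARKER.length bytes equal the marker.
    // False means "no marker found": either an incomplete write or a
    // file from before the marker existed, so nothing can be concluded.
    static boolean hasEndMarker(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < MARKER.length) {
                return false;
            }
            byte[] tail = new byte[MARKER.length];
            raf.seek(length - MARKER.length);
            raf.readFully(tail);
            return Arrays.equals(tail, MARKER);
        }
    }
}
```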



