lucene-dev mailing list archives

From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown
Date Mon, 26 Nov 2007 19:34:33 GMT

It's the "write" method in org.apache.lucene.index.SegmentInfos.

It's called from IndexWriter/DirectoryIndexReader.

Mike

"robert engels" <rengels@ix.netcom.com> wrote:
> Can you point me to the code that does the actual writing of the  
> SEGMENTS.XXX file?
> 
> On Nov 26, 2007, at 1:16 PM, Michael McCandless wrote:
> 
> >
> > This is correct.
> >
> > This just means the DeletionPolicy cannot delete a commit point until
> > all files referenced by a future (the next) commit point are done
> > being sync'd (DeletionPolicy needs to query the Directory to find out
> > which files are on stable storage).
> >
> > However before we even go there, I'm going to run perf tests of
> > background sync'ing to see if it can reduce the cost of syncing.
> >
> > Mike
> >
> > "robert engels" <rengels@ix.netcom.com> wrote:
> >> I am not sure all of this effort is going to work anyway...
> >>
> >> I think you need to sync all of the segment files, THEN write the
> >> segments.XXXX file and sync it.
> >>
> >> It does you no good if there is a valid segments.XXX file, but some
> >> of the dependent files may not have written successfully to disk.
> >>
> >> Proper ordering of the operations
> >>
> >> write files
> >> sync files
> >> write SEGMENTS FILE with checksum, and sync
> >>
> >> is the only way to be certain the index is not internally corrupt.
> >>
> >> I have not looked at it closely, so it may already be doing this (but I
> >> don't think so).  The code is getting a bit hard to follow for me.
> >>
> >> On Nov 26, 2007, at 12:14 PM, Doug Cutting (JIRA) wrote:
> >>
> >>>
> >>>     [ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545535 ]
> >>>
> >>> Doug Cutting commented on LUCENE-1044:
> >>> --------------------------------------
> >>>
> >>>> I found out however that delaying the syncs (but intending to
> >>>> sync) also means keeping the file handles open [...]
> >>>
> >>> Not necessarily.  You could just queue the file names for sync,
> >>> close them, and then have the background thread open, sync and
> >>> close them.  The close could trigger the OS to sync things faster
> >>> in the background.  Then the open/sync/close could mostly be a
> >>> no-op.  Might be worth a try.
> >>>
> >>>> Behavior on hard power shutdown
> >>>> -------------------------------
> >>>>
> >>>>                 Key: LUCENE-1044
> >>>>                 URL: https://issues.apache.org/jira/browse/LUCENE-1044
> >>>>             Project: Lucene - Java
> >>>>          Issue Type: Bug
> >>>>          Components: Index
> >>>>         Environment: Windows Server 2003, Standard Edition, Sun
> >>>> Hotspot Java 1.5
> >>>>            Reporter: venkat rangan
> >>>>            Assignee: Michael McCandless
> >>>>             Fix For: 2.3
> >>>>
> >>>>         Attachments: FSyncPerfTest.java, LUCENE-1044.patch,
> >>>> LUCENE-1044.take2.patch, LUCENE-1044.take3.patch
> >>>>
> >>>>
> >>>> When indexing a large number of documents, upon a hard power
> >>>> failure (e.g. pull the power cord), the index seems to get
> >>>> corrupted. We start a Java application as a Windows Service, and
> >>>> feed it documents. In some cases (after an index size of 1.7GB,
> >>>> with 30-40 index segment .cfs files), the following is observed.
> >>>> The 'segments' file contains only zeros. Its size is 265 bytes -
> >>>> all bytes are zeros.
> >>>> The 'deleted' file also contains only zeros. Its size is 85 bytes
> >>>> - all bytes are zeros.
> >>>> Before corruption, the segments file and deleted file appear to be
> >>>> correct. After this corruption, the index is corrupted and lost.
> >>>> This is a problem observed in Lucene 1.4.3. We are not able to
> >>>> upgrade our customer deployments to 1.9 or later version, but
> >>>> would be happy to back-port a patch, if the patch is small enough
> >>>> and if this problem is already solved.
> >>>
> >>> -- 
> >>> This message is automatically generated by JIRA.
> >>> -
> >>> You can reply to this email to add a comment to the issue online.
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>>
> >>
> >>
> 
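The ordering Robert describes in the quoted text (write files, sync files, then write the segments file with a checksum and sync it) can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not Lucene's SegmentInfos code: the class OrderedCommitSketch, its methods, the "segments_N" naming and the simple file format (a newline-separated file listing followed by an 8-byte CRC32) are all hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch; none of these names come from Lucene.
public class OrderedCommitSketch {

    /** Writes the bytes and forces them to stable storage before returning. */
    public static void writeAndSync(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // sync file content and metadata
        }
    }

    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    /**
     * Steps 1 and 2: write and sync every data file.
     * Step 3: only then write the segments file (listing + checksum) and sync it.
     * Syncing the containing directory is omitted here for brevity.
     */
    public static Path commit(Path dir, Map<String, byte[]> dataFiles, long generation)
            throws IOException {
        StringBuilder listing = new StringBuilder();
        for (Map.Entry<String, byte[]> e : dataFiles.entrySet()) {
            writeAndSync(dir.resolve(e.getKey()), e.getValue());
            listing.append(e.getKey()).append('\n');
        }
        byte[] body = listing.toString().getBytes(StandardCharsets.UTF_8);
        ByteBuffer out = ByteBuffer.allocate(body.length + Long.BYTES);
        out.put(body).putLong(crc32(body));
        Path segments = dir.resolve("segments_" + generation);
        writeAndSync(segments, out.array());
        return segments;
    }

    /** Returns true only if the trailing checksum matches the listing. */
    public static boolean isValid(Path segments) throws IOException {
        byte[] all = Files.readAllBytes(segments);
        if (all.length < Long.BYTES) {
            return false;
        }
        int bodyLen = all.length - Long.BYTES;
        byte[] body = Arrays.copyOf(all, bodyLen);
        long stored = ByteBuffer.wrap(all, bodyLen, Long.BYTES).getLong();
        return stored == crc32(body);
    }
}
```

With this ordering, a segments file that passes isValid was written only after every file it lists had been synced, and a segments file whose contents do not match its stored checksum is rejected so a reader could fall back to an earlier commit.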
> 
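Doug's suggestion in the quoted comment (queue the file names, close the files, and let a background thread open, sync and close them) might look something like the sketch below. The class BackgroundSyncSketch and its methods are hypothetical and are not part of Lucene or of the LUCENE-1044 patches. One caveat to keep in mind: the FileChannel.force Javadoc only guarantees that changes made through that same channel are forced, so whether re-opening a file and calling force flushes earlier writes depends on the operating system.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from Lucene.
public class BackgroundSyncSketch implements AutoCloseable {

    // Sentinel instance that tells the worker to stop (compared by identity).
    private static final Path STOP = Paths.get("");

    private final BlockingQueue<Path> pending = new LinkedBlockingQueue<>();
    private final Set<Path> synced = ConcurrentHashMap.newKeySet();
    private final Thread worker;

    public BackgroundSyncSketch() {
        worker = new Thread(this::drain, "background-sync");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called after the writer has closed the file; only the name is queued. */
    public void enqueue(Path file) {
        pending.add(file);
    }

    /** Lets a caller (e.g. a deletion policy) ask whether a file has been synced. */
    public boolean isSynced(Path file) {
        return synced.contains(file);
    }

    private void drain() {
        try {
            while (true) {
                Path file = pending.take();
                if (file == STOP) {
                    return;
                }
                // Open, sync, close: no handle is kept open between write and sync.
                try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                    ch.force(true);
                    synced.add(file);
                } catch (IOException e) {
                    // In this sketch a failed sync just leaves the file unmarked.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Waits until everything queued before this call has been processed. */
    @Override
    public void close() throws InterruptedException {
        pending.add(STOP);
        worker.join();
    }
}
```

The isSynced query is one possible way to answer the question Mike raises in the quoted text, namely which files are already on stable storage before an older commit point may be deleted.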


