lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Keegan <peterlkee...@gmail.com>
Subject Re: IO exception during merge/optimize
Date Tue, 27 Oct 2009 14:44:59 GMT
On Tue, Oct 27, 2009 at 10:37 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> OK that exception looks more reasonable, for a disk full event.
>
> But, I can't tell from your followon emails: did this lead to index
> corruption?
>

Yes, but this may be caused by the application ignoring a Lucene exception
somewhere else. I will chase this down.

>
> Also, I noticed you're using a rather old 1.6.0 JRE (1.6.0_03) -- you
> really should upgrade that to the latest 1.6.0 -- there's at least one
> known problem with Lucene and early 1.6.0 JREs.
>

Yes, I remember this problem - that's why we stayed at _03
Thanks.

>
> Mike
>
> On Tue, Oct 27, 2009 at 10:00 AM, Peter Keegan <peterlkeegan@gmail.com>
> wrote:
> > After rebuilding the corrupted indexes, the low disk space exception is
> now
> > occurring as expected. Sorry for the distraction.
> >
> > fyi, here are the details:
> >
> >  java.io.IOException: There is not enough space on the disk
> >    at java.io.RandomAccessFile.writeBytes(Native Method)
> >    at java.io.RandomAccessFile.write(Unknown Source)
> >    at
> >
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
> >    at
> >
> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
> >    at
> >
> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
> >    at
> >
> org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
> >    at
> >
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.close(SimpleFSDirectory.java:199)
> >    at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:144)
> >    at
> > org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:357)
> >    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
> >    at
> > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5011)
> >    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4596)
> >    at
> >
> org.apache.lucene.index.IndexWriter.resolveExternalSegments(IndexWriter.java:3786)
> >    at
> >
> org.apache.lucene.index.IndexWriter.addIndexesNoOptimize(IndexWriter.java:3695)
> >
> >
> > And the corresponding index info log:
> >
> > IFD [Indexer]: setInfoStream
> >
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@256ef705
> > IW 1 [Indexer]: setInfoStream:
> > dir=org.apache.lucene.store.SimpleFSDirectory@D
> :\mnsavs\lresumes3\lresumes3.luc\lresumes3.update.main.4
> > autoCommit=false
> >
> mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@181b7c76mergeScheduler
> =org.apache.lucene.index.ConcurrentMergeScheduler@34883357ramBufferSizeMB
> =16.0
> > maxBufferedDocs=-1 maxBuffereDeleteTerms=-1
> > maxFieldLength=2147483647 index=
> > IW 1 [Indexer]: flush at addIndexesNoOptimize
> > IW 1 [Indexer]:   flush: segment=null docStoreSegment=null
> docStoreOffset=0
> > flushDocs=false flushDeletes=true flushDocStores=false numDocs=0
> > numBufDelTerms=0
> > IW 1 [Indexer]:   index before flush
> > IW 1 [Indexer]: now start transaction
> > IW 1 [Indexer]: LMP: findMerges: 2 segments
> > IW 1 [Indexer]: LMP:   level 8.774518 to 9.524518: 1 segments
> > IW 1 [Indexer]: LMP:   level 6.2973914 to 7.0473914: 1 segments
> > IW 1 [Indexer]: CMS: now merge
> > IW 1 [Indexer]: CMS:   index: _7:Cx1075533->_0** _8:Cx2795**
> > IW 1 [Indexer]: CMS:   no more merges pending; now return
> > IW 1 [Indexer]: add merge to pendingMerges: _7:Cx1075533->_0 [total 1
> > pending]
> > IW 1 [Indexer]: now merge
> >  merge=_7:Cx1075533->_0 into _0 [mergeDocStores]
> >  merge=org.apache.lucene.index.MergePolicy$OneMerge@4d480ea
> >  index=_7:Cx1075533->_0** _8:Cx2795**
> > IW 1 [Indexer]: merging _7:Cx1075533->_0 into _0 [mergeDocStores]
> > IW 1 [Indexer]: merge: total 1074388 docs
> > IW 1 [Indexer]: commitMerge: _7:Cx1075533->_0 into _0 [mergeDocStores]
> > index=_7:Cx1075533->_0** _8:Cx2795**
> > IW 1 [Indexer]: commitMergeDeletes _7:Cx1075533->_0 into _0
> [mergeDocStores]
> > IFD [Indexer]: now checkpoint "segments_1" [2 segments ; isCommit =
> false]
> > IW 1 [Indexer]: LMP: findMerges: 2 segments
> > IW 1 [Indexer]: LMP:   level 8.864886 to 9.614886: 1 segments
> > IW 1 [Indexer]: LMP:   level 6.2973914 to 7.0473914: 1 segments
> > IW 1 [Indexer]: add merge to pendingMerges: _8:Cx2795 [total 1 pending]
> > IW 1 [Indexer]: now merge
> >  merge=_8:Cx2795 into _1 [mergeDocStores]
> >  merge=org.apache.lucene.index.MergePolicy$OneMerge@606f8b2b
> >  index=_0:C1074388 _8:Cx2795**
> > IW 1 [Indexer]: merging _8:Cx2795 into _1 [mergeDocStores]
> > IW 1 [Indexer]: merge: total 2795 docs
> > IW 1 [Indexer]: handleMergeException: merge=_8:Cx2795 into _1
> > [mergeDocStores] exc=java.io.IOException: There is not enough space on
> the
> > disk
> > IW 1 [Indexer]: hit exception during merge
> > IFD [Indexer]: refresh [prefix=_1]: removing newly created unreferenced
> file
> > "_1.fdt"
> > IFD [Indexer]: delete "_1.fdt"
> > IFD [Indexer]: refresh [prefix=_1]: removing newly created unreferenced
> file
> > "_1.fdx"
> > IFD [Indexer]: delete "_1.fdx"
> > IFD [Indexer]: refresh [prefix=_1]: removing newly created unreferenced
> file
> > "_1.fnm"
> > IFD [Indexer]: delete "_1.fnm"
> > IW 1 [Indexer]: now rollback transaction
> > IW 1 [Indexer]: all running merges have aborted
> > IFD [Indexer]: now checkpoint "segments_1" [0 segments ; isCommit =
> false]
> > IFD [Indexer]: delete "_0.nrm"
> > IFD [Indexer]: delete "_0.tis"
> > IFD [Indexer]: delete "_0.fnm"
> > IFD [Indexer]: delete "_0.tii"
> > IFD [Indexer]: delete "_0.frq"
> > IFD [Indexer]: delete "_0.fdx"
> > IFD [Indexer]: delete "_0.prx"
> > IFD [Indexer]: delete "_0.fdt"
> >
> >
> > Peter
> >
> > On Mon, Oct 26, 2009 at 3:59 PM, Peter Keegan <peterlkeegan@gmail.com
> >wrote:
> >
> >>
> >>
> >> On Mon, Oct 26, 2009 at 3:00 PM, Michael McCandless <
> >> lucene@mikemccandless.com> wrote:
> >>
> >>> On Mon, Oct 26, 2009 at 2:55 PM, Peter Keegan <peterlkeegan@gmail.com>
> >>> wrote:
> >>> > On Mon, Oct 26, 2009 at 2:50 PM, Michael McCandless <
> >>> > lucene@mikemccandless.com> wrote:
> >>> >
> >>> >> On Mon, Oct 26, 2009 at 10:44 AM, Peter Keegan <
> peterlkeegan@gmail.com
> >>> >
> >>> >> wrote:
> >>> >> > Even running in console mode, the exception is difficult to
> >>> interpret.
> >>> >> > Here's an exception that I think occurred during an add document,
> >>> commit
> >>> >> or
> >>> >> > close:
> >>> >> > doc counts differ for segment _g: field Reader shows 137 but
> >>> segmentInfo
> >>> >> > shows 5777
> >>> >>
> >>> >> That's spooky.  Do you have the full exception for this one?  What
> IO
> >>> >> system are you running on?  (Is it just a local drive on your
> windows
> >>> >> computer?) It's almost as if the IO system is not generating an
> >>> >> IOException to Java when disk fills up.
> >>> >>
> >>> >
> >>> > Index and code are all on a local drive. There is no other exception
> >>> coming
> >>> > back - just what I reported.
> >>>
> >>> But, you didn't report a traceback for this first one?
> >>>
> >>
> >> Yes, I need to add some more printStackTrace calls.
> >>
> >>
> >>>
> >>> >> > I ensured that the disk space was low before updating the
index.
> >>> >>
> >>> >> You mean, to intentionally test the disk-full case?
> >>> >>
> >>> >
> >>> > Yes, that's right.
> >>>
> >>> OK.  Can you turn on IndexWriter's infoStream, get this disk full /
> >>> corruption to happen again, and post back the resulting output?  Make
> >>> sure your index first passes CheckIndex before starting (so we don't
> >>> begin the test w/ any pre-existing index corruption).
> >>>
> >>
> >> Good point about CheckIndex.  I've already found 2 bad ones. I will
> build
> >> new indexes from scratch. This will take a while.
> >>
> >>
> >>> >> > On another occasion, the exception was:
> >>> >> > background merge hit exception: _0:C1080260 _1:C139 _2:C123
> _3:C107
> >>> >> _4:C126
> >>> >> > _5:C121 _6:C126 _7:C711 _8:C116 into _9 [optimize]
> [mergeDocStores]
> >>> >>
> >>> >> In this case, the SegmentMerger was trying to open this segment,
but
> >>> >> on attempting to read the first int from the fdx (fields index)
file
> >>> >> for one of the segments, it hit EOF.
> >>> >>
> >>> >> This is also spooky -- this looks like index corruption, which
> should
> >>> >> never happen on hitting disk full.
> >>> >>
> >>> >
> >>> > That's what I thought, too. Could Lucene be catching the IOException
> and
> >>> > turning it into a different exception?
> >>>
> >>> I think that's unlikely, but I guess possible.  We have "disk full"
> >>> tests in the unit tests, that throw an IOException at different times.
> >>>
> >>> What exact windows version are you using?  The local drive is NTFS?
> >>>
> >>
> >> Windows Server 2003 Enterprise x64 SP2. Local drive is NTFS
> >>
> >>
> >>>
> >>> Mike
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>
> >>>
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message