lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: lucene index program won't start after power failure
Date Fri, 23 Sep 2016 18:54:26 GMT
The 'sync' option for an NFS client just means that every write is
sent immediately across the network.  It is a needless performance
loss as long as your app (like Lucene) does the "right thing" with
fsync.
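
For illustration, the commit sequence described in the quoted message
below (write the file, fsync it, atomically rename it, then fsync the
directory) could be sketched roughly like this.  It is only a minimal
sketch in plain java.nio, not Lucene's actual code: the class name,
method name and file names are made up for the example, and the
directory fsync is treated as best effort because some platforms do
not allow opening a directory this way.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only (not part of Lucene).
final class DurableCommitSketch {

    /** Write to a pending file, fsync it, rename it atomically, then fsync the directory. */
    static void commit(Path dir, String pendingName, String finalName, byte[] data)
            throws IOException {
        Path pending = dir.resolve(pendingName);
        Path target = dir.resolve(finalName);

        // 1. Write the bytes and force them (and the file metadata) to storage.
        try (FileChannel ch = FileChannel.open(pending,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // fsync the file
        }

        // 2. Atomic rename: readers see either no final file or the complete one.
        Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE);

        // 3. fsync the directory so the rename itself is durable.
        //    Assumption: opening a directory for READ works on Linux and macOS;
        //    where it does not, this sketch silently skips the step.
        try (FileChannel dirCh = FileChannel.open(dir, StandardOpenOption.READ)) {
            dirCh.force(true);
        } catch (IOException e) {
            // Best effort only in this sketch.
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-sketch");
        commit(dir, "pending_segments_1", "segments_1",
               "example".getBytes(StandardCharsets.UTF_8));
        System.out.println("committed: " + Files.exists(dir.resolve("segments_1")));
    }
}
```

None of this helps if the IO stack underneath acknowledges the fsync
without actually making the bytes durable, which is the open question
here.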

The more important question is why the fsync sent to your NFS client,
and then on to the Mac Mini's NFS server, failed to actually move all
written bytes to durable storage.
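
To see which files lost bytes, you can look at them directly.  As a
rough sketch (not a Lucene tool): index files start with a 4-byte
big-endian codec magic, 1071082519 (0x3fd76c17), and, as far as I know
for the 5.x/6.x formats, end with a 16-byte footer whose first int is
the bitwise complement, -1071082520.  Those are the two "expected"
values in the exceptions quoted below.  The class and method names
here are hypothetical, and the 16-byte footer length is an assumption
about that file format.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical diagnostic, for illustration only (not part of Lucene).
final class HeaderMagicCheck {

    static final int CODEC_MAGIC = 0x3fd76c17;     // 1071082519
    static final int FOOTER_MAGIC = ~CODEC_MAGIC;  // -1071082520
    static final int FOOTER_LENGTH = 16;           // assumed: magic, algorithm id, checksum

    /** Read one big-endian int at the given byte offset. */
    static int readIntAt(Path file, long pos) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(4); // ByteBuffer is big-endian by default
            while (buf.hasRemaining()) {
                if (ch.read(buf, pos + buf.position()) < 0) {
                    throw new EOFException("file too short: " + file);
                }
            }
            buf.flip();
            return buf.getInt();
        }
    }

    static boolean headerLooksValid(Path file) throws IOException {
        return Files.size(file) >= 4 && readIntAt(file, 0) == CODEC_MAGIC;
    }

    static boolean footerLooksValid(Path file) throws IOException {
        long size = Files.size(file);
        return size >= FOOTER_LENGTH
                && readIntAt(file, size - FOOTER_LENGTH) == FOOTER_MAGIC;
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            Path p = Paths.get(arg);
            System.out.println(p + " header ok=" + headerLooksValid(p)
                    + " footer ok=" + footerLooksValid(p));
        }
    }
}
```

A header or footer that reads back as 0 means the file exists with the
right name but its contents never reached the disk, which points at
the IO stack rather than at the bytes Lucene wrote.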

Can you reproduce this issue if you use a more well-trodden IO system,
e.g. Linux with ext4 on a local IO device?

Mike McCandless

http://blog.mikemccandless.com

On Fri, Sep 23, 2016 at 12:00 AM, Ziming Dong <dzm1016397507@gmail.com> wrote:
> I use the Mac Mini on the NFS server side. It seems the mount option
> sync is useless; it just slows down the index program.
>
> On Fri, Sep 23, 2016 at 4:43 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>> OK sorry I meant your first index, and it seems to have only one
>> (broken) segments file.  Can you post the "ls -l" output of that first
>> index?  It looks like the file was (illegally) filled with 0s, or at
>> least the first 4 bytes were.
>>
>> Lucene writes this file, fsyncs it, does an atomic rename, and fsyncs
>> the directory, so this should not happen, if your IO system honors
>> fsync.
>>
>> What IO devices are used by the NFS server?
>>
>> NFS is not well tested and has several known problems with Lucene so
>> this is already risky ground...
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Thu, Sep 22, 2016 at 11:33 AM, Ziming Dong <dzm1016397507@gmail.com>
>> wrote:
>> > The second index was recovered by CheckIndex; I don't know what was
>> > in the second index directory before the recovery.
>> > CheckIndex can't read the first index. The index filenames are attached.
>> > I used Lucene 6.0.0 at the beginning, then upgraded to Lucene 6.1.0
>> > to continue indexing.
>> >
>> > On Thu, Sep 22, 2016 at 10:17 PM, Michael McCandless
>> > <lucene@mikemccandless.com> wrote:
>> >>
>> >> Do you have 2 separate segments files in that 2nd index?
>> >>
>> >> Which exact Lucene version is this?
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >>
>> >> On Thu, Sep 22, 2016 at 7:44 AM, Ziming Dong <dzm1016397507@gmail.com>
>> >> wrote:
>> >> > I used CheckIndex to recover the second index, though I lost many
>> >> > docs in it, but the first index can't be read by CheckIndex. The
>> >> > error is
>> >> >
>> >> >> java -cp lucene-core-6.1.0.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /Volumes/HPT8_56T/infomall-index/index0
>> >> >> Opening index @ /Volumes/HPT8_56T/infomall-index/index0
>> >> >> ERROR: could not read any segments file in directory
>> >> >> org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/Volumes/HPT8_56T/infomall-index/index0/segments_5t3"))): 0 (needs to be between 1071082519 and 1071082519). This version of Lucene only supports indexes created with release 5.0 and later.
>> >> >>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:295)
>> >> >>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:284)
>> >> >>         at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:507)
>> >> >>         at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:2595)
>> >> >>         at org.apache.lucene.index.CheckIndex.doMain(CheckIndex.java:2497)
>> >> >>         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2423)
>> >> >
>> >> >
>> >> > I use NFS, but I set the mount options as: mount -t nfs -o tcp,sync,retrans=10
>> >> > The index program had run for 1 month without any problem before the
>> >> > power failure.
>> >> >
>> >> > On Thu, Sep 22, 2016 at 6:06 PM, Michael McCandless
>> >> > <lucene@mikemccandless.com> wrote:
>> >> >>
>> >> >> Hmm I'm no longer so sure this is an IW bug: on commit we fsync the
>> >> >> pending_segments_N and then do an atomic rename to segments_N.
>> >> >>
>> >> >> Can you describe your IO system?  Is it possible it does not
>> >> >> implement fsync or atomic renames correctly?
>> >> >>
>> >> >> Also, your 2nd exception indicates the segments_N file was intact
>> >> >> but the .cfs file was corrupt, which is also hard to explain unless
>> >> >> fsync isn't working on your IO system.
>> >> >>
>> >> >> Mike McCandless
>> >> >>
>> >> >> http://blog.mikemccandless.com
>> >> >>
>> >> >> On Thu, Sep 22, 2016 at 5:10 AM, Michael McCandless
>> >> >> <lucene@mikemccandless.com> wrote:
>> >> >> > Sorry for the slow reply here.  Curious that both of these
>> >> >> > exceptions are from IW.init.  I think this may be a real bug,
>> >> >> > caused by this:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > https://github.com/apache/lucene-solr/commit/981bfba841144d08df1d1a183d39fcd6f195ad56
>> >> >> >
>> >> >> > I'll see if I can make a standalone test case showing this.
>> >> >> >
>> >> >> > If you open those indices with an IndexReader instead, does it
>> >> >> > succeed?
>> >> >> >
>> >> >> > If you run CheckIndex, what does it report?
>> >> >> >
>> >> >> > Mike McCandless
>> >> >> >
>> >> >> > http://blog.mikemccandless.com
>> >> >> >
>> >> >> > On Wed, Sep 14, 2016 at 1:22 AM, Ziming Dong
>> >> >> > <dzm1016397507@gmail.com>
>> >> >> > wrote:
>> >> >> >> I have 6 machines and 6 index directories; each machine builds
>> >> >> >> its index into one index directory. After a power failure last
>> >> >> >> night, two of those machines can't start the index program.
>> >> >> >>
>> >> >> >> one error is
>> >> >> >>
>> >> >> >>> INFO: 2016-09-14 12:31:38 [main] sewm.bdbox.search.InfomallIndexer$Builder:ignoreCollectionsFile(227): Loaded 2146 ignored collections from /mnt/HPT8_56T/infomall-index/index0/ignored_collections.txt
>> >> >> >>> ERROR: 2016-09-14 12:31:39 [main] sewm.bdbox.util.LogUtil:error(71): org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/HPT8_56T/infomall-index/index0/segments_5t3"))): 0 (needs to be between 1071082519 and 1071082519). This version of Lucene only supports indexes created with release 5.0 and later.
>> >> >> >>>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:295)
>> >> >> >>>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:284)
>> >> >> >>>         at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:910)
>> >> >> >>>         at sewm.bdbox.search.InfomallIndexer.<init>(InfomallIndexer.java:60)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:28)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:21)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer$Builder.build(ThreadedInfomallIndexer.java:72)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.main(ThreadedInfomallIndexer.java:129)
>> >> >> >>
>> >> >> >>
>> >> >> >> another is
>> >> >> >>
>> >> >> >>> INFO: 2016-09-14 01:11:06 [main] sewm.bdbox.search.InfomallIndexer$Builder:ignoreCollectionsFile(227): Loaded 8575 ignored collections from /mnt/HPT8/infomall-index/index5/ignored_collections.txt
>> >> >> >>> ERROR: 2016-09-14 01:11:09 [main] sewm.bdbox.util.LogUtil:error(71): org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=MMapIndexInput(path="/mnt/HPT8/infomall-index/index5/_1kqn.cfs"))
>> >> >> >>>         at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:448)
>> >> >> >>>         at org.apache.lucene.codecs.CodecUtil.retrieveChecksum(CodecUtil.java:433)
>> >> >> >>>         at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.<init>(Lucene50CompoundReader.java:86)
>> >> >> >>>         at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:71)
>> >> >> >>>         at org.apache.lucene.index.IndexWriter.readFieldInfos(IndexWriter.java:1016)
>> >> >> >>>         at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:1033)
>> >> >> >>>         at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
>> >> >> >>>         at sewm.bdbox.search.InfomallIndexer.<init>(InfomallIndexer.java:60)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:28)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:21)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer$Builder.build(ThreadedInfomallIndexer.java:72)
>> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.main(ThreadedInfomallIndexer.java:129)
>> >> >> >>
>> >> >> >>
>> >> >> >> it seems 1071082519 is a special number.
>> >> >> >>
>> >> >> >> - -
>> >> >> >>
>> >> >> >> Ziming Dong
>> >> >> >> http://suiyuan2009.github.io/
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > Ziming Dong
>> >> > http://suiyuan2009.github.io/
>> >> >
>> >
>> >
>> >
>> >
>> > --
>> >
>> > Ziming Dong
>> > http://suiyuan2009.github.io/
>> >
>
>
>
>
> --
>
> Ziming Dong
> http://suiyuan2009.github.io/
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

