lucene-java-user mailing list archives

From Ziming Dong <dzm1016397...@gmail.com>
Subject Re: lucene index program won't start after power failure
Date Thu, 29 Sep 2016 09:31:33 GMT
Do you mean `http://www.getopt.org/luke/`?

On Mon, Sep 26, 2016 at 4:58 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> It is in theory possible to reconstruct a segments file by ls-ing all
> other index files and manually rebuilding it but it is not an easy
> task and it would have to make some guesses.
>
> I think in the past a user did manage to create such a tool and maybe
> posted the results here either on this list or the dev list?
>
> The segments file is a vital file to the index.  It holds all metadata
> about the index segments.  This is why Lucene is so careful about
> writing a new one to a "pending" file, fsync'ing that, fsyncing the
> directory, and doing an atomic rename, all before removing the older
> segment files.
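
The commit sequence described above can be sketched in plain Java NIO. This is only an illustration of the general pattern (write to a pending file, fsync it, rename atomically, fsync the directory), not Lucene's actual code; the class name, method names and the "pending_" prefix are made up for the example, and the order of the last two steps is one reasonable reading of the description. Directory fsync through a FileChannel is assumed to work on Linux and macOS; where it is not supported the sketch silently skips it.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class DurableCommitSketch {

    /** Asks the OS to flush a regular file to the storage device. */
    static void fsyncFile(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.force(true); // true = flush data and metadata
        }
    }

    /** Best-effort directory fsync, so that the rename itself becomes durable. */
    static void fsyncDirectory(Path dir) {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            // Some platforms cannot open a directory this way; ignored in this sketch.
        }
    }

    /** Hypothetical helper: publishes payload under finalName without exposing a partial file. */
    static void commit(Path dir, String finalName, byte[] payload) throws IOException {
        Path pending = dir.resolve("pending_" + finalName);
        Path target = dir.resolve(finalName);
        Files.write(pending, payload);                               // 1. write the pending file
        fsyncFile(pending);                                          // 2. fsync its contents
        Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE); // 3. atomic rename
        fsyncDirectory(dir);                                         // 4. fsync the directory
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-sketch");
        commit(dir, "segments_1", new byte[] {1, 2, 3});
        System.out.println(Files.exists(dir.resolve("segments_1")));
    }
}
```

All of this only helps if every layer underneath (NFS client, NFS server, disk controller) really honors the fsync request, which is the open question in this thread.
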
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sun, Sep 25, 2016 at 10:37 AM, Ziming Dong <dzm1016397507@gmail.com>
> wrote:
> > Sorry to resend.
> > I'll change IO to local. Is there any way to recover the first index? Now it
> > cannot be opened by checkIndex. We are building an index of 7 billion
> > webpages, and it costs much time to rebuild.
> >
> > On Sun, Sep 25, 2016 at 5:31 PM, Ziming Dong <dzm1016397507@gmail.com>
> > wrote:
> >>
> >> I'll change IO to local. Is there anyway to recover first index? now it
> >> can be opened by checkIndex, we are building index of 7 billion webpages,
> >> it costs much time to rebuild.
> >>
> >> On Sat, Sep 24, 2016 at 2:54 AM, Michael McCandless
> >> <lucene@mikemccandless.com> wrote:
> >>>
> >>> The 'sync' option for an NFS client just means that every write is
> >>> sent immediately across the network.  And it really is useless
> >>> performance loss as long as your app (like Lucene) does the "right
> >>> thing" with fsync.
> >>>
> >>> The more important question is why fsync sent to your NFS client and
> >>> then to the Mac Mini's NFS server failed to actually move all written
> >>> bytes to durable storage.
> >>>
> >>> Can you reproduce this issue if you use a more well trodden IO system,
> >>> e.g. Linux with ext4 on a local IO device?
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>>
> >>> On Fri, Sep 23, 2016 at 12:00 AM, Ziming Dong <dzm1016397507@gmail.com>
> >>> wrote:
> >>> > I use a Mac Mini on the NFS server side. It seems the mount option sync
> >>> > is useless; it just slows down the index program.
> >>> >
> >>> > On Fri, Sep 23, 2016 at 4:43 AM, Michael McCandless
> >>> > <lucene@mikemccandless.com> wrote:
> >>> >>
> >>> >> OK sorry I meant your first index, and it seems to have only one
> >>> >> (broken) segments file.  Can you post the "ls -l" output of that first
> >>> >> index?  It looks like the file was (illegally) filled with 0s, or at
> >>> >> least the first 4 bytes were.
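
A quick way to check the "first 4 bytes" observation without Lucene on the classpath is to read the leading big-endian int of the segments file and compare it with the codec header magic. The sketch below assumes the value 0x3fd76c17 (Lucene's CodecUtil.CODEC_MAGIC, which is 1071082519 in decimal, the number reported in the IndexFormatTooOldException in this thread); the class and method names are invented for the example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeaderMagicCheck {

    // Assumed to match Lucene's CodecUtil.CODEC_MAGIC (decimal 1071082519).
    static final int CODEC_MAGIC = 0x3fd76c17;

    /** Reads the first 4 bytes as a big-endian int; throws EOFException if the file is shorter. */
    static int readFirstInt(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            return in.readInt();
        }
    }

    /** True if the file starts with the codec header magic. */
    static boolean hasCodecMagic(Path file) throws IOException {
        return readFirstInt(file) == CODEC_MAGIC;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java HeaderMagicCheck /path/to/index/segments_N
        Path file = Paths.get(args[0]);
        int first = readFirstInt(file);
        System.out.println("first int = " + first + ", looks like a codec header: " + (first == CODEC_MAGIC));
    }
}
```

A result of 0 here would be consistent with a file whose beginning was never written to stable storage.
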
> >>> >>
> >>> >> Lucene writes this file, fsyncs it, does an atomic rename, and fsyncs
> >>> >> the directory, so this should not happen, if your IO system honors
> >>> >> fsync.
> >>> >>
> >>> >> What IO devices are used by the NFS server?
> >>> >>
> >>> >> NFS is not well tested and has several known problems with Lucene so
> >>> >> this is already risky ground...
> >>> >>
> >>> >> Mike McCandless
> >>> >>
> >>> >> http://blog.mikemccandless.com
> >>> >>
> >>> >> On Thu, Sep 22, 2016 at 11:33 AM, Ziming Dong
> >>> >> <dzm1016397507@gmail.com>
> >>> >> wrote:
> >>> >> > The second index was recovered by checkIndex; I don't know what was in
> >>> >> > the second index directory before recovery.
> >>> >> > checkIndex can't read the first index. Index filenames are attached.
> >>> >> > I used Lucene 6.0.0 at the beginning, then upgraded to Lucene 6.1.0 to
> >>> >> > continue indexing.
> >>> >> >
> >>> >> > On Thu, Sep 22, 2016 at 10:17 PM, Michael McCandless
> >>> >> > <lucene@mikemccandless.com> wrote:
> >>> >> >>
> >>> >> >> Do you have 2 separate segments files in that 2nd index?
> >>> >> >>
> >>> >> >> Which exact Lucene version is this?
> >>> >> >>
> >>> >> >> Mike McCandless
> >>> >> >>
> >>> >> >> http://blog.mikemccandless.com
> >>> >> >>
> >>> >> >>
> >>> >> >> On Thu, Sep 22, 2016 at 7:44 AM, Ziming Dong
> >>> >> >> <dzm1016397507@gmail.com>
> >>> >> >> wrote:
> >>> >> >> > I used checkIndex to recover the second index, though I lost many docs
> >>> >> >> > in the index, but the first index can't be read by checkIndex. The
> >>> >> >> > error is
> >>> >> >> >
> >>> >> >> >> java -cp lucene-core-6.1.0.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /Volumes/HPT8_56T/infomall-index/index0
> >>> >> >> >> Opening index @ /Volumes/HPT8_56T/infomall-index/index0
> >>> >> >> >> ERROR: could not read any segments file in directory
> >>> >> >> >> org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/Volumes/HPT8_56T/infomall-index/index0/segments_5t3"))): 0 (needs to be between 1071082519 and 1071082519). This version of Lucene only supports indexes created with release 5.0 and later.
> >>> >> >> >>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:295)
> >>> >> >> >>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:284)
> >>> >> >> >>         at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:507)
> >>> >> >> >>         at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:2595)
> >>> >> >> >>         at org.apache.lucene.index.CheckIndex.doMain(CheckIndex.java:2497)
> >>> >> >> >>         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2423)
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > I use NFS, but I set the mount options as
> >>> >> >> > mount -t nfs -o tcp,sync,retrans=10
> >>> >> >> > The index program had run for 1 month without any problem before the
> >>> >> >> > power failure.
> >>> >> >> >
> >>> >> >> > On Thu, Sep 22, 2016 at 6:06 PM, Michael McCandless
> >>> >> >> > <lucene@mikemccandless.com> wrote:
> >>> >> >> >>
> >>> >> >> >> Hmm I'm no longer so sure this is an IW bug: on commit we fsync the
> >>> >> >> >> pending_segments_N and then do an atomic rename to segments_N.
> >>> >> >> >>
> >>> >> >> >> Can you describe your IO system?  Is it possible it does not implement
> >>> >> >> >> fsync or atomic renames correctly?
> >>> >> >> >>
> >>> >> >> >> Also, your 2nd exception indicates the segments_N file was intact but
> >>> >> >> >> the .cfs file was corrupt, which is also hard to explain unless fsync
> >>> >> >> >> isn't working on your IO system.
> >>> >> >> >>
> >>> >> >> >> Mike McCandless
> >>> >> >> >>
> >>> >> >> >> http://blog.mikemccandless.com
> >>> >> >> >>
> >>> >> >> >> On Thu, Sep 22, 2016 at 5:10 AM, Michael McCandless
> >>> >> >> >> <lucene@mikemccandless.com> wrote:
> >>> >> >> >> > Sorry for the slow reply here.  Curious that both of these exceptions
> >>> >> >> >> > are from IW.init.  I think this may be a real bug, caused by this:
> >>> >> >> >> >
> >>> >> >> >> > https://github.com/apache/lucene-solr/commit/981bfba841144d08df1d1a183d39fcd6f195ad56
> >>> >> >> >> >
> >>> >> >> >> > I'll see if I can make a standalone test case showing this.
> >>> >> >> >> >
> >>> >> >> >> > If you open those indices with an IndexReader instead, does it
> >>> >> >> >> > succeed?
> >>> >> >> >> >
> >>> >> >> >> > If you run CheckIndex, what does it report?
> >>> >> >> >> >
> >>> >> >> >> > Mike McCandless
> >>> >> >> >> >
> >>> >> >> >> > http://blog.mikemccandless.com
> >>> >> >> >> >
> >>> >> >> >> > On Wed, Sep 14, 2016 at 1:22 AM, Ziming Dong
> >>> >> >> >> > <dzm1016397507@gmail.com> wrote:
> >>> >> >> >> >> I have 6 machines and 6 index directories; each machine builds its
> >>> >> >> >> >> index into one index directory. After a power failure last night, two
> >>> >> >> >> >> of those machines can't start the index program.
> >>> >> >> >> >>
> >>> >> >> >> >> One error is
> >>> >> >> >> >>
> >>> >> >> >> >>> INFO: 2016-09-14 12:31:38 [main] sewm.bdbox.search.InfomallIndexer$Builder:ignoreCollectionsFile(227): Loaded 2146 ignored collections from /mnt/HPT8_56T/infomall-index/index0/ignored_collections.txt
> >>> >> >> >> >>> ERROR: 2016-09-14 12:31:39 [main] sewm.bdbox.util.LogUtil:error(71): org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/HPT8_56T/infomall-index/index0/segments_5t3"))): 0 (needs to be between 1071082519 and 1071082519). This version of Lucene only supports indexes created with release 5.0 and later.
> >>> >> >> >> >>>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:295)
> >>> >> >> >> >>>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:284)
> >>> >> >> >> >>>         at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:910)
> >>> >> >> >> >>>         at sewm.bdbox.search.InfomallIndexer.<init>(InfomallIndexer.java:60)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:28)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:21)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer$Builder.build(ThreadedInfomallIndexer.java:72)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.main(ThreadedInfomallIndexer.java:129)
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> another is
> >>> >> >> >> >>
> >>> >> >> >> >>> INFO: 2016-09-14 01:11:06 [main] sewm.bdbox.search.InfomallIndexer$Builder:ignoreCollectionsFile(227): Loaded 8575 ignored collections from /mnt/HPT8/infomall-index/index5/ignored_collections.txt
> >>> >> >> >> >>> ERROR: 2016-09-14 01:11:09 [main] sewm.bdbox.util.LogUtil:error(71): org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=MMapIndexInput(path="/mnt/HPT8/infomall-index/index5/_1kqn.cfs"))
> >>> >> >> >> >>>         at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:448)
> >>> >> >> >> >>>         at org.apache.lucene.codecs.CodecUtil.retrieveChecksum(CodecUtil.java:433)
> >>> >> >> >> >>>         at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.<init>(Lucene50CompoundReader.java:86)
> >>> >> >> >> >>>         at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:71)
> >>> >> >> >> >>>         at org.apache.lucene.index.IndexWriter.readFieldInfos(IndexWriter.java:1016)
> >>> >> >> >> >>>         at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:1033)
> >>> >> >> >> >>>         at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
> >>> >> >> >> >>>         at sewm.bdbox.search.InfomallIndexer.<init>(InfomallIndexer.java:60)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:28)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.<init>(ThreadedInfomallIndexer.java:21)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer$Builder.build(ThreadedInfomallIndexer.java:72)
> >>> >> >> >> >>>         at sewm.bdbox.search.ThreadedInfomallIndexer.main(ThreadedInfomallIndexer.java:129)
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> it seems 1071082519 is a special number.
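
On the "special number": 1071082519 is 0x3fd76c17, the magic int Lucene writes at the start of every codec header, and the -1071082520 in the second exception is its bitwise complement, written at the start of the 16-byte footer (magic, algorithm id, 8-byte checksum). A minimal sketch that shows the arithmetic and reads the footer magic of a file follows; the constants are assumed to match CodecUtil in Lucene 6.x, and the class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class FooterMagicCheck {

    // Assumed to match Lucene's CodecUtil constants.
    static final int CODEC_MAGIC = 0x3fd76c17;    // 1071082519
    static final int FOOTER_MAGIC = ~CODEC_MAGIC; // -1071082520
    static final int FOOTER_LENGTH = 16;          // magic (4) + algorithm id (4) + checksum (8)

    /** Reads the big-endian int that starts the last 16 bytes of the file. */
    static int readFooterMagic(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            if (raf.length() < FOOTER_LENGTH) {
                throw new IOException("file is shorter than a codec footer: " + path);
            }
            raf.seek(raf.length() - FOOTER_LENGTH);
            return raf.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(CODEC_MAGIC + " " + FOOTER_MAGIC);
        if (args.length > 0) {
            // Usage: java FooterMagicCheck /path/to/index/_1kqn.cfs
            int actual = readFooterMagic(args[0]);
            System.out.println("actual footer=" + actual + " vs expected footer=" + FOOTER_MAGIC);
        }
    }
}
```

An "actual footer=0" therefore means the tail of the file holds zero bytes where the footer should be, which fits a file whose last blocks never reached the disk.
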
> >>> >> >> >> >>
> >>> >> >> >> >> - -
> >>> >> >> >> >>
> >>> >> >> >> >> Ziming Dong
> >>> >> >> >> >> *http://suiyuan2009.github.io/
> >>> >> >> >> >> <http://suiyuan2009.github.io/>*
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > --
> >>> >> >> >
> >>> >> >> > Ziming Dong
> >>> >> >> > http://suiyuan2009.github.io/
> >>> >> >> >
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > --
> >>> >> >
> >>> >> > Ziming Dong
> >>> >> > http://suiyuan2009.github.io/
> >>> >> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> >
> >>> > Ziming Dong
> >>> > http://suiyuan2009.github.io/
> >>> >
> >>
> >>
> >>
> >>
> >> --
> >>
> >> Ziming Dong
> >> http://suiyuan2009.github.io/
> >>
> >
> >
> >
> > --
> >
> > Ziming Dong
> > http://suiyuan2009.github.io/
> >
>



-- 

Ziming Dong
*http://suiyuan2009.github.io/ <http://suiyuan2009.github.io/>*
