lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bogdan Ghidireac" <bog...@ecstend.com>
Subject Re: CheckIndex tool issues
Date Wed, 28 Nov 2007 13:50:41 GMT
Great, everything runs fine now.. Thank you.

Bogdan

On 11/27/07, Michael McCandless <lucene@mikemccandless.com> wrote:
>
>
> OK I opened this JIRA issue to track this:
>
>   https://issues.apache.org/jira/browse/LUCENE-1069
>
> Mike
>
> "Michael McCandless" <lucene@mikemccandless.com> wrote:
> >
> > Woops!  You are right, this is a silly bug in the CheckIndex tool.  It
> is not
> > properly taking into account deletions.  I will open an issue & fix it.
> >
> > Thanks for testing & reporting this, and sorry about that.
> >
> > Mike
> >
> > "Bogdan Ghidireac" <bogdan@ecstend.com> wrote:
> > > Hi,
> > >
> > > I tried to use the CheckIndex tool (the latest svn code) and I was
> > > surprised
> > > to notice that all my indexes from production (around 30) are corrupt.
> > > This
> > > is highly unlikely because they were running for about one year and I
> had
> > > no
> > > exception during search so far.
> > >
> > > One recurring pattern I observed is that the tool reports the segments
> > > with
> > > deleted docs as corrupt. The one without deleted docs are fine.. Here
> is
> > > a
> > > sample output.
> > >
> > > index 1
> > >
> > >   6 of 7: name=_wxlp docCount=1001
> > >     compound=true
> > >     numFiles=1
> > >     size (MB)=0.213
> > >     no deletions
> > >     test: open reader.........OK
> > >     test: fields, norms.......OK [12 fields]
> > >     test: terms, freq, prox...OK [4142 terms; 8004 terms/docs pairs;
> 8006
> > > tokens]
> > >     test: stored fields.......OK [12012 total field count; avg 12
> fields
> > >     per
> > > doc]
> > >     test: term vectors........OK [0 total vector count; avg 0
> term/freq
> > > vector fields per doc]
> > >
> > >   7 of 7: name=_wxqg docCount=178
> > >     compound=true
> > >     numFiles=1
> > >     size (MB)=0.039
> > >     no deletions
> > >     test: open reader.........OK
> > >     test: fields, norms.......OK [12 fields]
> > >     test: terms, freq, prox...OK [819 terms; 1417 terms/docs pairs;
> 1417
> > > tokens]
> > >     test: stored fields.......OK [2136 total field count; avg 12
> fields
> > >     per
> > > doc]
> > >     test: term vectors........OK [0 total vector count; avg 0
> term/freq
> > > vector fields per doc]
> > >
> > > index 2
> > >
> > >   6 of 7: name=_10hr docCount=1978
> > >     compound=true
> > >     numFiles=2
> > >     size (MB)=3.601
> > >     has deletions [delFileName=_10hr_5.del]
> > >     test: open reader.........OK [17 deleted docs]
> > >     test: fields, norms.......OK [10 fields]
> > >     test: terms, freq, prox...FAILED
> > >     WARNING: would remove reference to this segment (-fix was not
> > > specified); full exception:
> > > java.lang.RuntimeException: term ASIN:342678033X docFreq=5 != num docs
> > > seen
> > > 4
> > >         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java
> :217)
> > >
> > >   7 of 7: name=_10i0 docCount=196
> > >     compound=true
> > >     numFiles=1
> > >     size (MB)=0.44
> > >     no deletions
> > >     test: open reader.........OK
> > >     test: fields, norms.......OK [10 fields]
> > >     test: terms, freq, prox...OK [8611 terms; 24307 terms/docs pairs;
> > >     32841
> > > tokens]
> > >     test: stored fields.......OK [1960 total field count; avg 10
> fields
> > >     per
> > > doc]
> > >     test: term vectors........OK [0 total vector count; avg 0
> term/freq
> > > vector fields per doc]
> > >
> > >
> > > Is this a known issue or my indexes are really corrupt ?
> > >
> > > Regards,
> > > Bogdan
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message