lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: CheckIndex possibly not detecting/fixing all corruptions?
Date Wed, 30 Jul 2008 10:43:29 GMT

Do you have the exception Luke produced?   That'd be a good clue as to  
what CheckIndex is not detecting.  It's hard for me to tell from that  
GDB trace exactly what's gone wrong...

When you first ran CheckIndex, and it detected one corrupt segment,  
what exception did it report as the cause of the corruption on that  
segment?

If you could upload the index somewhere, that'd be great -- I'll try  
to figure out why CheckIndex fails to detect the corruption.  It's  
particularly odd that you were able to successfully optimize the index.

If you run searches on your index use Lucene java, do you get any odd  
exceptions?

Mike

John O'Brien wrote:

> Hi,
> 	I already posted this question on the CLucene dev list but it was
> suggested that I may be able to get some help on the Java list so  
> here goes.
> We use Clucene 0.9.20 in our search engine. One of the indexes  
> appears to
> have become corrupt (still investigating the cause of the  
> corruption). We
> tried using the Java Lucene CheckIndex tool to fix the corruption(s).
> CheckIndex detected 1 broken segment and removed 162 documents and  
> wrote a
> new segments file. Subsequent runs of CheckIndex report "No problems  
> were
> detected with this index.". However our search engine still crashes  
> when its
> performing a search on the "fixed" index. Here is the relevant  
> backtrace:
>
> #8  0x001c0faa in lucene::index::SegmentTermDocs::read  
> (this=0x664f26c8,
>    docs=0x664f9944, freqs=0x664f99c4, length=32)
>    at CLucene/util/BitSet.h:35
> #9  0x001b4a4a in lucene::index::MultiTermDocs::read (this=0x664e5d38,
>    docs=0x664f9944, freqs=0x664f99c4, length=32)
>    at ../src/CLucene/index/MultiReader.cpp:400
> #10 0x001ef7dc in lucene::search::TermScorer::next (this=0x664f9920)
>    at ../src/CLucene/search/TermScorer.cpp:41
> #11 0x001d2d23 in lucene::search::BooleanScorer::next  
> (this=0x665f2fc0)
>    at ../src/CLucene/search/BooleanScorer.cpp:63
> #12 0x001d2d23 in lucene::search::BooleanScorer::next  
> (this=0x664e5ec8)
>    at ../src/CLucene/search/BooleanScorer.cpp:63
> #13 0x001e3275 in lucene::search::IndexSearcher::_search (
>    this=0x65ba6410, query=0x66734ed8, filter=0x0, nDocs=100)
>    at ../src/CLucene/search/Scorer.h:39
> #14 0x001e1fb0 in lucene::search::Hits::getMoreDocs (this=0x66881440,
>    m=50) at ../src/CLucene/search/Hits.cpp:110
> #15 0x001e1acf in lucene::search::Hits::Hits ()
>    at CLucene/util/PriorityQueue.h:35
>
> We tried to perform a search using Luke tool but this also resulted  
> in an
> error.
> Also tried after optimizing the index db, but the same error persists.
> So it looks like the index db might still be corrupt.
>
> Any ideas as to why CheckIndex appears not to have detected/fixed all
> corruptions?
> Are there any other suggestions as to how to detect/repair index
> corruptions?
>
> Thanks in advance,
> John.
>
> P.s. I still have the index in question before and after running  
> CheckIndex
> fix on it so if anyone's willing to take a look at it I can upload it
> somewhere for you to download.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message