lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: CheckIndex possibly not detecting/fixing all corruptions?
Date Thu, 14 Aug 2008 10:15:04 GMT
OK thanks for bringing closure to this John, and good luck tracking it down.

Mike

John O'Brien <john.obrien@criticalpath.net> wrote:
> Hi Mike,
>        Apologies for the delay in getting back.
> I have since figured out that the reason Luke gave an error when we searched
> on the "fixed" index was (possibly) because it was a really old version (0.6
> 2005/02/16) - I tried again with v 0.8.1 (2008-02-13) and Luke can search on
> the "fixed" index just fine.
>
> When I first ran CheckIndex the exception was as follows:
>
> java.lang.RuntimeException: term body:03: doc -2147483645 < lastDoc 44
>        at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:195)
>        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:373)
>
> WARNING: 1 broken segments detected
> WARNING: 162 documents will be lost
>
> So for now I think the problem may lie in CLucene (or possibly our use of
> it) rather than CheckIndex not detecting and repairing properly.
> I will follow up with an update if/when I figure it out!
> In the meantime if you have any ideas as to how/why a corruption of this
> nature can occur I'd be glad to hear them.
>
> Thanks for your help,
> John.
>
> -----Original Message-----
> From Michael McCandless <luc...@mikemccandless.com>
> Subject Re: CheckIndex possibly not detecting/fixing all corruptions?
> Date Wed, 30 Jul 2008 10:43:29 GMT
>
> Do you have the exception Luke produced?   That'd be a good clue as to
> what CheckIndex is not detecting.  It's hard for me to tell from that
> GDB trace exactly what's gone wrong...
>
> When you first ran CheckIndex, and it detected one corrupt segment,
> what exception did it report as the cause of the corruption on that
> segment?
>
> If you could upload the index somewhere, that'd be great -- I'll try
> to figure out why CheckIndex fails to detect the corruption.  It's
> particularly odd that you were able to successfully optimize the index.
>
> If you run searches on your index use Lucene java, do you get any odd
> exceptions?
>
> Mike
>
>
> -----Original Message-----
> From: John O'Brien [mailto:john.obrien@criticalpath.net]
> Sent: 30 July 2008 11:22
> To: 'java-user@lucene.apache.org'
> Subject: CheckIndex possibly not detecting/fixing all corruptions?
>
>
> Hi,
>        I already posted this question on the CLucene dev list but it was
> suggested that I may be able to get some help on the Java list so here goes.
> We use Clucene 0.9.20 in our search engine. One of the indexes appears to
> have become corrupt (still investigating the cause of the corruption). We
> tried using the Java Lucene CheckIndex tool to fix the corruption(s).
> CheckIndex detected 1 broken segment and removed 162 documents and wrote a
> new segments file. Subsequent runs of CheckIndex report "No problems were
> detected with this index.". However our search engine still crashes when its
> performing a search on the "fixed" index. Here is the relevant backtrace:
>
> #8  0x001c0faa in lucene::index::SegmentTermDocs::read (this=0x664f26c8,
>    docs=0x664f9944, freqs=0x664f99c4, length=32)
>    at CLucene/util/BitSet.h:35
> #9  0x001b4a4a in lucene::index::MultiTermDocs::read (this=0x664e5d38,
>    docs=0x664f9944, freqs=0x664f99c4, length=32)
>    at ../src/CLucene/index/MultiReader.cpp:400
> #10 0x001ef7dc in lucene::search::TermScorer::next (this=0x664f9920)
>    at ../src/CLucene/search/TermScorer.cpp:41
> #11 0x001d2d23 in lucene::search::BooleanScorer::next (this=0x665f2fc0)
>    at ../src/CLucene/search/BooleanScorer.cpp:63
> #12 0x001d2d23 in lucene::search::BooleanScorer::next (this=0x664e5ec8)
>    at ../src/CLucene/search/BooleanScorer.cpp:63
> #13 0x001e3275 in lucene::search::IndexSearcher::_search (
>    this=0x65ba6410, query=0x66734ed8, filter=0x0, nDocs=100)
>    at ../src/CLucene/search/Scorer.h:39
> #14 0x001e1fb0 in lucene::search::Hits::getMoreDocs (this=0x66881440,
>    m=50) at ../src/CLucene/search/Hits.cpp:110
> #15 0x001e1acf in lucene::search::Hits::Hits ()
>    at CLucene/util/PriorityQueue.h:35
>
> We tried to perform a search using Luke tool but this also resulted in an
> error.
> Also tried after optimizing the index db, but the same error persists.
> So it looks like the index db might still be corrupt.
>
> Any ideas as to why CheckIndex appears not to have detected/fixed all
> corruptions?
> Are there any other suggestions as to how to detect/repair index
> corruptions?
>
> Thanks in advance,
> John.
>
> P.s. I still have the index in question before and after running CheckIndex
> fix on it so if anyone's willing to take a look at it I can upload it
> somewhere for you to download.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message