lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paul Smith (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1282) Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene
Date Sun, 11 May 2008 22:41:55 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595946#action_12595946
] 

Paul Smith commented on LUCENE-1282:
------------------------------------

Another workaround might be to use '-client' instead of the default '-server' (for server
class machines).  This affects a few things, not least this switch:

-XX:CompileThreshold=10000 	Number of method invocations/branches before compiling [-client:
1,500]

-server implies a 10000 value.  I have personally observed similar behaviour like problems
like the above with -server, and usually -client ends up 'solving' them.

I'm sure there was also a way to mark a method to not jit compile too (rather than resort
to -Xint which disables i for everything), but now I cant' find what that syntax is at all.

> Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene
> ------------------------------------------------------
>
>                 Key: LUCENE-1282
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1282
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.3, 2.3.1
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.4
>
>
> This is not a Lucene bug.  It's an as-yet not fully characterized Sun
> JRE bug, as best I can tell.  I'm opening this to gather all things we
> know, and to work around it in Lucene if possible, and maybe open an
> issue with Sun if we can reduce it to a compact test case.
> It's hit at least 3 users:
>   http://mail-archives.apache.org/mod_mbox/lucene-java-user/200803.mbox/%3c8c4e68610803180438x39737565q9f97b4802ed774a5@mail.gmail.com%3e
>   http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200804.mbox/%3c4807654E.7050900@virginia.edu%3e
>   http://mail-archives.apache.org/mod_mbox/lucene-java-user/200805.mbox/%3c733777220805060156t7fdb8fectf0bc984fbfe48a22@mail.gmail.com%3e
> It's specific to at least JRE 1.6.0_04 and 1.6.0_05, that affects
> Lucene.  Whereas 1.6.0_03 works OK and it's unknown whether 1.6.0_06
> shows it.
> The bug affects bulk merging of stored fields.  When it strikes, the
> segment produced by a merge is corrupt because its fdx file (stored
> fields index file) is missing one document.  After iterating many
> times with the first user that hit this, adding diagnostics &
> assertions, its seems that a call to fieldsWriter.addDocument some
> either fails to run entirely, or, fails to invoke its call to
> indexStream.writeLong.  It's as if when hotspot compiles a method,
> there's some sort of race condition in cutting over to the compiled
> code whereby a single method call fails to be invoked (speculation).
> Unfortunately, this corruption is silent when it occurs and only later
> detected when a merge tries to merge the bad segment, or an
> IndexReader tries to open it.  Here's a typical merge exception:
> {code}
> Exception in thread "Thread-10" 
> org.apache.lucene.index.MergePolicy$MergeException: 
> org.apache.lucene.index.CorruptIndexException:
>     doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows
16000
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
> Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
_3gh: fieldsReader shows 15999 but segmentInfo shows 16000
>         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
> {code}
> and here's a typical exception hit when opening a searcher:
> {code}
> org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _kk: fieldsReader
shows 72670 but segmentInfo shows 72671
>         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:230)
>         at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:73)
>         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
>         at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
>         at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
>         at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:48)
> {code}
> Sometimes, adding -Xbatch (forces up front compilation) or -Xint
> (disables compilation) to the java command line works around the
> issue.
> Here are some of the OS's we've seen the failure on:
> {code}
> SuSE 10.0
> Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64 
> x86_64 x86_64 GNU/Linux 
> SuSE 8.2
> Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686 
> unknown unknown GNU/Linux 
> Red Hat Enterprise Linux Server release 5.1 (Tikanga)
> Linux lab8.betech.virginia.edu 2.6.18-53.1.14.el5 #1 SMP Tue Feb 19 
> 07:18:21 EST 2008 i686 i686 i386 GNU/Linux
> {code}
> I've already added assertions to Lucene to detect when this bug
> strikes, but since assertions are not usually enabled, I plan to add a
> real check to catch when this bug strikes *before* we commit the merge
> to the index.  This way we can detect & quarantine the failure and
> prevent corruption from entering the index.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message