lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-1282) Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene
Date Mon, 12 May 2008 00:18:52 GMT
From what I read, -Xint slows you down so much that it's not much of a
workaround.

Here are a couple of examples of the method-exclude syntax (I had to use it
recently with Eclipse):
-XX:CompileCommand=exclude,org/apache/lucene/index/IndexReader\$1,doBody
-XX:CompileCommand=exclude,org/eclipse/core/internal/dtree/DataTreeNode,forwardDeltaWith
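If you need to generate these options for several methods, here is a minimal sketch. The class and method names (ExcludeFlag, excludeFlag) are made up for illustration; the only thing taken from the examples above is the option layout: exclude, then the slash-separated class name, then the method name.

```java
public class ExcludeFlag {

    // Builds a HotSpot "-XX:CompileCommand=exclude" option for one method.
    // className is the binary class name, e.g. "org.apache.lucene.index.IndexReader$1";
    // dots are converted to slashes to match the examples above.
    static String excludeFlag(String className, String methodName) {
        return "-XX:CompileCommand=exclude,"
                + className.replace('.', '/')
                + "," + methodName;
    }

    public static void main(String[] args) {
        // Anonymous inner classes carry a "$1"-style suffix in their binary name.
        System.out.println(excludeFlag("org.apache.lucene.index.IndexReader$1", "doBody"));
        System.out.println(excludeFlag("org.eclipse.core.internal.dtree.DataTreeNode",
                "forwardDeltaWith"));
    }
}
```

When pasting the result into a shell command line, the "$" most likely needs escaping (or the whole option needs single quotes) so the shell does not expand $1; as far as I can tell that is what the backslash in the first example is for.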

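Since the issue below names specific JRE updates, an application could also warn at startup when it finds itself on one of them. This is only a sketch: the class name AffectedJre and the helper isKnownAffected are hypothetical, and the version list is just what the issue reports so far (1.6.0_04 and 1.6.0_05 affected, 1.6.0_03 OK, 1.6.0_06 unknown).

```java
public class AffectedJre {

    // Returns true only for the updates LUCENE-1282 reports as affected.
    // 1.6.0_06 is unknown in the issue, so it is NOT flagged here.
    static boolean isKnownAffected(String javaVersion) {
        // Drop any build suffix such as "-b13" before comparing.
        String base = javaVersion.split("-", 2)[0];
        return base.equals("1.6.0_04") || base.equals("1.6.0_05");
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        if (isKnownAffected(version)) {
            System.err.println("WARNING: JRE " + version
                    + " is reported in LUCENE-1282 to corrupt merged segments");
        } else {
            System.out.println("JRE " + version + " is not on the known-affected list");
        }
    }
}
```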
On Sun, 2008-05-11 at 15:41 -0700, Paul Smith (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/LUCENE-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595946#action_12595946 ]
> 
> Paul Smith commented on LUCENE-1282:
> ------------------------------------
> 
> Another workaround might be to use '-client' instead of the default '-server' (for
> server-class machines).  This affects a few things, not least this switch:
> 
> -XX:CompileThreshold=10000 	Number of method invocations/branches before compiling [-client: 1,500]
> 
> -server implies a value of 10000.  I have personally observed problems similar to the
> above with -server, and switching to -client usually ends up 'solving' them.
> 
> I'm sure there was also a way to mark a method so that it is not JIT-compiled (rather than
> resorting to -Xint, which disables it for everything), but now I can't find what that syntax is at all.
> 
> > Sun hotspot compiler bug in 1.6.0_04/05 affects Lucene
> > ------------------------------------------------------
> >
> >                 Key: LUCENE-1282
> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1282
> >             Project: Lucene - Java
> >          Issue Type: Bug
> >          Components: Index
> >    Affects Versions: 2.3, 2.3.1
> >            Reporter: Michael McCandless
> >            Assignee: Michael McCandless
> >            Priority: Minor
> >             Fix For: 2.4
> >
> >
> > This is not a Lucene bug.  It's an as-yet not fully characterized Sun
> > JRE bug, as best I can tell.  I'm opening this to gather all things we
> > know, and to work around it in Lucene if possible, and maybe open an
> > issue with Sun if we can reduce it to a compact test case.
> > It's hit at least 3 users:
> >   http://mail-archives.apache.org/mod_mbox/lucene-java-user/200803.mbox/%3c8c4e68610803180438x39737565q9f97b4802ed774a5@mail.gmail.com%3e
> >   http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200804.mbox/%3c4807654E.7050900@virginia.edu%3e
> >   http://mail-archives.apache.org/mod_mbox/lucene-java-user/200805.mbox/%3c733777220805060156t7fdb8fectf0bc984fbfe48a22@mail.gmail.com%3e
> > It affects at least JRE 1.6.0_04 and 1.6.0_05.  1.6.0_03 works OK,
> > and it's unknown whether 1.6.0_06 shows it.
> > The bug affects bulk merging of stored fields.  When it strikes, the
> > segment produced by a merge is corrupt because its fdx file (stored
> > fields index file) is missing one document.  After iterating many
> > times with the first user that hit this, adding diagnostics &
> > assertions, it seems that a call to fieldsWriter.addDocument sometimes
> > either fails to run entirely or fails to invoke its call to
> > indexStream.writeLong.  It's as if when hotspot compiles a method,
> > there's some sort of race condition in cutting over to the compiled
> > code whereby a single method call fails to be invoked (speculation).
> > Unfortunately, this corruption is silent when it occurs and only later
> > detected when a merge tries to merge the bad segment, or an
> > IndexReader tries to open it.  Here's a typical merge exception:
> > {code}
> > Exception in thread "Thread-10" 
> > org.apache.lucene.index.MergePolicy$MergeException: 
> > org.apache.lucene.index.CorruptIndexException:
> >     doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
> >         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
> > Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3gh: fieldsReader shows 15999 but segmentInfo shows 16000
> >         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
> >         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
> >         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:221)
> >         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3099)
> >         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
> >         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
> > {code}
> > and here's a typical exception hit when opening a searcher:
> > {code}
> > org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _kk: fieldsReader shows 72670 but segmentInfo shows 72671
> >         at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
> >         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
> >         at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:230)
> >         at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:73)
> >         at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
> >         at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
> >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
> >         at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
> >         at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:48)
> > {code}
> > Sometimes, adding -Xbatch (disables background compilation, so methods
> > are compiled in the foreground) or -Xint (disables compilation) to the
> > java command line works around the issue.
> > Here are some of the OS's we've seen the failure on:
> > {code}
> > SuSE 10.0
> > Linux phoebe 2.6.13-15-smp #1 SMP Tue Sep 13 14:56:15 UTC 2005 x86_64 
> > x86_64 x86_64 GNU/Linux 
> > SuSE 8.2
> > Linux phobos 2.4.20-64GB-SMP #1 SMP Mon Mar 17 17:56:03 UTC 2003 i686 
> > unknown unknown GNU/Linux 
> > Red Hat Enterprise Linux Server release 5.1 (Tikanga)
> > Linux lab8.betech.virginia.edu 2.6.18-53.1.14.el5 #1 SMP Tue Feb 19 
> > 07:18:21 EST 2008 i686 i686 i386 GNU/Linux
> > {code}
> > I've already added assertions to Lucene to detect when this bug
> > strikes, but since assertions are not usually enabled, I plan to add a
> > real check to catch when this bug strikes *before* we commit the merge
> > to the index.  This way we can detect & quarantine the failure and
> > prevent corruption from entering the index.
> 
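The pre-commit check Mike describes at the end of the issue could look roughly like the sketch below. This is not Lucene's code: the class FdxCheck and its methods are hypothetical, and it assumes the fdx file holds exactly one 8-byte long per document with no header, which is only inferred from the indexStream.writeLong call mentioned above.

```java
import java.io.File;

public class FdxCheck {

    // ASSUMPTION: one long (8 bytes) per document and no file header.
    static final int BYTES_PER_DOC = 8;

    // Throws if the fdx length does not account for exactly expectedDocs documents.
    static void checkDocCount(long fdxLengthBytes, int expectedDocs) {
        long actualDocs = fdxLengthBytes / BYTES_PER_DOC;
        if (fdxLengthBytes % BYTES_PER_DOC != 0 || actualDocs != expectedDocs) {
            throw new IllegalStateException("doc counts differ: fdx shows "
                    + actualDocs + " but expected " + expectedDocs);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java FdxCheck <fdx file> <expected doc count>");
            return;
        }
        // Verify the merged segment's fdx file before the merge is committed.
        checkDocCount(new File(args[0]).length(), Integer.parseInt(args[1]));
        System.out.println("fdx doc count OK");
    }
}
```

The point of checking the file length, rather than relying on assertions, is that it still runs when assertions are disabled, so a bad merge can be rejected before it reaches the index.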



