lucene-general mailing list archives

From: chrisstolte <stolte...@gmail.com>
Subject: Solr concurrent merge issue
Date: Mon, 19 Sep 2011 20:02:32 GMT

Hello,

I am part of a team developing a Solr-backed search engine, and we have
run into a problem related to merging.  We use high-speed SLC solid state
drives with very fast write speeds, and lately we have seen the index on the
server become corrupt, seemingly with no external cause, with stack traces
that look like this:

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.NullPointerException
       at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
       at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
Caused by: java.lang.NullPointerException
       at org.apache.lucene.util.StringHelper.intern(StringHelper.java:36)
       at org.apache.lucene.index.FieldsReader$FieldForMerge.<init>(FieldsReader.java:647)
       at org.apache.lucene.index.FieldsReader.addFieldForMerge(FieldsReader.java:357)
       at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:232)
       at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:970)
       at org.apache.lucene.index.SegmentMerger.copyFieldsNoDeletions(SegmentMerger.java:450)
       at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:352)
       at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:153)
       at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5112)
       at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4675)
       at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
       at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)

java.lang.NullPointerException
       at org.apache.solr.core.SolrDeletionPolicy.onCommit(SolrDeletionPolicy.java:122)
       at org.apache.solr.core.IndexDeletionPolicyWrapper.onCommit(IndexDeletionPolicyWrapper.java:137)
       at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:401)
       at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:4228)
       at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4144)
       at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2263)
       at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2207)
       at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2171)
       at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:230)
       at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:181)
       at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:409)
       at org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run(DirectUpdateHandler2.java:602)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:679)

java.lang.RuntimeException: java.lang.RuntimeException: cannot load SegmentReader class: java.lang.NullPointerException
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
       at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:418)
       at org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run(DirectUpdateHandler2.java:602)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
       at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.RuntimeException: cannot load SegmentReader class: java.lang.NullPointerException
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:643)
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:613)
       at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:228)
       at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:32)
       at org.apache.lucene.index.DirectoryReader.doReopen(DirectoryReader.java:440)
       at org.apache.lucene.index.DirectoryReader.access$000(DirectoryReader.java:43)
       at org.apache.lucene.index.DirectoryReader$2.doBody(DirectoryReader.java:432)
       at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
       at org.apache.lucene.index.DirectoryReader.doReopenNoWriter(DirectoryReader.java:428)
       at org.apache.lucene.index.DirectoryReader.doReopen(DirectoryReader.java:386)
       at org.apache.lucene.index.DirectoryReader.reopen(DirectoryReader.java:352)
       at org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:413)
       at org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:424)
       at org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:35)
       at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1049)
       ... 10 more
Caused by: java.lang.NullPointerException
       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:639)
       ... 24 more

After this happens, the index is left in a corrupt state and the server
complains about a missing index file.  Restarting doesn't help, and we are
forced to blow away the data and start over.

Our architecture is as follows: several multi-threaded JVMs send updates to
the master almost constantly, and a handful of slaves replicate from that
master.  Each shard is roughly 100-300 GB in size, although the master held
only about 15 GB when the corruption happened.  We ran this architecture
without issues for months on slower MLC solid state drives, so we are
somewhat concerned that the faster drives may be exposing an undiscovered
bug in the merging code.

That is really just a guess, though.  Does anyone out there have experience
using SSDs with Lucene/Solr, or a suggestion about what might be going on
here?  We have seen this behavior more than once, seemingly spontaneously
after days of the system working correctly.

What triggers the code path shown in the stack traces above?  We think it
runs when segments are merged, but we aren't sure.

We are using Solr 1.4.1 on Ubuntu servers with OpenJDK (1.6.0_22).

Many thanks for any thoughts or tips.

Best regards,
Chris


--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-concurrent-merge-issue-tp3349886p3349886.html
Sent from the Lucene - General mailing list archive at Nabble.com.
