lucene-solr-user mailing list archives

From Kevin Osborn <kevin.osb...@cbsi.com>
Subject Re: Solr Cloud hangs when replicating updates
Date Sat, 07 Sep 2013 00:21:03 GMT
Thanks a ton Mark. I have tried SOLR-4816 and it didn't help. But I will
try Mark's patch next week, and see what happens.
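
For anyone who wants to check for the same symptom without reading a full
jstack dump by hand, here is a minimal sketch that scans the current JVM for
threads parked on a semaphore with a given class in their stack. It uses only
the standard java.lang.management API. The class name ParkedThreadFinder and
the method findParked are hypothetical, and the sketch assumes it runs inside
the JVM being inspected (inside Solr you would pass "SolrCmdDistributor" as
the frame fragment). The main method only demonstrates it on a deliberately
stuck thread.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

final class ParkedThreadFinder {

    /**
     * Returns the names of threads in this JVM that are WAITING or TIMED_WAITING
     * on a lock whose class name contains lockClassFragment and that have at
     * least one stack frame whose class name contains frameClassFragment.
     */
    static List<String> findParked(String lockClassFragment, String frameClassFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> hits = new ArrayList<>();
        // false, false: we do not need locked monitor or synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            Thread.State state = info.getThreadState();
            if (state != Thread.State.WAITING && state != Thread.State.TIMED_WAITING) {
                continue;
            }
            // For a parked thread this is the blocker object, e.g. Semaphore$NonfairSync.
            LockInfo lock = info.getLockInfo();
            if (lock == null || !lock.getClassName().contains(lockClassFragment)) {
                continue;
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getClassName().contains(frameClassFragment)) {
                    hits.add(info.getThreadName());
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws InterruptedException {
        // Demo only: a thread that waits on a semaphore with no permits.
        Semaphore never = new Semaphore(0);
        Thread stuck = new Thread(() -> {
            try {
                never.acquire();
            } catch (InterruptedException ignored) {
                // Demo thread; nothing to clean up.
            }
        }, "demo-stuck");
        stuck.setDaemon(true);
        stuck.start();
        while (stuck.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        System.out.println(findParked("Semaphore", "java.util.concurrent.Semaphore"));
    }
}
```

If many threads show up and the list does not shrink between two scans taken
a minute apart, that matches the hang described in the quoted messages.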

-Kevin


On Thu, Sep 5, 2013 at 4:46 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> If you run into this again, try a jstack trace. You should see
> evidence of being stuck in SolrCmdDistributor on a variable
> called "semaphore"... On current 4x this is around line 420.
>
> If you're using SolrJ, then SOLR-4816 is another thing to try.
>
> But Mark's patch would be best of all to test. If that doesn't
> fix it then the jstack suggestion would at least tell us if it's
> the issue we think it is.
>
> FWIW,
> Erick
>
>
> On Wed, Sep 4, 2013 at 12:51 PM, Mark Miller <markrmiller@gmail.com>
> wrote:
>
> > It would be great if you could give this patch a try:
> > http://pastebin.com/raw.php?i=aaRWwSGP
> >
> > - Mark
> >
> >
> > On Wed, Sep 4, 2013 at 8:31 AM, Kevin Osborn <kevin.osborn@cbsi.com>
> > wrote:
> >
> > > Thanks. If there is anything I can do to help you resolve this issue, let
> > > me know.
> > >
> > > -Kevin
> > >
> > >
> > > On Wed, Sep 4, 2013 at 7:51 AM, Mark Miller <markrmiller@gmail.com>
> > > wrote:
> > >
> > > > I'll look at fixing the root issue for 4.5. I've been putting it off for
> > > > way too long.
> > > >
> > > > Mark
> > > >
> > > > Sent from my iPhone
> > > >
> > > > On Sep 3, 2013, at 2:15 PM, Kevin Osborn <kevin.osborn@cbsi.com>
> > > > wrote:
> > > >
> > > > > I was having problems updating SolrCloud with a large batch of records.
> > > > > The records are coming in bursts with lulls between updates.
> > > > >
> > > > > At first, I just tried large updates of 100,000 records at a time.
> > > > > Eventually, this caused Solr to hang. When hung, I can still query Solr.
> > > > > But I cannot do any deletes or other updates to the index.
> > > > >
> > > > > At first, my updates were going as SolrJ CSV posts. I have also tried local
> > > > > file updates and had similar results. I finally slowed things down to just
> > > > > use SolrJ's Update feature, which is basically just JavaBin. I am also
> > > > > sending over just 100 at a time in 10 threads. Again, it eventually hung.
> > > > >
> > > > > Sometimes, Solr hangs in the first couple of chunks. Other times, it hangs
> > > > > right away.
> > > > >
> > > > > These are my commit settings:
> > > > >
> > > > > <autoCommit>
> > > > >   <maxTime>15000</maxTime>
> > > > >   <maxDocs>5000</maxDocs>
> > > > >   <openSearcher>false</openSearcher>
> > > > > </autoCommit>
> > > > > <autoSoftCommit>
> > > > >   <maxTime>30000</maxTime>
> > > > > </autoSoftCommit>
> > > > >
> > > > > I have tried quite a few variations with the same results. I also tried
> > > > > various JVM settings with the same results. The only thing that seems to
> > > > > help is reducing the cluster size from 2 to 1.
> > > > >
> > > > > I also did a jstack trace. I did not see any explicit deadlocks, but I did
> > > > > see quite a few threads in WAITING or TIMED_WAITING. It is typically
> > > > > something like this:
> > > > >
> > > > >  java.lang.Thread.State: WAITING (parking)
> > > > >        at sun.misc.Unsafe.park(Native Method)
> > > > >        - parking to wait for  <0x000000074039a450> (a java.util.concurrent.Semaphore$NonfairSync)
> > > > >        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > > > >        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> > > > >        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
> > > > >        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
> > > > >        at java.util.concurrent.Semaphore.acquire(Semaphore.java:317)
> > > > >        at org.apache.solr.util.AdjustableSemaphore.acquire(AdjustableSemaphore.java:61)
> > > > >        at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:418)
> > > > >        at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:368)
> > > > >        at org.apache.solr.update.SolrCmdDistributor.flushAdds(SolrCmdDistributor.java:300)
> > > > >        at org.apache.solr.update.SolrCmdDistributor.distribAdd(SolrCmdDistributor.java:139)
> > > > >        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:474)
> > > > >        at org.apache.solr.handler.loader.CSVLoaderBase.doAdd(CSVLoaderBase.java:395)
> > > > >        at org.apache.solr.handler.loader.SingleThreadedCSVLoader.addDoc(CSVLoader.java:44)
> > > > >        at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:364)
> > > > >        at org.apache.solr.handler.loader.CSVLoader.load(CSVLoader.java:31)
> > > > >        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> > > > >        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> > > > >        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> > > > >        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
> > > > >        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
> > > > >        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
> > > > >        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
> > > > >        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > > > >        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > > > >        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > > > >        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
> > > > >        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > > > >        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > > > >        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > > > >        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> > > > >        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> > > > >        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> > > > >        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> > > > >        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> > > > >
> > > > > It basically appears that Solr gets stuck while trying to acquire a
> > > > > semaphore that never becomes available.
> > > > >
> > > > > Anyone have any ideas? This is definitely causing major problems for us.
> > > > >
> > > > > --
> > > > > *KEVIN OSBORN*
> > > > > LEAD SOFTWARE ENGINEER
> > > > > CNET Content Solutions
> > > > > OFFICE 949.399.8714
> > > > > CELL 949.310.4677      SKYPE osbornk
> > > > > 5 Park Plaza, Suite 600, Irvine, CA 92614
> > > > > [image: CNET Content Solutions]
> > > >
> > >
> > >
> > >
> > >
> >
> >
> >
> > --
> > - Mark
> >
>
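
To illustrate the symptom in the quoted trace (a thread parked forever in
Semaphore.acquire), here is a minimal, generic sketch of a permit-bounded
submitter. It is not Solr's code and says nothing about what Mark's patch
changes. The class BoundedSubmitter and its methods are hypothetical. It
shows that once every permit is held and none is released, later callers
cannot proceed, and that a timed tryAcquire surfaces this as a failure
instead of a hang.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class BoundedSubmitter {
    private final Semaphore permits;

    BoundedSubmitter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Tries to reserve a slot; returns false on timeout instead of parking forever. */
    boolean trySubmit(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Must be called once for every successful trySubmit, when the request finishes. */
    void complete() {
        permits.release();
    }

    public static void main(String[] args) {
        BoundedSubmitter submitter = new BoundedSubmitter(2);
        System.out.println(submitter.trySubmit(100)); // true
        System.out.println(submitter.trySubmit(100)); // true
        System.out.println(submitter.trySubmit(100)); // false: both permits held, none released
        submitter.complete();
        System.out.println(submitter.trySubmit(100)); // true: one permit was returned
    }
}
```

With a plain acquire() in place of the timed tryAcquire, the third call would
block indefinitely, which is what the WAITING (parking) state in the trace
looks like from the outside.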



-- 
*KEVIN OSBORN*
LEAD SOFTWARE ENGINEER
CNET Content Solutions
OFFICE 949.399.8714
CELL 949.310.4677      SKYPE osbornk
5 Park Plaza, Suite 600, Irvine, CA 92614
[image: CNET Content Solutions]
