lucene-solr-user mailing list archives

From Darren Lee <d...@amplience.com>
Subject SolrCloud replica dies under high throughput
Date Mon, 21 Jul 2014 23:14:25 GMT
Hi,

I'm benchmarking SolrCloud 4.9.0 to work out exactly how much throughput my cluster can
handle.

In my tests a replica consistently goes into the recovering state and stays there, apparently
because of a timeout during replication. I can understand the timeout and the failure (I am
hitting it fairly hard), but what seems odd is that it still does not recover on its next
attempt after I stop the heavy load. It stays broken until I manually clear the index and
let it do a full resync.

Is this normal? Am I misunderstanding something? My cluster has 4 AWS m3.2xlarge nodes
(2 shards, 2 replicas). I am indexing with ~800 concurrent connections and a 10-second soft
commit. I consistently hit this problem at a throughput of around 1.5 million documents per hour.
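
For scale, the figures above work out roughly as follows. This is only a back-of-the-envelope
sketch: it assumes the 1.5 million documents per hour and the ~800 connections are cluster-wide
totals and that load is spread evenly across connections, neither of which has been measured.
The function and variable names are made up for illustration.

```python
# Rough rates derived from the figures quoted in this message.
# Assumption: both numbers are cluster-wide totals with evenly spread load.
DOCS_PER_HOUR = 1_500_000    # throughput at which the replica fails
CONNECTIONS = 800            # approximate concurrent indexing connections
SOFT_COMMIT_SECONDS = 10     # soft commit interval


def rates(docs_per_hour, connections, soft_commit_seconds):
    """Convert the hourly throughput into per-second and per-commit figures."""
    docs_per_second = docs_per_hour / 3600
    return {
        "docs_per_second": docs_per_second,
        "docs_per_connection_per_second": docs_per_second / connections,
        "docs_per_soft_commit": docs_per_second * soft_commit_seconds,
    }


summary = rates(DOCS_PER_HOUR, CONNECTIONS, SOFT_COMMIT_SECONDS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

That is about 417 documents per second overall, roughly one document every two seconds per
connection, and a little over 4,000 documents arriving between soft commits.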

Thanks all,
Darren


Stack Traces & Messages:

[qtp779330563-627] ERROR org.apache.solr.servlet.SolrDispatchFilter  – null:org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
        at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

Error while trying to recover. core=assets_shard2_replica1:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://xxx.xxx.15.171:8080/solr
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:615)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:371)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://xxx.xxx.15.171:8080/solr
        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
        at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:245)
        at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:241)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.SocketException: Socket closed
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:452)
        ... 6 more

853915 [RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy  – Recovery failed - trying again... (0) core=assets_shard2_replica1
853915 [RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy  – Recovery failed - interrupted. core=assets_shard2_replica1
853915 [RecoveryThread] ERROR org.apache.solr.cloud.RecoveryStrategy  – Recovery failed - I give up. core=assets_shard2_replica1
853918 [RecoveryThread] WARN  org.apache.solr.cloud.RecoveryStrategy  – Stopping recovery for zkNodeName=xxx.xxx.15.174:8080_solr_assets_shard2_replica1core=assets_shard2_replica1
853933 [Thread-382] WARN  org.apache.solr.cloud.RecoveryStrategy  – Stopping recovery for zkNodeName=xxx.xxx.15.174:8080_solr_assets_shard2_replica1core=assets_shard2_replica1
