From: "Senthil, Athinanthny X. -ND"
To: "user@cassandra.apache.org"
Date: Mon, 3 Feb 2014 15:37:23 -0800
Subject: socket timeout errors in one DC in a multi DC cluster
Message-ID: <97FB38514D40D84A994DD3959447966839970DE4D5@SM-CALA-VXMB06B.swna.wdpr.disney.com>

We are seeing socket timeout errors on most of the nodes in one DC of a multi-DC cluster; the error is below. Clients in this DC are having intermittent high response time issues. DC1 does not experience any timeout issues, but DC2 does. The error started occurring recently and has repeated on consecutive days. Any suggestions on the cause?

While this is happening, when we run queries through CQL3 on the local server itself, we sometimes get rpc_timeout errors, but this is intermittent as well.

ERROR [Thrift:97] CustomTThreadPoolServer.java (line 219) Error occurred during processing of message.
com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 2 responses.
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
    at org.apache.cassandra.service.ClientState.authorize(ClientState.java:308)
    at org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:178)
    at org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:171)
    at org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:155)
    at org.apache.cassandra.thrift.CassandraServer.createMutationList(CassandraServer.java:681)
    at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
    at com.datastax.bdp.server.DseServer.batch_mutate(DseServer.java:931)
    at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3622)
    at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3610)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

 

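To make it easier to see where in the stack the timeout surfaces, here is a minimal Python sketch that parses the frames and picks out the first one under `org.apache.cassandra.`. The frames are copied from the log above (shortened to the top of the stack); the helper names `parse_frames` and `first_frame_with_prefix` are hypothetical and only for illustration.

```python
import re

# Frames copied from the log above (top of the stack only).
TRACE = """\
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
at org.apache.cassandra.service.ClientState.authorize(ClientState.java:308)
at org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:178)
at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
"""

# "at <class>.<method>(<location>)" as printed in a Java stack trace.
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\.([\w$<>]+)\(([^)]*)\)\s*$")


def parse_frames(trace):
    """Return (class, method, location) tuples, top of stack first."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append(match.groups())
    return frames


def first_frame_with_prefix(frames, prefix):
    """Return the first frame whose class name starts with prefix, or None."""
    for frame in frames:
        if frame[0].startswith(prefix):
            return frame
    return None


frames = parse_frames(TRACE)
hit = first_frame_with_prefix(frames, "org.apache.cassandra.")
print(hit)
```

For this trace the first Cassandra frame is `ClientState.authorize`, which suggests the read that timed out happened during the permission check for the `batch_mutate` call rather than in the mutation itself.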