From: "Aaron Morton (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Tue, 1 Feb 2011 20:37:29 +0000 (UTC)
Message-ID: <404655235.3573.1296592649244.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <774183599.403.1296502828996.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

    [
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989375#comment-12989375 ]

Aaron Morton commented on CASSANDRA-2081:
-----------------------------------------

My understanding here is that the 0.19 node is sending read requests to the 0.1, 0.2 and 0.3 nodes and only getting a reply from the 0.1 node before timing out. The 0.1 node is the first node the request is sent to, so it receives the data request; the others receive digest requests. The timeout is the rpc_timeout, and can be seen here...

DEBUG [pool-1-thread-1] 2011-02-01 11:48:28,949 ReadCallback.java (line 58) ReadCallback blocking for 2 responses
...10 seconds...
DEBUG [pool-1-thread-1] 2011-02-01 11:48:38,950 CassandraServer.java (line 483) ... timed out

What's happening on the 0.2 and 0.3 nodes at this point? Are they logging errors or WARN messages about dropped messages? Can you see any logs about processing messages from the 0.19 node?

I'm not sure the down 0.18 node is a factor here. The client should be retrying when it gets a timeout, which I think you said Hector was doing.

> Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2081
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2081
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: linux, hector + cassandra
>            Reporter: Thibaut
>            Priority: Blocker
>             Fix For: 0.7.1
>
>
> I'm using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25.
> Using consistency level Quorum won't work anymore (tested it on read). Consistency level ONE still works though.
> I have tried this with one dead node in my cluster.
> If I restart cassandra with an older svn revision (apache-cassandra-2011-01-28_20-06-01.jar), I can access the cluster with consistency level QUORUM again, while still using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25 in my application.
> 11/01/31 19:54:38 ERROR connection.CassandraHostRetryService: Downed intr1n18(192.168.0.18):9160 host still appears to be down: Unable to open transport to intr1n18(192.168.0.18):9160 , java.net.NoRouteToHostException: No route to host
> 11/01/31 19:54:38 INFO connection.CassandraHostRetryService: Downed Host retry status false with host: intr1n18(192.168.0.18):9160
> 11/01/31 19:54:45 ERROR connection.HConnectionManager: Could not fullfill request on this host CassandraClient
> intr1n11 is marked as up however and I can also access the node through the cassandra cli.
> 192.168.0.1   Up    Normal  8.02 GB  5.00%  0cc
> 192.168.0.2   Up    Normal  7.96 GB  5.00%  199
> 192.168.0.3   Up    Normal  8.24 GB  5.00%  266
> 192.168.0.4   Up    Normal  4.94 GB  5.00%  333
> 192.168.0.5   Up    Normal  5.02 GB  5.00%  400
> 192.168.0.6   Up    Normal  5 GB     5.00%  4cc
> 192.168.0.7   Up    Normal  5.1 GB   5.00%  599
> 192.168.0.8   Up    Normal  5.07 GB  5.00%  666
> 192.168.0.9   Up    Normal  4.78 GB  5.00%  733
> 192.168.0.10  Up    Normal  4.34 GB  5.00%  7ff
> 192.168.0.11  Up    Normal  5.01 GB  5.00%  8cc
> 192.168.0.12  Up    Normal  5.31 GB  5.00%  999
> 192.168.0.13  Up    Normal  5.56 GB  5.00%  a66
> 192.168.0.14  Up    Normal  5.82 GB  5.00%  b33
> 192.168.0.15  Up    Normal  5.57 GB  5.00%  c00
> 192.168.0.16  Up    Normal  5.03 GB  5.00%  ccc
> 192.168.0.17  Up    Normal  4.77 GB  5.00%  d99
> 192.168.0.18  Down  Normal  ?        5.00%  e66
> 192.168.0.19  Up    Normal  4.78 GB  5.00%  f33
> 192.168.0.20  Up    Normal  4.83 GB  5.00%  ffffffffffffffff

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
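
For reference, the "blocking for 2 responses ... timed out" behaviour described in the comment can be sketched roughly as follows. This is only an illustration, not Cassandra's ReadCallback: the names `quorum_for` and `ResponseBarrier` are hypothetical, and a replication factor of 3 is an assumption inferred from the three nodes the read is sent to. The quorum size is taken as floor(RF / 2) + 1.

```python
import threading


def quorum_for(replication_factor: int) -> int:
    """Number of replica responses a QUORUM operation must collect."""
    return replication_factor // 2 + 1


class ResponseBarrier:
    """Hypothetical stand-in for a coordinator-side read callback.

    It collects replica responses and lets the caller block until
    `block_for` of them have arrived or a timeout expires.
    """

    def __init__(self, block_for: int) -> None:
        self.block_for = block_for
        self._responses = []
        self._cond = threading.Condition()

    def response(self, replica: str) -> None:
        # Called once for each replica that answers.
        with self._cond:
            self._responses.append(replica)
            self._cond.notify_all()

    def wait(self, timeout_s: float) -> bool:
        # True if enough responses arrived in time, False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: len(self._responses) >= self.block_for,
                timeout=timeout_s,
            )


if __name__ == "__main__":
    # Assumed RF = 3, so the coordinator blocks for 2 responses.
    barrier = ResponseBarrier(quorum_for(3))
    barrier.response("192.168.0.1")  # only the data-request node replies
    # A short timeout stands in for the 10 second rpc_timeout in the log.
    print("quorum reached:", barrier.wait(timeout_s=0.1))
```

With a single reply the wait expires, which is the situation the log excerpt suggests; a second reply from either of the other two replicas before the timeout would satisfy the quorum.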