cassandra-commits mailing list archives

From "Aaron Morton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)
Date Tue, 01 Feb 2011 20:37:29 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989375#comment-12989375 ]

Aaron Morton commented on CASSANDRA-2081:
-----------------------------------------

My understanding is that the 0.19 node is sending read requests to the 0.1, 0.2 and 0.3 nodes
and only getting a reply from the 0.1 node before timing out. The 0.1 node is the first node
the request is sent to, so it receives the data request; the other two receive digest requests.

The timeout is the rpc_timeout, and can be seen here...

DEBUG [pool-1-thread-1] 2011-02-01 11:48:28,949 ReadCallback.java (line 58) ReadCallback blocking for 2 responses
...10 seconds...
DEBUG [pool-1-thread-1] 2011-02-01 11:48:38,950 CassandraServer.java (line 483) ... timed out
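
To make the "blocking for 2 responses" line concrete, here is a minimal sketch of a coordinator waiting for a quorum of replies under a timeout. It is not Cassandra's actual ReadCallback code; the class and method names are hypothetical, and it assumes a replication factor of 3 (the three replicas 0.1, 0.2 and 0.3) with quorum computed as a simple majority.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Cassandra source.
class QuorumWaitSketch {
    // A quorum is a majority of the replicas.
    static int quorumFor(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Waits until `blockFor` replies have been counted or the timeout elapses.
    // `repliesReceived` simulates how many replicas answered in time.
    static boolean awaitResponses(int blockFor, int repliesReceived, long timeoutMs)
            throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(blockFor);
        for (int i = 0; i < repliesReceived; i++) {
            latch.countDown(); // one reply arrived
        }
        // Returns false if the latch did not reach zero before the timeout.
        return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        int blockFor = quorumFor(3); // assumed replication factor of 3
        System.out.println("blocking for " + blockFor + " responses");
        // Only one replica (0.1) replied, so the wait times out.
        System.out.println("satisfied: " + awaitResponses(blockFor, 1, 100));
    }
}
```

With a single reply the wait cannot be satisfied, which matches the log above: the request blocks for 2 responses and times out after the rpc_timeout.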

What's happening on the 0.2 and 0.3 nodes at this point? Are they logging errors or WARN messages
about dropped messages? Can you see any log entries about processing messages from the 0.19 node?
I'm not sure the down 0.18 node is a factor here.

The client should be retrying when it gets a timeout, which I think you said Hector was doing.
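
As a rough sketch of what client-side retry on timeout looks like, assuming nothing about Hector's real API (the class and method below are hypothetical, and a plain java.util.concurrent.TimeoutException stands in for the Thrift timeout):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not Hector's retry implementation.
class RetrySketch {
    // Runs `op`, retrying only when it times out, up to `maxAttempts` tries.
    static <T> T retryOnTimeout(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (TimeoutException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt timed out
    }
}
```

A real client would usually also switch to a different host between attempts; that part is left out here.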


 

> Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2081
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2081
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: linux, hector + cassandra
>            Reporter: Thibaut
>            Priority: Blocker
>             Fix For: 0.7.1
>
>
> I'm using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25.
> Using consistency level QUORUM won't work anymore (tested it on read). Consistency level ONE still works though.
> I have tried this with one dead node in my cluster.
> If I restart cassandra with an older svn revision (apache-cassandra-2011-01-28_20-06-01.jar), I can access the cluster with consistency level QUORUM again, while still using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25 in my application.
> 11/01/31 19:54:38 ERROR connection.CassandraHostRetryService: Downed intr1n18(192.168.0.18):9160 host still appears to be down: Unable to open transport to intr1n18(192.168.0.18):9160 , java.net.NoRouteToHostException: No route to host
> 11/01/31 19:54:38 INFO connection.CassandraHostRetryService: Downed Host retry status false with host: intr1n18(192.168.0.18):9160
> 11/01/31 19:54:45 ERROR connection.HConnectionManager: Could not fullfill request on this host CassandraClient<intr1n11:9160-483>
> intr1n11 is marked as up however and I can also access the node through the cassandra cli.
> 192.168.0.1     Up     Normal  8.02 GB         5.00%   0cc
> 192.168.0.2     Up     Normal  7.96 GB         5.00%   199
> 192.168.0.3     Up     Normal  8.24 GB         5.00%   266
> 192.168.0.4     Up     Normal  4.94 GB         5.00%   333
> 192.168.0.5     Up     Normal  5.02 GB         5.00%   400
> 192.168.0.6     Up     Normal  5 GB            5.00%   4cc
> 192.168.0.7     Up     Normal  5.1 GB          5.00%   599
> 192.168.0.8     Up     Normal  5.07 GB         5.00%   666
> 192.168.0.9     Up     Normal  4.78 GB         5.00%   733
> 192.168.0.10    Up     Normal  4.34 GB         5.00%   7ff
> 192.168.0.11    Up     Normal  5.01 GB         5.00%   8cc
> 192.168.0.12    Up     Normal  5.31 GB         5.00%   999
> 192.168.0.13    Up     Normal  5.56 GB         5.00%   a66
> 192.168.0.14    Up     Normal  5.82 GB         5.00%   b33
> 192.168.0.15    Up     Normal  5.57 GB         5.00%   c00
> 192.168.0.16    Up     Normal  5.03 GB         5.00%   ccc
> 192.168.0.17    Up     Normal  4.77 GB         5.00%   d99
> 192.168.0.18    Down   Normal  ?               5.00%   e66
> 192.168.0.19    Up     Normal  4.78 GB         5.00%   f33
> 192.168.0.20    Up     Normal  4.83 GB         5.00%   ffffffffffffffff

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
