cassandra-commits mailing list archives

From "Peter Schuller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3166) Rolling upgrades from 0.7 to 0.8 not possible
Date Fri, 09 Sep 2011 15:05:08 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101259#comment-13101259 ]

Peter Schuller commented on CASSANDRA-3166:
-------------------------------------------

Removing the resetVersion() did not help. I added some logging to IncomingTcpConnection and
it seems that when the 0.8 node goes up first, the 0.7 node never tries to make an outgoing
connection to it.

If my understanding of CASSANDRA-2818 and the code is correct, I think the intent is that
we discover the version of the other guy only when that guy connects to *us*; we can never
find out that the other side has a mismatched version based on activity on the outbound
connection.

So, an incoming connection from the 0.7 node would be necessary for the 0.8 node to ever
adjust its lingo.
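
To make that reasoning concrete, here is a toy model in Python. It is only a sketch of the behaviour as I understand it, not Cassandra's actual code: the Node class, its methods, and the version numbers are all made up for illustration. The one property it tries to capture is that a node learns a peer's version only from connections the peer opens to it, never from its own outbound connections.

```python
# Toy model of the behaviour described above; NOT Cassandra's actual code.
# Node, send(), on_incoming() and the version numbers are hypothetical.

V07, V08 = 1, 2  # made-up protocol version numbers


class Node:
    def __init__(self, name, version):
        self.name = name
        self.version = version
        # peer name -> version, learned only from *incoming* connections
        self.known = {}

    def version_for(self, peer):
        # With no information about the peer, speak our own version.
        return self.known.get(peer.name, self.version)

    def send(self, peer):
        """Open an outbound connection; True if the peer accepted the message."""
        return peer.on_incoming(self, self.version_for(peer))

    def on_incoming(self, sender, wire_version):
        if wire_version > self.version:
            # "Received connection from newer protocol version": drop it.
            # The sender gets nothing back, so it learns nothing.
            return False
        # Remember which version this peer speaks for our own outbound traffic.
        self.known[sender.name] = wire_version
        return True


if __name__ == "__main__":
    old, new = Node("old", V07), Node("new", V08)
    print(new.send(old))  # False: 0.8 speaks 0.8, the 0.7 node ignores it
    print(old.send(new))  # True: 0.8 accepts and records the 0.7 version
    print(new.send(old))  # True: 0.8 now downgrades for this peer
```

In this model the 0.8 node keeps failing until the 0.7 node happens to connect to it, which matches the observation that things work when the 0.7.9 nodes are started after the 0.8.4 node.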

> Rolling upgrades from 0.7 to 0.8 not possible
> ---------------------------------------------
>
>                 Key: CASSANDRA-3166
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3166
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7.5, 0.7.9, 0.8.4
>            Reporter: Marcus Eriksson
>
> We are in the process of upgrading to 0.8 and we need to do a rolling upgrade. This fails miserably, and it is reproducible:
> 1. set up a 3 node cluster with 0.7.9 and rf=3, read and write, QUORUM
> 2. upgrade one of the nodes (I upgraded a seed node; not sure if that is important)
> 3. continue reading/writing
> 4. see logs on the 0.7 node fill up with: INFO 12:36:08,240 Received connection from newer protocol version. Ignorning message.
> It does work if I start the 0.7.9 nodes *after* the 0.8.4 node, which makes me think that it matters whether the 0.8 node connects to the 0.7 nodes or the other way round.
> Debug logging on the 0.8 node shows:
> /var/log/cassandra/system.log.9:DEBUG [pool-2-thread-82] 2011-09-09 11:55:06,067 StorageProxy.java (line 178) Write timeout java.util.concurrent.TimeoutException for one (or more) of: 
> /var/log/cassandra/system.log.9:DEBUG [pool-2-thread-76] 2011-09-09 11:55:06,067 StorageProxy.java (line 584) Read timeout: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from /193.182.3.92,  .
> Nothing except for the "newer protocol version..." message shows up in the 0.7 logs.
> I will continue to look at this issue, but if anyone has a quick patch, let me know.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
