cassandra-commits mailing list archives

From "Pavel Yaskevich (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3804) upgrade problems from 1.0 to trunk
Date Mon, 30 Jan 2012 15:32:10 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196176#comment-13196176
] 

Pavel Yaskevich commented on CASSANDRA-3804:
--------------------------------------------

This exception (taken from Sylvain's #2) explains what will happen when you only partially
migrate:

{noformat}
ERROR [GossipStage:1] 2012-01-30 14:35:13,363 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[GossipStage:1,5,main]
java.lang.UnsupportedOperationException: Not a time-based UUID
        at java.util.UUID.timestamp(UUID.java:308)
        at org.apache.cassandra.service.MigrationManager.updateHighestKnown(MigrationManager.java:121)
        at org.apache.cassandra.service.MigrationManager.rectify(MigrationManager.java:99)
        at org.apache.cassandra.service.MigrationManager.onAlive(MigrationManager.java:83)
        at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:806)
        at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:849)
        at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:908)
        at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:68)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{noformat} 

Since we switched away from time-based UUIDs for schema versions, MigrationManager on the old
nodes will fail every time a node with the new schema starts up, or whenever such a node requests
migrations from it (which it does because it sees that its schema version differs from the
others'). Even if we fix the MigrationManager.rectify(...) method in 1.0.x, nodes with the new and
old schema will never come to agreement, because they use different types of UUID and because
they are unable to run schema mutations anymore.
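
To illustrate the failure mode in the stack trace above, here is a minimal standalone sketch (not Cassandra code). It relies only on java.util.UUID: timestamp() is defined solely for version 1 (time-based) UUIDs and throws UnsupportedOperationException for any other version. The class name, the helper methods, and the choice of a name-based UUID to stand in for a new-style schema version are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class SchemaVersionCheck {
    // UUID.timestamp() is only defined for version 1 (time-based) UUIDs.
    static boolean isTimeBased(UUID id) {
        return id.version() == 1;
    }

    // Hypothetical guard: returns the embedded timestamp, or -1 when the UUID has none.
    static long timestampOrMinusOne(UUID id) {
        return isTimeBased(id) ? id.timestamp() : -1L;
    }

    public static void main(String[] args) {
        // A version 1 UUID literal, standing in for an old-style schema version.
        UUID oldStyle = UUID.fromString("c232ab00-9414-11ec-b3c8-9f6bdeced846");
        // A name-based UUID, standing in for a new-style schema version (assumption).
        UUID newStyle = UUID.nameUUIDFromBytes("schema".getBytes(StandardCharsets.UTF_8));

        System.out.println("old-style UUID version: " + oldStyle.version());
        System.out.println("new-style UUID version: " + newStyle.version());

        try {
            // Calling timestamp() on a non-time-based UUID throws, as in the trace above.
            newStyle.timestamp();
        } catch (UnsupportedOperationException e) {
            System.out.println("timestamp() failed: " + e.getMessage());
        }

        // The guarded helper avoids the exception but cannot order the two versions.
        System.out.println("guarded timestamp: " + timestampOrMinusOne(newStyle));
    }
}
```

A guard like this would only stop the exception. It would not let old and new nodes agree on a schema version, which is the point made above.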
                
> upgrade problems from 1.0 to trunk
> ----------------------------------
>
>                 Key: CASSANDRA-3804
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3804
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>         Environment: ubuntu, cluster set up with ccm.
>            Reporter: Tyler Patterson
>            Assignee: Pavel Yaskevich
>             Fix For: 1.1
>
>
> A 3-node cluster is on version 0.8.9, 1.0.6, or 1.0.7 and then one and only one node
> is taken down, upgraded to trunk, and started again. An rpc timeout exception happens if counter-add
> operations are done. It usually takes between 1 and 500 add operations before the failure
> occurs. The failure seems to happen sooner if the coordinator node is NOT the one that was
> upgraded. Here is the error:
> {code}
> ======================================================================
> ERROR: counter_upgrade_test.TestCounterUpgrade.counter_upgrade_test
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/pymodules/python2.7/nose/case.py", line 187, in runTest
>     self.test(*self.arg)
>   File "/home/tahooie/cassandra-dtest/counter_upgrade_test.py", line 50, in counter_upgrade_test
>     cursor.execute("UPDATE counters SET row = row+1 where key='a'")
>   File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 96, in execute
>     raise cql.OperationalError("Request did not complete within rpc_timeout.")
> OperationalError: Request did not complete within rpc_timeout.
> {code}
> A script has been added to cassandra-dtest (counter_upgrade_test.py) to demonstrate the
> failure. The newest version of CCM is required to run the test. It is available here if it
> hasn't yet been pulled: git@github.com:tpatterson/ccm.git

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
