incubator-cassandra-user mailing list archives

From Nicolas Lalevée <nicolas.lale...@hibnet.org>
Subject Upgrade 1.2.11 to 2.0.6: some errors
Date Wed, 19 Mar 2014 13:11:28 GMT
Hi,

On our test cluster, we tried an upgrade of Cassandra from 1.2.11 to 2.0.6. It was not
straightforward, so I would like to know whether this is expected, so that I can do it safely in prod.

The first time we tried, the first node we upgraded refused to start with this error:

ERROR [main] 2014-03-19 10:50:31,363 CassandraDaemon.java (line 488) Exception encountered during startup
java.lang.RuntimeException: Incompatible SSTable found.  Current version jb is unable to read file: /var/lib/cassandra/data/system/NodeIdInfo/system-NodeIdInfo-hf-4.  Please run upgradesstables.
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:415)
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
        at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:309)
        at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:266)
        at org.apache.cassandra.db.Keyspace.open(Keyspace.java:110)
        at org.apache.cassandra.db.Keyspace.open(Keyspace.java:88)
        at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:514)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:237)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:471)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:560)

I have reread NEWS.txt [1], and as far as I understand, upgradesstables is only required
when upgrading from a version older than 1.2.9. But maybe I misunderstand this paragraph:
    - Upgrading is ONLY supported from Cassandra 1.2.9 or later. This
      goes for sstable compatibility as well as network.  When
      upgrading from an earlier release, upgrade to 1.2.9 first and
      run upgradesstables before proceeding to 2.0.

So we ran the requested upgradesstables, and the node started successfully.

I have checked our prod cluster: there are also some hf files on all nodes, all of them matching /var/lib/cassandra/data/system/Versions/system-Versions-hf-*
I have tried several upgradesstables commands, but the files are still there:
# nodetool upgradesstables system Versions
Exception in thread "main" java.lang.IllegalArgumentException: Unknown table/cf pair (system.Versions)
# nodetool upgradesstables system
# nodetool upgradesstables
# nodetool upgradesstables -a system
# ls /var/lib/cassandra/data/system/Versions/*-hf-* | wc -l
15

I did not try "nodetool upgradesstables -a" since we have a lot of data.
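
To see which column families still hold old-format files before upgrading, the "ls | wc -l" check above can be generalized. The following is only a minimal sketch: the function name count_sstable_versions is hypothetical, and it assumes the 1.2/2.0 file naming <keyspace>-<columnfamily>-<version>-<generation>-Data.db (as in system-NodeIdInfo-hf-4) and that snapshots and backups directories should be ignored.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed naming of SSTable components in Cassandra 1.2/2.0:
#   <keyspace>-<columnfamily>-[tmp-]<version>-<generation>-<component>
SSTABLE_NAME = re.compile(
    r"^(?P<ks>[^-]+)-(?P<cf>[^-]+)-(?:tmp-)?"
    r"(?P<version>[a-z]{2})-(?P<gen>\d+)-(?P<component>.+)$"
)


def count_sstable_versions(data_dir):
    """Count Data.db files per (keyspace, column family) and format version.

    Returns a dict mapping (keyspace, cf) to a Counter of version strings.
    Files under "snapshots" or "backups" directories are skipped (assumption:
    only live SSTables matter for the startup check).
    """
    root = Path(data_dir)
    counts = {}
    for path in root.rglob("*-Data.db"):
        relative_parts = path.relative_to(root).parts
        if "snapshots" in relative_parts or "backups" in relative_parts:
            continue
        match = SSTABLE_NAME.match(path.name)
        if match is None:
            continue  # not a name this sketch understands
        key = (match["ks"], match["cf"])
        counts.setdefault(key, Counter())[match["version"]] += 1
    return counts
```

Calling count_sstable_versions("/var/lib/cassandra/data") on a node and looking for any version other than the current one (hf here, on a cluster that should already be past it) would point at directories such as system/Versions that upgradesstables did not touch.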

I guess this will cause me trouble if I try to upgrade in prod? Is there a bug I should report?

Continuing on our test cluster, we upgraded the second node. While the cluster was running
two different versions of Cassandra, there were errors in the logs:

ERROR [WRITE-/10.10.0.41] 2014-03-19 11:23:27,523 OutboundTcpConnection.java (line 234) error writing to /10.10.0.41
java.lang.RuntimeException: Cannot convert filter to old super column format. Update all nodes to Cassandra 2.0 first.
        at org.apache.cassandra.db.SuperColumns.sliceFilterToSC(SuperColumns.java:357)
        at org.apache.cassandra.db.SuperColumns.filterToSC(SuperColumns.java:258)
        at org.apache.cassandra.db.ReadCommandSerializer.serializedSize(ReadCommand.java:192)
        at org.apache.cassandra.db.ReadCommandSerializer.serializedSize(ReadCommand.java:134)
        at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:116)
        at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:251)
        at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:203)
        at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:151)

I confirm that we do have old-style super columns, which were designed when Cassandra was at 1.0.x.
In our test cluster the replication factor is 1, so I do see errors on the client side,
because 1 node out of 2 was down. I therefore don't know for sure whether this error in Cassandra
affected the client; the time frame is too short to tell from the logs. In prod we have a replication
factor of 3. If we do such an upgrade in prod, node by node to avoid any downtime, will
the client still see write errors while the cluster runs mixed versions of Cassandra?
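
For the availability part of that question only, the difference between the two replication factors can be worked out as below. This is a sketch under an assumption not stated above: that clients use consistency level QUORUM. It says nothing about whether the super column serialization error itself reaches the client.

```python
def quorum(replication_factor):
    """Replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def tolerated_down_replicas(replication_factor, required):
    """Replicas of a row that can be down while `required` can still respond."""
    return replication_factor - required


# Test cluster (RF 1) versus prod (RF 3), assuming QUORUM requests.
for rf in (1, 3):
    needed = quorum(rf)
    print(rf, needed, tolerated_down_replicas(rf, needed))
# prints:
# 1 1 0
# 3 2 1
```

With a replication factor of 1, restarting any node makes its ranges unavailable, which matches the client errors seen on the test cluster; with a replication factor of 3, one replica per row can be down at a time.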

Nicolas

[1] https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.0.6

