cassandra-user mailing list archives

From Stephen Baynes <stephen.bay...@smoothwall.net>
Subject Re: Changing schema on multiple nodes while they are isolated
Date Fri, 02 Oct 2015 16:08:24 GMT
Hi Jacques-Henri

You are right - serious trouble. I did some more testing, and the nodes do
not repair or share any data. In the logs I see lots of:

WARN  [MessagingService-Incoming-/10.50.16.214] 2015-10-02 16:52:36,810 IncomingTcpConnection.java:100 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=e6828dd0-691a-11e5-8a27-b1780df21c7c
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:163) ~[apache-cassandra-2.2.1.jar:2.2.1]
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:96) ~[apache-cassandra-2.2.1.jar:2.2.1]

and some:

ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,546 RepairMessageVerbHandler.java:164 - Got error, removing parent repair session
ERROR [AntiEntropyStage:1] 2015-10-02 16:48:16,548 CassandraDaemon.java:183 - Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:167) ~[apache-cassandra-2.2.1.jar:2.2.1]
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-2.2.1.jar:2.2.1]
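For what it's worth, the cfId in that exception is a time-based UUID that each node mints independently at the moment it executes CREATE TABLE, so two isolated nodes creating a table with the same name can never end up with matching ids. A rough illustration in plain Python (not Cassandra code, just the UUID behaviour):

```python
import uuid

# Each node mints its own time-based UUID (the cfId) when it runs
# CREATE TABLE. Two independent creates - here simulated by two
# uuid1() calls - always produce different ids, even for identical DDL.
cfid_node_a = uuid.uuid1()
cfid_node_b = uuid.uuid1()

print(cfid_node_a != cfid_node_b)  # the ids never match
```

So once the nodes reconnect, each side asks the other about a cfId it has never heard of, which is exactly the UnknownColumnFamilyException above.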


Will need to do some thinking about this. I wonder about shipping a backup
of a good system keyspace and restoring it on each node before it starts for
the first time - but will that end up with each node having the same
internal ids?
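A minimal sketch of that pre-seeding idea, assuming hypothetical paths and a target node that has never started. This only illustrates the file-copy step (snapshot SSTables into the fresh node's data directory), not a blessed Cassandra procedure - whether the node then boots cleanly with those ids would need testing:

```python
import shutil
from pathlib import Path

def preseed_system_schema(snapshot_dir: Path, node_data_dir: Path) -> int:
    """Copy schema SSTables from a 'good' node's snapshot into a fresh
    node's data directory before its first start, so every node boots
    with identical internal table ids. Paths are hypothetical; returns
    the number of files copied."""
    copied = 0
    for src in snapshot_dir.rglob("*"):
        if src.is_file():
            dst = node_data_dir / src.relative_to(snapshot_dir)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # preserve mtimes along with contents
            copied += 1
    return copied
```

The snapshot itself would come from something like `nodetool snapshot system` on the good node.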



On 2 October 2015 at 16:27, Jacques-Henri Berthemet <
jacques-henri.berthemet@genesys.com> wrote:

> Hi Stephen,
>
>
>
> If you manage to create tables on each node while node A and B are
> separated, you’ll get into trouble when they reconnect. I had this
> case previously and Cassandra complained that tables with the same names
> but different ids were present in the keyspace. I don’t know if there is a
> way to fix that with nodetool, but I don’t think it is a good practice.
>
>
>
> To solve this, we have a “schema creator” application node that is
> responsible for changing the schema. If this node is down, schema updates are
> not possible. We can make any node the ‘creator’, but only one can be enabled
> at any given time.
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* Stephen Baynes [mailto:stephen.baynes@smoothwall.net]
> *Sent:* vendredi 2 octobre 2015 16:46
> *To:* user@cassandra.apache.org
> *Subject:* Changing schema on multiple nodes while they are isolated
>
>
>
> Is it safe to make schema changes ( e.g. create keyspace and tables ) on
> multiple separate nodes of a cluster while they are out of communication
> with other nodes in the cluster? For example create on node A while node B
> is down, create on node B while A is down, then bring both up together.
>
>
>
> We are looking to embed Cassandra invisibly in another product, and we have
> no control over the order in which users start/stop nodes or add/remove
> them from clusters. Cassandra must come up and be working, with at least
> local access, regardless. So this means always creating the keyspaces and
> tables so they are always present - which means nodes joining clusters that
> already have the same keyspace and tables defined. Will that cause any issues?
> I have done some testing and saw some issues when I tried nodetool
> repair to bring things into sync. However, at the time I was fighting with
> what I later discovered was CASSANDRA-9689 (keyspace does not show in
> describe list if create query times out
> <https://issues.apache.org/jira/browse/CASSANDRA-9689>) and did not know
> what was what. I will give it another try sometime, but would appreciate
> knowing if this is going to run into trouble before we find it.
>
>
>
> We are basically using Cassandra to share fairly transient information. We
> can cope with data loss during environment changes and occasional losses at
> other times, but if the environment is stable then it should all just work,
> whatever the environment is. We use a very high replication factor so that
> all nodes have a copy of all the data and keep working even if they are
> the only one up.
>
>
>
> Thanks
>
>
>
> --
>
> *Stephen Baynes*
>
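The single “schema creator” pattern Jacques-Henri describes could look something like this - a hypothetical sketch, not anyone's actual code, where only the one node holding the creator flag is allowed to issue DDL:

```python
class SchemaCreator:
    """Single-writer gate for schema changes: only the node whose
    creator flag is enabled may run DDL. Names are hypothetical;
    `session` stands in for any driver session with an execute()."""

    def __init__(self, enabled: bool):
        self.enabled = enabled

    def apply_ddl(self, session, ddl: str) -> None:
        if not self.enabled:
            # Every other node refuses, so concurrent isolated creates
            # (and thus mismatched table ids) cannot happen.
            raise RuntimeError("schema changes only allowed on the creator node")
        session.execute(ddl)
```

It does mean schema changes are unavailable while the creator node is down, as noted above - that is the price of serialising DDL through one node.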


Thanks
-- 

Stephen Baynes
