cassandra-user mailing list archives

From: onmstester onmstester <onmstes...@zoho.com>
Subject: Re: node replacement failed
Date: Mon, 10 Sep 2018 12:42:48 GMT
Thanks Alain,

First, here is more detail about my cluster: 10 racks, with 3 nodes on each rack. nodetool status shows 27 nodes UN, and 3 nodes, all in a single rack, DN. Version: 3.11.2.

> Option 1: (Change schema and) use replace method (preferred method)
> * Did you try to let the replace run, without any former repairs, ignoring the fact that 'system_traces' might be inconsistent? You probably don't care about this keyspace, so if Cassandra allows it with some of the nodes down, going this way is probably relatively safe. I really do not see what you could lose that matters in this keyspace.
> * Another option, if the first schema change was accepted, is to make the second one, to drop this keyspace. You can always rebuild it in case you need it, I assume.

I would really love to let the replace run, but it stops with this error:

java.lang.IllegalStateException: unable to find sufficient sources for streaming range in keyspace system_traces

Also, I could delete system_traces, which is empty anyway, but there are system_auth and system_distributed keyspaces too, and they are not empty. Could I delete them safely as well? If I could just somehow skip streaming the system keyspaces during the node replace phase, option 1 would be great.
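(For reference, the current replication settings of these keyspaces can be checked from cqlsh before deciding anything; this is a sketch reading the standard system_schema table in 3.x, nothing cluster-specific assumed:)

    -- list the replication strategy and RF of the system keyspaces in question
    SELECT keyspace_name, replication
    FROM system_schema.keyspaces
    WHERE keyspace_name IN ('system_traces', 'system_auth', 'system_distributed');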
P.S.: It's clear to me that I should use at least RF=3 in production, but I could not manage to acquire enough resources yet (I hope this will be fixed in the near future).

Again, thank you for your time.

Sent using Zoho Mail

---- On Mon, 10 Sep 2018 16:20:10 +0430 Alain RODRIGUEZ <arodrime@gmail.com> wrote ----

Hello,
I am sorry it took us (the community) more than a day to answer this rather critical situation. That being said, my recommendation at this point would be for you to make sure about the impact of whatever you try. Working on a broken cluster as an emergency might lead you to a second mistake, possibly more destructive than the first one. It has happened to me and to people around me, on many clusters. As a general piece of advice: move forward even more carefully in these situations.

> Suddenly I lost all disks of cassandra-data on one of my racks
With RF=2, I guess operations use LOCAL_ONE consistency; thus, with your configuration, you should have all the data in the safe rack(s). You probably did not lose anything yet, and the service keeps running using only the nodes that are up, which hold the right data.

> tried to replace the nodes with same ip using this:
> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html
As a side note, I would recommend you to use 'replace_address_first_boot' instead of 'replace_address'. It does basically the same thing but is ignored after the first bootstrap. A detail, but hey, it's there and somewhat safer; I would use this one.
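(For reference, a sketch of how this option is typically passed, as a JVM system property on the replacement node before its first start; the IP below is a placeholder for the dead node's address:)

    # cassandra-env.sh on the replacement node; 10.20.30.40 stands in
    # for the dead node's IP address
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.20.30.40"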
> java.lang.IllegalStateException: unable to find sufficient sources for streaming range in keyspace system_traces

By default, non-user keyspaces use 'SimpleStrategy' and a small RF. Ideally, this should be changed in a production cluster, and you're seeing an example of why.
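(A sketch of that change from cqlsh; 'dc1' is a placeholder for whatever data-center name 'nodetool status' reports for your cluster:)

    -- give a system keyspace a topology-aware, per-DC replication setting
    ALTER KEYSPACE system_traces
        WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 2};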
> Now when I altered the system_traces keyspace strategy to NetworkTopologyStrategy and RF=2, running nodetool repair failed: Endpoint not alive /IP of dead node that I'm trying to replace.
By changing the replication strategy, you made the dead rack the owner of part of the token ranges, so repairs just can't work: one of the nodes involved will always be down while the whole rack is down. Repair won't work, but you probably do not need it! 'system_traces' is a temporary / debug keyspace. It's probably empty or holds irrelevant data.

Here are some thoughts:

* It would be awesome at this point, for us (and for you, if you have not done it yet), to see the status of the cluster (commands below):
** 'nodetool status'
** 'nodetool describecluster' --> This one will tell whether the nodes that are up agree on the schema. I have seen schema changes with nodes down inducing some issues.
** Cassandra version
** Number of racks (I assume #racks >= 2 in this email)
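(These are read-only commands, safe to run on any live node:)

    nodetool status           # UN/DN state per node, with rack placement
    nodetool describecluster  # schema versions; live nodes should all agree
    nodetool version          # Cassandra release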
Option 1: (Change schema and) use replace method (preferred method)

* Did you try to let the replace run, without any former repairs, ignoring the fact that 'system_traces' might be inconsistent? You probably don't care about this keyspace, so if Cassandra allows it with some of the nodes down, going this way is probably relatively safe. I really do not see what you could lose that matters in this keyspace.
* Another option, if the first schema change was accepted, is to make the second one, to drop this keyspace. You can always rebuild it in case you need it, I assume.
Option 2: Remove all the dead nodes (try to avoid this option 2; if option 1 works, it is better).

Please do not take and apply this as-is. It's a thought on how you could get rid of the issue, yet it's rather brutal and risky; I did not consider it deeply and have no clue about your architecture and context. Consider it carefully on your side.

* You can 'nodetool removenode' each of the dead nodes. This will have nodes streaming data around, and the rack isolation guarantee will no longer hold. It's hard to reason about what would happen to the data and in terms of streaming.
* Alternatively, if you don't have enough space, you can even 'force' the 'nodetool removenode'. See the documentation. Forcing it will prevent streaming and remove the node (token ranges are handed over, but not the data). If that does not work, you can use the 'nodetool assassinate' command as well (commands sketched after this list).
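(For reference, the shape of those commands in 3.11; the host ID comes from 'nodetool status', and the IP below is a placeholder:)

    nodetool removenode <host-id>      # removal of a dead node, with streaming
    nodetool removenode force          # force-complete a stuck removal, no streaming
    nodetool assassinate 10.20.30.40   # last resort: evict the node from gossip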
When adding nodes back to the broken DC, the first nodes will probably take 100% of the ownership, which is often too much. You can consider adding back all the nodes with 'auto_bootstrap: false', then repairing them once they have their final token ownership, the same way we do when building a new data center (sketched below).
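(A sketch of that sequence, assuming all nodes rejoin with their final token ownership before the repair:)

    # cassandra.yaml on each re-added node, before starting it:
    auto_bootstrap: false

    # then, once every node is back and owns its final ranges, per node:
    nodetool repair -pr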
This option is not really clean and has some caveats that you need to consider before starting, as there are token range movements and nodes available that do not have the data. Yet this should work. I imagine it would work nicely with RF=3 and QUORUM; with RF=2 (if you have 2+ racks) I guess it should work as well, but you will have to pick between availability and consistency while repairing the data. Be aware that read requests hitting these nodes will not find data! Plus, you are using RF=2. Thus, using a consistency level of 2+ (TWO, QUORUM, ALL) for at least one of reads or writes is needed to preserve consistency while re-adding the nodes in this case. Otherwise, reads will not detect the mismatch with certainty and might show inconsistent data until the nodes are repaired.
I must say that I really prefer odd values for the RF, starting with RF=3. Using RF=2, you will have to pick: consistency or availability. With a consistency level of ONE everywhere, the service is available, with no single point of failure. Using anything bigger than this, for writes or reads, brings consistency but creates single points of failure (actually, any node becomes a point of failure). RF=3 with QUORUM for both writes and reads somewhat takes the best of both worlds (see the arithmetic below). The tradeoff with RF=3 and quorum reads is the latency increase and the resource usage.
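(The quorum arithmetic behind this, spelled out:)

    quorum = floor(RF / 2) + 1
    RF=2 -> quorum = 2 of 2 replicas: any replica down blocks QUORUM
    RF=3 -> quorum = 2 of 3 replicas: one replica may be down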
Maybe there is a better approach, I am not too sure, but I think I would try option 1 first in any case. It's less destructive and less risky: no token range movements, no empty nodes available. I am not sure about the limitations you might face, though, and that's why I suggest a second option for you to consider if the first is not actionable.

Let us know how it goes,

C*heers,

-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Mon, 10 Sep 2018 at 09:09, onmstester onmstester <onmstester@zoho.com> wrote:

Any idea?

Sent using Zoho Mail

---- On Sun, 09 Sep 2018 11:23:17 +0430 onmstester onmstester <onmstester@zoho.com> wrote ----
Hi,

Cluster spec:
30 nodes
RF = 2
NetworkTopologyStrategy
GossipingPropertyFileSnitch + rack aware

Suddenly I lost all disks of cassandra-data on one of my racks. After replacing the disks, I tried to replace the nodes with the same IPs using this:
https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html

Starting the to-be-replaced node fails with:

java.lang.IllegalStateException: unable to find sufficient sources for streaming range in keyspace system_traces

The problem is that I had not changed the default replication config for the system keyspaces. Now I have altered the system_traces keyspace strategy to NetworkTopologyStrategy and RF=2, but then running nodetool repair failed: Endpoint not alive /IP of dead node that I'm trying to replace.

What should I do now? Can I just remove the previous nodes, change the dead nodes' IPs, and re-join them to the cluster?

Sent using Zoho Mail