cassandra-user mailing list archives

From Martin Xue <martin...@gmail.com>
Subject What happened about one node in cluster is down?
Date Fri, 02 Aug 2019 07:21:46 GMT
Hello,

I am currently running into a production issue and am seeking help from the
community.

Can anyone help with the following questions about a node that is down
inside a Cassandra cluster?

Case:
Cassandra 3.0.14
3 nodes (A, B, C) in DC1, 3 nodes (D, E, F) in DC2 forming one cluster

keyspace_m: replication factor is 2 in both DC1 and DC2

application_z: read and write consistency are both LOCAL_QUORUM


Issue:
Node A in DC1 has crashed and has been down for more than 24 hours
(outside the default 3-hour hint window).
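
For reference, the quorum arithmetic behind this setup can be sketched as
follows. This is a minimal illustration, assuming the usual definition of
quorum as floor(RF / 2) + 1; the function and constant names are
hypothetical and are not part of any Cassandra API.

```python
# Sketch of the quorum arithmetic for the cluster described above.
# Assumption: quorum = floor(RF / 2) + 1, applied per datacenter for
# LOCAL_QUORUM. All names here are illustrative only.

def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a quorum."""
    return replication_factor // 2 + 1


def local_quorum_met(replication_factor: int, live_local_replicas: int) -> bool:
    """True if enough replicas in the local DC are alive for LOCAL_QUORUM."""
    return live_local_replicas >= quorum(replication_factor)


RF_PER_DC = 2            # keyspace_m replication factor in each DC
HINT_WINDOW_HOURS = 3    # default hint window (3 hours)
DOWNTIME_HOURS = 24      # node A has been down at least this long

print("quorum for RF=2:", quorum(RF_PER_DC))
print("LOCAL_QUORUM with 1 live replica:", local_quorum_met(RF_PER_DC, 1))
print("LOCAL_QUORUM with 2 live replicas:", local_quorum_met(RF_PER_DC, 2))
print("downtime exceeds hint window:", DOWNTIME_HOURS > HINT_WINDOW_HOURS)
```

With RF 2 per DC the quorum is 2, so a partition whose DC1 replicas include
node A has only one live local replica while A is down, whereas a partition
replicated on B and C still has both.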

Questions:
1. For old data on node A, will the data be re-synced to node B or C now
that node A is down?
2. For new data, when application_z writes, will the data always be written
to the only two running nodes (B and C) in DC1, or will the write fail if
it still tries to write to node A?
3. When application_z reads, will it fail (for old data written before node
A crashed, and for new data written after the crash)? Will the data be
replicated from A to B or C?
4. What is the best strategy in this scenario?
5. Should I bring node A back up and run repair on all the nodes (A, B, C,
D, E, F)?
(A potential issue: repair may cause a crash similar to the one on node A,
and there is a large 1 TB keyspace to repair.)
6. Should I simply decommission node A and add a new node F to DC1 in the
cluster?


Your help would be appreciated.

Thanks
Regards
Martin
