cassandra-user mailing list archives

From Maxime <>
Subject replication_factor mismatch
Date Tue, 11 Nov 2014 16:36:15 GMT
Hello, I have a curious behaviour occurring.

- 7-node cluster
- RF on the Keyspace is 3
- Latest version of everything (C* and Python Drivers)
- All queries are at QUORUM level

Some of my larger queries are timing out, which is OK; it can happen. But
looking at the log, I see the following:

ReadTimeout: code=1200 [Timeout during read request] message="Operation
timed out - received only 2 responses." info={'received_responses': 2,
'data_retrieved': True, 'required_responses': 3, 'consistency': 5}

The part confusing me is the "consistency" field: it says 5, while I would
normally expect 3 (the RF). I received 2 responses, which should be enough
for a quorum at RF 3 (quorum = floor(3/2) + 1 = 2).
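
To spell out the arithmetic I am relying on, here is a minimal sketch of the
quorum calculation as I understand it (a majority of the replicas). The
function name is my own, not something from the driver:

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


rf = 3          # replication factor of the keyspace
received = 2    # 'received_responses' from the error above

needed = quorum(rf)
print(f"RF={rf}: quorum needs {needed}, received {received}, "
      f"satisfied={received >= needed}")
```

By this reckoning 2 responses should satisfy QUORUM at RF 3, yet the error
reports 'required_responses': 3.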

Why is the consistency 5? Could it be because the data is actually located
physically on 5 nodes despite the RF of 3? I ask about this possibility
because I know for a fact my cluster is not in a good repaired state. My
attempts at repairing resulted in different OOMs and extreme numbers of
SSTables (which I assume are all remnants of my previous issues with C* and
Secondary Indexes (!!!)). I've had to reboot the nodes after each attempt
and do a cleanup, but something tells me things are still messed up.

Is the python driver somehow automatically determining where the data is
located (despite the RF being different) and using this number instead of
the RF in the Quorum computation?
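
One thing that may be worth checking: as far as I understand the native
protocol, the "consistency" value in the error is a numeric code for the
consistency level, not a count of nodes. Below is a rough sketch that decodes
the info dict from the log, assuming the standard protocol codes. The lookup
table and the describe() helper are hypothetical, written for illustration,
and are not part of the driver:

```python
# Numeric consistency codes as defined by the CQL native protocol
# (assumption: the driver reports the raw protocol code in 'consistency').
CONSISTENCY_CODES = {
    0: "ANY",
    1: "ONE",
    2: "TWO",
    3: "THREE",
    4: "QUORUM",
    5: "ALL",
    6: "LOCAL_QUORUM",
    7: "EACH_QUORUM",
    8: "SERIAL",
    9: "LOCAL_SERIAL",
    10: "LOCAL_ONE",
}

# The info dict copied from the ReadTimeout in the log.
info = {
    "received_responses": 2,
    "data_retrieved": True,
    "required_responses": 3,
    "consistency": 5,
}


def describe(info: dict) -> str:
    """Turn the timeout info into a readable summary (hypothetical helper)."""
    level = CONSISTENCY_CODES.get(info["consistency"], "UNKNOWN")
    return (f"{level}: received {info['received_responses']} "
            f"of {info['required_responses']} required")


print(describe(info))
```

If that mapping holds, 5 would stand for ALL, which would be consistent with
'required_responses': 3 at RF 3 and would suggest these queries are not
actually being sent at QUORUM.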
