cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: Read Consistency
Date Tue, 30 Jun 2015 17:27:55 GMT
Agreed, Tyler. I think it's our application's problem. If the client reports a failed write in spite
of retries, the application must have a rollback mechanism to make sure the old state is restored. A
write may fail because the CL was not met even though one node wrote successfully. Cassandra
won't do cleanup or rollback on that node, so you need to do it yourself to make sure that data
integrity is maintained when strong consistency is a requirement. Right?


We use Hector, by the way, and are planning to switch to the CQL driver.



Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:"Tyler Hobbs" <tyler@datastax.com>
Date:Tue, 30 Jun, 2015 at 10:42 pm
Subject:Re: Read Consistency


I think these scenarios are still possible even when we are writing at QUORUM, if we have
dropped mutations in our cluster.

It was very strange in our case. We had RF=3 and READ/WRITE CL=QUORUM. We had dropped mutations
for a long time, but we never faced anything like scenario 1, where a READ went to nodes 2 and
3 and the read didn't return any data. Any comments on this are welcome.


They are not possible if you write at QUORUM, because QUORUM guarantees that at least two
of the nodes will have the most recent version of the data.  If fewer than two replicas respond
successfully (meaning two replicas dropped mutations), you will get an error on the write.
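
The overlap argument can be checked with a small sketch. It assumes QUORUM means
floor(RF / 2) + 1 replicas and that both the write and the read actually succeeded at QUORUM;
the function names are illustrative and not part of any driver.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Replicas that must acknowledge at QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def quorums_always_overlap(rf: int) -> bool:
    """Check that every possible write quorum shares a replica with every read quorum."""
    replicas = range(rf)
    q = quorum(rf)
    return all(
        bool(set(write_set) & set(read_set))
        for write_set in combinations(replicas, q)
        for read_set in combinations(replicas, q)
    )


print(quorum(3))                  # 2
print(quorums_always_overlap(3))  # True
```

The guarantee covers only writes that were acknowledged at QUORUM. A write that returned an
error may still have landed on one replica, and the check above says nothing about that case.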

All of the drivers and cqlsh default to consistency level ONE, so I would double check that
your application is setting the consistency level correctly.

 


On Sun, Jun 28, 2015 at 12:55 PM, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:

Sorry for the typo in your name, Owen!


Anuj


From:"Anuj Wadehra" <anujw_2003@yahoo.co.in>
Date:Sun, 28 Jun, 2015 at 11:11 pm
Subject:Re: Read Consistency

Agreed, Owem! The response in both scenarios would depend on the 2 replicas chosen to meet
QUORUM. But the intent is to get the tricky part of scenario 1 answered, i.e. when the 2 nodes
selected are one with data and one without.


As per my understanding of the read path and the documentation at https://wiki.apache.org/cassandra/ArchitectureInternals:

1. Data would be read from the closest node, and a digest would be received from one more replica.

2. If a digest mismatch is found, a blocking read happens on the same 2 replicas (not all
replicas). So in scenario 2, if 2 nodes didn't have the latest data and the third node has it,
stale data would still be returned.
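
The two steps above can be sketched roughly as follows. This is a toy model of my
understanding, not Cassandra's code: the Row type, the SHA-256 digest and the function names
are all assumptions made for illustration.

```python
import hashlib
from typing import Optional, Tuple

# A replica's copy of a row: (value, write timestamp), or None if it has no data.
Row = Optional[Tuple[str, int]]


def digest(row: Row) -> str:
    """Hash of a replica's row, standing in for the digest response."""
    return hashlib.sha256(repr(row).encode()).hexdigest()


def newest(a: Row, b: Row) -> Row:
    """Reconcile two copies: the one with the higher timestamp wins."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a[1] >= b[1] else b


def quorum_read(data_replica: Row, digest_replica: Row) -> Row:
    """Read at QUORUM with RF=3: one data read plus one digest read."""
    if digest(data_replica) == digest(digest_replica):
        return data_replica  # digests match, return the data as-is
    # Mismatch: read full data from the same two replicas and reconcile.
    return newest(data_replica, digest_replica)


# Scenario 2 with a third replica holding ("v3", 3) that is never contacted:
print(quorum_read(("v1", 1), ("v2", 2)))  # ('v2', 2)
```

In this model the third replica never takes part in the read, which is why the value
returned can be older than the newest copy in the cluster.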


I think these scenarios are still possible even when we are writing at QUORUM, if we have
dropped mutations in our cluster.

It was very strange in our case. We had RF=3 and READ/WRITE CL=QUORUM. We had dropped mutations
for a long time, but we never faced anything like scenario 1, where a READ went to nodes 2 and
3 and the read didn't return any data. Any comments on this are welcome.


Thanks for clarifying further, as the discussion could have misled a few.


Thanks

Anuj




On Sunday, 28 June 2015 6:16 AM, Owen Kim <ohechkay@gmail.com> wrote:



Sorry. I have to jump in and disagree. Data is not guaranteed to be returned in scenario 1. Since
two nodes do not have data and two nodes may be the only nodes queried at that CL, the read
query may return data or not.


Similarly, in scenario 2, the query may not return the most recent data because the node with
that data may not be queried at all (the other two may).
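
To make the dependence on the chosen replicas concrete, here is a small sketch that lists
every pair of replicas a QUORUM read could contact and what would come back. The node names,
values and timestamps are made up, and the model assumes the newest timestamp among the
contacted replicas wins.

```python
from itertools import combinations


def read_outcomes(replicas):
    """Map each pair of replicas that could serve a QUORUM read to the value returned."""
    outcomes = {}
    for pair in combinations(sorted(replicas), 2):
        rows = [replicas[name] for name in pair if replicas[name] is not None]
        # Newest timestamp among the contacted replicas wins; None if neither has data.
        outcomes[pair] = max(rows, key=lambda row: row[1])[0] if rows else None
    return outcomes


# Each replica holds (value, timestamp), or None when it has no data for the key.
scenario_1 = {"node1": ("value", 10), "node2": None, "node3": None}
scenario_2 = {"node1": ("old", 1), "node2": ("newer", 2), "node3": ("newest", 3)}

for pair, value in read_outcomes(scenario_1).items():
    print("scenario 1", pair, value)
for pair, value in read_outcomes(scenario_2).items():
    print("scenario 2", pair, value)
```

In scenario 1 the pair (node2, node3) returns nothing, and in scenario 2 the pair
(node1, node2) returns "newer" rather than "newest".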


Keep in mind, these scenarios seem to generally assume you are not consistently writing data
at QUORUM CL, so your reads may be inconsistent.


On Tuesday, June 23, 2015, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:

Thanks. So all of us agree that in scenario 1 data would be returned, and that was my initial
understanding.



Anuj





From:"Anuj Wadehra" <anujw_2003@yahoo.co.in>
Date:Wed, 24 Jun, 2015 at 12:15 am
Subject:Re: Read Consistency

I'm more confused. Different responses. Can anyone explain with 100% certainty?


Thanks

Anuj




From:"arun sirimalla" <arunsirik@gmail.com>
Date:Wed, 24 Jun, 2015 at 12:00 am
Subject:Re: Read Consistency



Thanks, good to know that.


On Tue, Jun 23, 2015 at 11:27 AM, Philip Thompson <philip.thompson@datastax.com> wrote:

Yes, that is what he means. CL is for how many nodes need to respond, not agree.


On Tue, Jun 23, 2015 at 2:26 PM, arun sirimalla <arunsirik@gmail.com> wrote:

So do you mean that with CL set to QUORUM, if data is only on one node, the query still succeeds?


On Tue, Jun 23, 2015 at 11:21 AM, Philip Thompson <philip.thompson@datastax.com> wrote:

Anuj,

In the first scenario, the data from the single node holding data is returned. The query will
not fail if the consistency level is met, even if the read was inconsistent.
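
A rough sketch of that rule: the read fails only when too few replicas respond, not when
the replicas that do respond disagree. The NotEnoughReplicas exception and read_at_quorum
function are hypothetical names for illustration; real clients report this condition with
their own error types.

```python
class NotEnoughReplicas(Exception):
    """Hypothetical error raised when fewer replicas respond than the CL requires."""


def read_at_quorum(responses, rf=3):
    """Resolve a QUORUM read from the replicas that answered.

    Each response is (value, timestamp), or None if that replica has no data.
    """
    required = rf // 2 + 1
    if len(responses) < required:
        raise NotEnoughReplicas(f"{len(responses)} responded, {required} required")
    rows = [row for row in responses if row is not None]
    # Responses do not have to agree; the newest timestamp wins.
    return max(rows, key=lambda row: row[1])[0] if rows else None


print(read_at_quorum([("value", 10), None]))  # value
print(read_at_quorum([None, None]))           # None
```

Both calls succeed because two replicas answered. Passing a single response would raise
NotEnoughReplicas in this model.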


On Tue, Jun 23, 2015 at 2:16 PM, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:

Why would it fail, and with what Thrift error? What if the data didn't exist on any of the nodes?
The query won't fail if it doesn't find data.


Not convinced.


From:"arun sirimalla" <arunsirik@gmail.com>
Date:Tue, 23 Jun, 2015 at 11:39 pm
Subject:Re: Read Consistency

Scenario 1: A read query is fired for a key; data is found on one node and not found on the other
two nodes that are responsible for the token corresponding to the key.


Your read query will fail, as it expects to receive data from 2 nodes with RF=3.



Scenario 2: Read query is fired and all 3 replicas have different data with different timestamps.


The read query will return the data with the most recent timestamp and trigger a read repair in
the backend.


On Tue, Jun 23, 2015 at 10:57 AM, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:

Hi,


I need to validate my understanding.


RF=3, Read CL = QUORUM


What would be returned to the client in following scenarios:


Scenario 1: A read query is fired for a key; data is found on one node and not found on the other
two nodes that are responsible for the token corresponding to the key.


Options: no data is returned OR data from the only node having data is returned?


Scenario 2: Read query is fired and all 3 replicas have different data with different timestamps.


Options: data with latest timestamp is returned OR something else?


Thanks

Anuj











-- 

Arun 

Senior Hadoop/Cassandra Engineer

Cloudwick



2014 Data Impact Award Winner (Cloudera)

http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html













-- 

Tyler Hobbs
DataStax

