incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Mutation dropped
Date Mon, 25 Feb 2013 02:14:00 GMT
> Aaron, what did you mean by RF 3 CL QUORUM being a more real-world scenario?
> If there are only 2 nodes, where will the third replica be placed?
You would need a 3rd node. 

Running with less than RF 3 means you cannot have both a strongly consistent system and fault tolerance.
See http://thelastpickle.com/2011/06/13/Down-For-Me/

> By increasing the CL, won't it decrease the read/write performance and thus increase the TimedOutExceptions in the case mentioned?
Sort of.
It will take the cluster as a whole more effort to process a read at QUORUM with RF 3 than at RF 2 and CL 1.
But I would not recommend using RF 2 and CL 1 unless you have a good understanding of how that affects your consistency and availability.

If you are just starting out with Cassandra, RF 3 and CL QUORUM is the best approach IMHO.
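The arithmetic behind this advice can be sketched in a few lines (illustrative Python only; the helper names are mine, not Cassandra APIs):

```python
# Sketch (not Cassandra code): the quorum math behind the RF/CL advice above.

def quorum(rf):
    """Replicas that must respond at CL QUORUM for a given replication factor."""
    return rf // 2 + 1

def strongly_consistent(rf, write_cl, read_cl):
    """Reads see the latest write when the read and write replica sets must overlap."""
    return write_cl + read_cl > rf

def tolerates_node_down(rf, cl):
    """A request at CL `cl` can still succeed with one replica down."""
    return rf - 1 >= cl

# RF 3 with QUORUM (2 of 3) reads and writes: consistent AND tolerant of one down node.
print(quorum(3))                             # 2
print(strongly_consistent(3, 2, 2))          # True
print(tolerates_node_down(3, 2))             # True

# RF 2 with CL 1: available, but reads are not guaranteed to see the latest write.
print(strongly_consistent(2, 1, 1))          # False
```

This is why RF 3 + QUORUM is the usual starting point: it is the smallest configuration where both checks come out true at once.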


Cheers
  
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/02/2013, at 6:41 AM, Víctor Hugo Oliveira Molinar <vhmolinar@gmail.com> wrote:

> Aaron, what did you mean by RF 3 CL QUORUM being a more real-world scenario?
> If there are only 2 nodes, where will the third replica be placed?
> By increasing the CL, won't it decrease the read/write performance and thus increase the TimedOutExceptions in the case mentioned?
> 
> 
> On Fri, Feb 22, 2013 at 1:59 PM, aaron morton <aaron@thelastpickle.com> wrote:
> If you are running repair, using QUORUM, and there are no dropped writes, you should not be getting DigestMismatch during reads.
> 
> If everything else looks good but the request latency is higher than the CF latency, I would check that client load is evenly distributed. Then start looking to see if the request throughput is at its maximum for the cluster.
> 
> Cheers
>   
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 22/02/2013, at 8:15 PM, Wei Zhu <wz1975@yahoo.com> wrote:
> 
>> Thanks Aaron for the great information, as always. I just checked cfhistograms and only a handful of read latencies are bigger than 100ms, but in proxyhistograms ten times as many are greater than 100ms. We are using QUORUM for reads with RF=3, and I understand the coordinator needs to get the digest from other nodes and read repair on a mismatch, etc. But is it normal to see the latency from proxyhistograms go beyond 100ms? Is there any way to improve that?
>> We are tracking the metrics from the client side, and we see the 95th percentile response time averaging 40ms, which is a bit high. Our 50th percentile is great, under 3ms.
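For reference, client-side percentiles like the 50th and 95th above are typically computed from raw latency samples along these lines (an illustrative nearest-rank sketch; the sample values are invented):

```python
import math

# Sketch: nearest-rank percentile over a window of request latencies (ms).
def percentile(samples, pct):
    """Return the value at or below which pct% of the sorted samples fall."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))   # nearest-rank method
    return ordered[max(rank, 1) - 1]

# Mostly-fast requests with one slow outlier: the p50 stays low while
# the tail drags the p95 up -- the same shape Wei describes.
latencies_ms = [2, 3, 2, 4, 3, 2, 5, 40, 3, 2]
print(percentile(latencies_ms, 50))   # 3
print(percentile(latencies_ms, 95))   # 40
```

A low median with a high 95th percentile like this usually points at a tail problem (GC pauses, a slow replica, digest mismatches) rather than uniformly slow reads.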
>> 
>> Any suggestion is very much appreciated.
>> 
>> Thanks.
>> -Wei
>> 
>> ----- Original Message -----
>> From: "aaron morton" <aaron@thelastpickle.com>
>> To: "Cassandra User" <user@cassandra.apache.org>
>> Sent: Thursday, February 21, 2013 9:20:49 AM
>> Subject: Re: Mutation dropped
>> 
>>> What does rpc_timeout control? Only the reads/writes? 
>> Yes. 
>> 
>>> like data stream,
>> streaming_socket_timeout_in_ms in the yaml
>> 
>>> merkle tree request? 
>> Either no timeout or a number of days; I cannot remember which right now.
>> 
>>> What is the side effect if it's set to a really small number, say 20ms?
>> You will probably get a lot more requests that fail with a TimedOutException. 
>> 
>> rpc_timeout needs to be longer than the time it takes a node to process the message, plus the time it takes the coordinator to do its thing. You can look at cfhistograms and proxyhistograms to get a better idea of how long a request takes in your system.
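One rough way to apply that rule of thumb (a sketch only; `suggest_rpc_timeout_ms` is my own helper, not a Cassandra setting, and the histogram figure is invented):

```python
# Sketch: size rpc_timeout from observed end-to-end (proxyhistograms) latency,
# not just local CF latency, with headroom for coordinator work and jitter.
def suggest_rpc_timeout_ms(proxy_p99_ms, safety_factor=3.0):
    """Pick a timeout comfortably above the observed 99th percentile latency."""
    return proxy_p99_ms * safety_factor

# e.g. proxyhistograms shows a 99th percentile of 120 ms end to end:
print(suggest_rpc_timeout_ms(120))   # 360.0
```

By this yardstick, the 20ms floated in the question would sit far below what requests actually take and would mostly manufacture TimedOutExceptions, while the 10-second default leaves a very generous margin.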
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 21/02/2013, at 6:56 AM, Wei Zhu <wz1975@yahoo.com> wrote:
>> 
>>> What does rpc_timeout control? Only the reads/writes? How about other inter-node communication, like data streams and merkle tree requests? What is a reasonable value for rpc_timeout? The default value of 10 seconds is way too long. What is the side effect if it's set to a really small number, say 20ms?
>>> 
>>> Thanks.
>>> -Wei
>>> 
>>> From: aaron morton <aaron@thelastpickle.com>
>>> To: user@cassandra.apache.org 
>>> Sent: Tuesday, February 19, 2013 7:32 PM
>>> Subject: Re: Mutation dropped
>>> 
>>>> Does the rpc_timeout not control the client timeout ?
>>> No, it is how long a node will wait for responses from other nodes before raising a TimedOutException if fewer than CL nodes have responded.
>>> Set the client side socket timeout using your preferred client. 
>>> 
>>>> Is there any param which is configurable to control the replication timeout between nodes?
>>> There is no such thing.
>>> rpc_timeout is roughly like that, but it's not right to think about it that way.
>>> i.e. if a message to a replica times out but CL nodes have already responded, then we are happy to call the request complete.
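The coordinator behaviour described here can be sketched as a tiny decision function (illustrative only, not Cassandra's actual code path):

```python
# Sketch: a coordinator calls the request complete once CL replicas have
# acknowledged, even if another replica later times out.
def coordinator_result(replica_outcomes, cl):
    """replica_outcomes: list of True (acked in time) / False (timed out)."""
    acks = sum(1 for ok in replica_outcomes if ok)
    return "success" if acks >= cl else "TimedOutException"

# RF 3, CL QUORUM (2): one slow or dead replica does not fail the request.
print(coordinator_result([True, True, False], cl=2))    # success
# Only one ack arrives in time at CL 2: the client gets a TimedOutException.
print(coordinator_result([True, False, False], cl=2))   # TimedOutException
```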
>>> 
>>> Cheers
>>> 
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> New Zealand
>>> 
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 19/02/2013, at 1:48 AM, Kanwar Sangha <kanwar@mavenir.com> wrote:
>>> 
>>>> Thanks Aaron.
>>>> 
>>>> Does the rpc_timeout not control the client timeout? Is there any param which is configurable to control the replication timeout between nodes? Or is the same param used to control that, since the other node is also like a client?
>>>> 
>>>> 
>>>> 
>>>> From: aaron morton [mailto:aaron@thelastpickle.com] 
>>>> Sent: 17 February 2013 11:26
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: Mutation dropped
>>>> 
>>>> You are hitting the maximum throughput on the cluster. 
>>>> 
>>>> The messages are dropped because the node fails to start processing them before rpc_timeout.
>>>> 
>>>> However the request is still a success because the client-requested CL was achieved.
>>>> 
>>>> Testing with RF 2 and CL 1 really just tests the disks on one local machine. Both nodes replicate each row, and writes are sent to each replica, so the only thing the client is waiting on is the local node writing to its commit log.
>>>> 
>>>> Testing with (and running in prod) RF 3 and CL QUORUM is a more real-world scenario.
>>>> 
>>>> Cheers
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> New Zealand
>>>> 
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 15/02/2013, at 9:42 AM, Kanwar Sangha <kanwar@mavenir.com> wrote:
>>>> 
>>>> 
>>>> Hi – Is there a parameter which can be tuned to prevent the mutations from being dropped? Is this logic correct?
>>>> 
>>>> Nodes A and B with RF=2, CL=1, load balanced between the two.
>>>> 
>>>> --  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
>>>> UN  10.x.x.x   746.78 GB  256     100.0%            dbc9e539-f735-4b0b-8067-b97a85522a1a  rack1
>>>> UN  10.x.x.x   880.77 GB  256     100.0%            95d59054-be99-455f-90d1-f43981d3d778  rack1
>>>> 
>>>> Once we hit a very high TPS (around 50k/sec of inserts), the nodes start falling behind and we see the mutation dropped messages, but there are no failures on the client. Does that mean the other node is not able to persist the replicated data? Is there some timeout associated with replicated data persistence?
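One way to picture what is happening here (a toy overload model, not Cassandra's actual scheduler; the throughput numbers are invented):

```python
# Sketch: a node sheds mutations under overload. Work that queues up longer
# than rpc_timeout is dropped rather than applied, because the coordinator
# has already given up waiting on it -- yet the client still saw success,
# since CL 1 was satisfied by the other replica.
def dropped_mutations(incoming_per_sec, capacity_per_sec, seconds):
    """Toy model: everything beyond what the node can apply in time is dropped."""
    backlog_growth = max(0, incoming_per_sec - capacity_per_sec)
    return backlog_growth * seconds

# 50k inserts/sec against a node that can only apply 40k/sec, for 10 seconds:
print(dropped_mutations(50_000, 40_000, 10))   # 100000
# Below capacity, nothing is shed:
print(dropped_mutations(30_000, 40_000, 10))   # 0
```

The dropped replicas are later reconciled by read repair, hinted handoff, or anti-entropy repair, which is why the cluster stays usable but temporarily inconsistent.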
>>>> 
>>>> Thanks,
>>>> Kanwar
>>>> 
>>>> From: Kanwar Sangha [mailto:kanwar@mavenir.com] 
>>>> Sent: 14 February 2013 09:08
>>>> To: user@cassandra.apache.org
>>>> Subject: Mutation dropped
>>>> 
>>>> Hi – I am doing a load test using YCSB across 2 nodes in a cluster and seeing a lot of mutation dropped messages. I understand that this is due to the replica not being written to the other node? RF=2, CL=1.
>>>> 
>>>> From the wiki -
>>>> For MUTATION messages this means that the mutation was not applied to all replicas it was sent to. The inconsistency will be repaired by Read Repair or Anti Entropy Repair.
>>>> 
>>>> Thanks,
>>>> Kanwar
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 

