From: Wei Zhu
To: user@cassandra.apache.org
Date: Wed, 20 Feb 2013 09:56:37 -0800 (PST)
Subject: Re: Mutation dropped

What does rpc_timeout control? Only the reads/writes? How about other inter-node communication, like data streaming or merkle tree requests? What is a reasonable value for rpc_timeout? The default value of 10 seconds is way too long. What is the side effect if it's set to a really small number, say 20ms?

Thanks.
-Wei

________________________________
From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Sent: Tuesday, February 19, 2013 7:32 PM
Subject: Re: Mutation dropped

> Does the rpc_timeout not control the client timeout ?
No, it is how long a node will wait for a response from other nodes before raising a TimedOutException if fewer than CL nodes have responded.
Set the client-side socket timeout using your preferred client.

> Is there any param which is configurable to control the replication timeout between nodes ?
There is no such thing. rpc_timeout is roughly like that, but it's not right to think about it that way. i.e. if a message to a replica times out and CL nodes have already responded, then we are happy to call the request complete.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/02/2013, at 1:48 AM, Kanwar Sangha <kanwar@mavenir.com> wrote:

> Thanks Aaron.
>
> Does the rpc_timeout not control the client timeout ? Is there any param which is configurable to control the replication timeout between nodes ? Or is the same param used to control that, since the other node is also like a client ?
>
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: 17 February 2013 11:26
> To: user@cassandra.apache.org
> Subject: Re: Mutation dropped
>
> You are hitting the maximum throughput on the cluster.
>
> The messages are dropped because the node fails to start processing them before rpc_timeout.
>
> However the request is still a success because the client-requested CL was achieved.
>
> Testing with RF 2 and CL 1 really just tests the disks on one local machine. Both nodes replicate each row, and writes are sent to each replica, so the only thing the client is waiting on is the local node to write to its commit log.
>
> Testing with (and running in prod) RF 3 and CL QUORUM is a more real-world scenario.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/02/2013, at 9:42 AM, Kanwar Sangha <kanwar@mavenir.com> wrote:
>
> Hi – Is there a parameter which can be tuned to prevent the mutations from being dropped ? Is this logic correct ?
>
> Node A and B with RF=2, CL=1. Load balanced between the two.
>
> --  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  10.x.x.x   746.78 GB  256     100.0%            dbc9e539-f735-4b0b-8067-b97a85522a1a  rack1
> UN  10.x.x.x   880.77 GB  256     100.0%            95d59054-be99-455f-90d1-f43981d3d778  rack1
>
> Once we hit a very high TPS (around 50k/sec of inserts), the nodes start falling behind and we see the mutation dropped messages. But there are no failures on the client. Does that mean the other node is not able to persist the replicated data ? Is there some timeout associated with replicated data persistence ?
>
> Thanks,
> Kanwar
>
> From: Kanwar Sangha [mailto:kanwar@mavenir.com]
> Sent: 14 February 2013 09:08
> To: user@cassandra.apache.org
> Subject: Mutation dropped
>
> Hi – I am doing a load test using YCSB across 2 nodes in a cluster and seeing a lot of mutation dropped messages. I understand that this is due to the replica not being written to the other node ? RF=2, CL=1.
>
> From the wiki:
> For MUTATION messages this means that the mutation was not applied to all replicas it was sent to. The inconsistency will be repaired by Read Repair or Anti-Entropy Repair.
>
> Thanks,
> Kanwar
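The behaviour Aaron describes, where the client sees success while mutations are still dropped, can be sketched as a toy model. This is not Cassandra's actual code, just an illustration of the coordinator's rule: the request succeeds as soon as CL replicas acknowledge before rpc_timeout, and any replica that misses the deadline drops the mutation for later repair.

```python
def coordinator_write(replica_acks, consistency_level):
    """Toy model (NOT Cassandra's implementation) of a coordinator judging
    a write. replica_acks has one boolean per replica the mutation was sent
    to (True = acked before rpc_timeout, False = timed out).
    Returns (client_success, dropped_mutations)."""
    acked = sum(replica_acks)
    # Client gets success if at least CL replicas responded in time;
    # otherwise it would see a TimedOutException.
    client_success = acked >= consistency_level
    # Replicas that timed out drop the mutation; read repair or
    # anti-entropy repair fixes the inconsistency later.
    dropped = len(replica_acks) - acked
    return client_success, dropped

# RF=2, CL=1 (Kanwar's setup): the local replica acks, the overloaded
# node times out. The client sees success, but one mutation is dropped.
print(coordinator_write([True, False], consistency_level=1))   # (True, 1)

# RF=3, CL=QUORUM (2): two acks are enough; one dropped mutation.
print(coordinator_write([True, True, False], consistency_level=2))  # (True, 1)

# Too few acks before rpc_timeout: the client gets a timeout.
print(coordinator_write([False, False], consistency_level=1))  # (False, 2)
```

The model makes Aaron's point concrete: dropped mutations are about individual replicas missing the rpc_timeout deadline, not about the client request failing.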
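For reference, the setting Wei asks about lives in cassandra.yaml. The names below are a sketch: the single 10-second rpc_timeout_in_ms is the 1.1-era knob, and 1.2 split it into per-operation timeouts, so check the defaults shipped with your version rather than trusting these values.

```yaml
# Cassandra 1.1-era cassandra.yaml: one timeout for coordinator->replica waits.
rpc_timeout_in_ms: 10000

# Cassandra 1.2+ split this into per-operation settings, e.g.:
# read_request_timeout_in_ms: 10000
# write_request_timeout_in_ms: 10000
# request_timeout_in_ms: 10000        # catch-all for other inter-node messages
# streaming_socket_timeout_in_ms: 0   # data streaming is governed separately
```

Note these govern coordinator-to-replica waits, not the client socket timeout, and streaming and repair traffic are not bounded by the plain rpc_timeout.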