From user-return-24559-apmail-cassandra-user-archive=cassandra.apache.org@cassandra.apache.org Mon Mar 5 17:40:07 2012 Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AD9749546 for ; Mon, 5 Mar 2012 17:40:07 +0000 (UTC) Received: (qmail 68953 invoked by uid 500); 5 Mar 2012 17:40:05 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 68857 invoked by uid 500); 5 Mar 2012 17:40:05 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 68848 invoked by uid 99); 5 Mar 2012 17:40:05 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 05 Mar 2012 17:40:05 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a93.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 05 Mar 2012 17:39:59 +0000 Received: from homiemail-a93.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a93.g.dreamhost.com (Postfix) with ESMTP id 7CA438405C for ; Mon, 5 Mar 2012 09:39:37 -0800 (PST) Received: from [172.16.1.3] (125-236-193-159.adsl.xtra.co.nz [125.236.193.159]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a93.g.dreamhost.com (Postfix) with ESMTPSA id 030D68406F for ; Mon, 5 Mar 2012 09:37:05 -0800 (PST) From: aaron morton Mime-Version: 1.0 (Apple Message framework v1257) Content-Type: 
multipart/alternative; boundary="Apple-Mail=_F2D55DBC-35B2-4574-8F7F-5A68A39F09FD"
Subject: Re: Mutation Dropped Messages
Date: Tue, 6 Mar 2012 06:36:40 +1300
In-Reply-To:
To: user@cassandra.apache.org
References: <802222A2-ECF8-42FA-8767-2B9BBAFEBEAD@thelastpickle.com>
Message-Id: <84113CB7-49F5-409D-B038-6568252A3A87@thelastpickle.com>
X-Mailer: Apple Mail (2.1257)
X-Virus-Checked: Checked by ClamAV on apache.org

--Apple-Mail=_F2D55DBC-35B2-4574-8F7F-5A68A39F09FD
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=windows-1252

> I increased the size of the cluster and also the concurrent_writes parameter. Still there is a node which keeps on dropping the mutation messages.

Ensure all the nodes have the same spec and the same config. In a virtual environment, consider moving the node to another host.

> Is this due to some improper load balancing?

What does nodetool ring say, and what sort of queries (and RF and CL) are you sending?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 3:58 AM, Tiwari, Dushyant wrote:

> Hey Aaron,
>
> I increased the size of the cluster and also the concurrent_writes parameter. Still there is a node which keeps on dropping the mutation messages. The other nodes are not dropping mutation messages. I am using the Hector API and have done nothing for load balancing so far; I just provided the host:port of the nodes in the Cassandrahostconfig. Is this due to some improper load balancing? Also, the physical host where the node is hosted is relatively heavier than the other nodes’ hosts. What can I do to improve?
> PS: The node is a seed of the cluster.
>
> Thanks,
> Dushyant
>
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: Monday, March 05, 2012 4:15 PM
> To: user@cassandra.apache.org
> Subject: Re: Mutation Dropped Messages
>
> 1. Which parameters to tune in the config files? - Especially looking for heavy writes
> The node is overloaded. It may be because there are not enough nodes, or because the node is under temporary stress such as GC or repair.
> If you have spare IO / CPU capacity you could increase concurrent_writes to increase throughput on the write stage. You then need to ensure the commit log and, to a lesser degree, the data volumes can keep up.
>
> 2. What is the difference between TimedOutException and silently dropping mutation messages while operating on a CL of QUORUM.
> A TimedOutException means that the number of nodes required by the CL did not respond to the coordinator before rpc_timeout. Dropping messages happens when a message is removed from the queue in a thread pool after rpc_timeout has elapsed. It is a feature of the architecture, and correct behaviour under stress.
> Inconsistencies created by dropped messages are repaired via reads at high CL, Hinted Handoff (in 1.+), Read Repair or Anti Entropy.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:
>
> Hi All,
>
> While benchmarking Cassandra I found “Mutation Dropped” messages in the logs. Now I know this is a good old question. It would be really great if someone could provide a checklist to recover when such a thing happens. I am looking for answers to the following questions:
>
> 1. Which parameters to tune in the config files? - Especially looking for heavy writes
> 2. What is the difference between TimedOutException and silently dropping mutation messages while operating on a CL of QUORUM.
>
> Regards,
> Dushyant
> NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent permitted under applicable law, to monitor electronic communications. This message is subject to terms available at the following link: http://www.morganstanley.com/disclaimers. If you cannot access these links, please notify us by reply message and we will send the contents to you. By messaging with Morgan Stanley you consent to the foregoing.

--Apple-Mail=_F2D55DBC-35B2-4574-8F7F-5A68A39F09FD
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=windows-1252
--Apple-Mail=_F2D55DBC-35B2-4574-8F7F-5A68A39F09FD--
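
For anyone reading this thread in the archive, the difference between a TimedOutException and a dropped mutation discussed above can be sketched with a toy model. This is only an illustration under stated assumptions, not Cassandra's actual code: the function names, the 10 second timeout value, the single-threaded write stage and the 4 second service time are all made up for the example. It shows one overloaded replica discarding queued mutations that are already older than the timeout, while the coordinator still reports success at QUORUM because the other two replicas answered in time.

```python
from collections import deque

RPC_TIMEOUT_MS = 10_000  # assumed value, for illustration only


class TimedOutException(Exception):
    """Stand-in for the client-visible error; not the real Thrift class."""


def drain_write_queue(enqueued_at_ms, service_time_ms, start_ms=0):
    """Serially process queued mutations on one (hypothetical) replica.

    Returns (applied, dropped): applied maps queue index -> completion time,
    dropped lists the indexes discarded for being older than the timeout.
    """
    queue = deque(enumerate(enqueued_at_ms))
    now = start_ms
    applied, dropped = {}, []
    while queue:
        index, enqueued = queue.popleft()
        if now - enqueued > RPC_TIMEOUT_MS:
            # Nobody is waiting for this reply any more, so skip the work.
            dropped.append(index)
            continue
        now += service_time_ms
        applied[index] = now
    return applied, dropped


def coordinator_write(ack_times_ms, required_acks, sent_at_ms=0):
    """Count replica acks that arrived within the timeout; raise if too few."""
    on_time = [t for t in ack_times_ms
               if t is not None and t - sent_at_ms <= RPC_TIMEOUT_MS]
    if len(on_time) < required_acks:
        raise TimedOutException(
            f"{len(on_time)} of {required_acks} required acks")
    return len(on_time)


# Overloaded replica: five mutations queued at t=0, 4 s of work each.
applied, dropped = drain_write_queue([0] * 5, service_time_ms=4_000)

# RF=3 at QUORUM needs 2 acks. Two healthy replicas ack after 5 ms; the
# overloaded replica never acks mutation 4 because it dropped it.
slow_ack = applied.get(4)  # None, since mutation 4 was dropped
acks = coordinator_write([5, 5, slow_ack], required_acks=2)
```

In this sketch the client sees a successful write even though one replica silently missed it, which is the kind of inconsistency that reads at high CL, Hinted Handoff, Read Repair or Anti Entropy later repair. If only one replica had answered in time, `coordinator_write` would raise `TimedOutException` instead.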