Date: Wed, 22 May 2013 20:12:21 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-5272) Hinted Handoff Throttle based on cluster size

     [ https://issues.apache.org/jira/browse/CASSANDRA-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5272:
--------------------------------------

    Attachment: 5272.txt

Makes sense to me. Patch attached against 1.2.

> Hinted Handoff Throttle based on cluster size
> ---------------------------------------------
>
>                 Key: CASSANDRA-5272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5272
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.1
>            Reporter: Rick Branson
>            Priority: Minor
>              Labels: lhf
>             Fix For: 2.0
>
>         Attachments: 5272.txt
>
>
> For a 12-node EC2 m1.xlarge cluster, restarting a node causes it to get completely overloaded with the default 2-thread, 1024KB setting in 1.2.x. This seemed to be a smaller problem when it was 6-nodes, but still required us to abort handoffs. The old defaults in 1.1.x were WAY more conservative. I've dropped this way down to 128KB on our production cluster which is really conservative, but appears to have solved it. The default seems way too high on any cluster that is non-trivial in size.
> After putting some thought to this, it seems that this should really be based on cluster size, making the throttle a "target" for how much write load a single node can swallow. As the cluster grows, the amount of hints that can be delivered by each other node in the cluster goes down, so the throttle should self-adjust to take that into account.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
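
For illustration only, the scaling idea in the issue description can be sketched as below. This is not the attached 5272.txt patch and not Cassandra's API: the class name, the method name, and the choice to divide by (cluster size - 1) are assumptions, made on the reading that every other node may be delivering hints to the restarted node at the same time. The configured value is treated as the total hint load one recovering node should absorb, and each sender gets an equal share.

```java
// Hypothetical sketch; names and formula are assumptions, not Cassandra code.
final class HintThrottleSketch {
    private HintThrottleSketch() {}

    /**
     * Splits a per-target throttle (KB/s) evenly across the nodes that may
     * be delivering hints to one recovering node at the same time.
     *
     * @param targetKbPerSec total hint load the receiving node should absorb
     * @param clusterSize    number of nodes in the cluster, receiver included
     * @return throttle each sending node should apply, in KB/s
     */
    static int perSenderThrottleKb(int targetKbPerSec, int clusterSize) {
        if (targetKbPerSec < 0) {
            throw new IllegalArgumentException("targetKbPerSec must be >= 0");
        }
        if (clusterSize < 1) {
            throw new IllegalArgumentException("clusterSize must be >= 1");
        }
        // Assumption: every node except the receiver may send hints at once.
        // A single-node cluster has no senders; avoid dividing by zero.
        int senders = Math.max(1, clusterSize - 1);
        return targetKbPerSec / senders; // integer division rounds down
    }

    public static void main(String[] args) {
        int targetKbPerSec = 1024; // the 1.2.x default mentioned in the issue
        for (int nodes : new int[] {2, 6, 12}) {
            System.out.println(nodes + " nodes: "
                    + perSenderThrottleKb(targetKbPerSec, nodes)
                    + " KB/s per sender");
        }
    }
}
```

Under these assumptions, a 12-node cluster with a 1024KB target gives each sender 93 KB/s, which is in the same range as the 128KB value the reporter settled on by hand.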