Date: Mon, 9 Feb 2015 18:40:35 -0800
From: Cheng Ren
To: user@cassandra.apache.org
Subject: nodetool status shows large numbers of up nodes are down

Hi,

We have a two-DC cluster with 21 nodes in one DC and 27 nodes in the other. Over the past few months, nodetool status has been marking 4-8 nodes as down while they are actually functioning. Today in particular, we noticed that running nodetool status on some nodes reports more nodes down than before, even though those nodes are up and serving requests. For example, one node reports 42 nodes as down.

phi_convict_threshold is set to 12 on all nodes, and we are running Cassandra 2.0.4 on AWS EC2 machines.

Does anyone have recommendations for identifying the root cause of this? Will it have any consequences?

Thanks,
Cheng
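
To make the comparison between nodes concrete, here is a minimal sketch (not something we run in production) of how the per-node views could be collected: it parses the text printed by nodetool status and lists the addresses that the observing node reports as down, grouped by datacenter. The sample text, addresses, host IDs, and the function name down_nodes are all hypothetical and only mimic the usual column layout; the real output may differ between Cassandra versions.

```python
# Hypothetical sample in the layout `nodetool status` usually prints.
# Addresses, loads, and host IDs are made up for illustration.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID                               Rack
UN  10.0.0.1   120.5 GB   256     4.8%   aaaaaaaa-0000-0000-0000-000000000001  r1
DN  10.0.0.2   118.2 GB   256     4.7%   aaaaaaaa-0000-0000-0000-000000000002  r1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID                               Rack
DN  10.1.0.1   121.0 GB   256     4.9%   bbbbbbbb-0000-0000-0000-000000000001  r1
UN  10.1.0.2   119.9 GB   256     4.6%   bbbbbbbb-0000-0000-0000-000000000002  r1
"""

STATUS_CODES = {"U", "D"}            # first letter: Up / Down
STATE_CODES = {"N", "L", "J", "M"}   # second letter: Normal / Leaving / Joining / Moving


def down_nodes(status_text):
    """Return {datacenter: [addresses this observer reports as down]}."""
    result = {}
    dc = None
    for line in status_text.splitlines():
        if line.startswith("Datacenter:"):
            dc = line.split(":", 1)[1].strip()
            result.setdefault(dc, [])
            continue
        fields = line.split()
        # Node rows start with a two-letter code such as UN or DN.
        if (
            len(fields) >= 2
            and len(fields[0]) == 2
            and fields[0][0] in STATUS_CODES
            and fields[0][1] in STATE_CODES
        ):
            if fields[0][0] == "D":
                result.setdefault(dc, []).append(fields[1])
    return result


if __name__ == "__main__":
    for datacenter, addresses in down_nodes(SAMPLE).items():
        print(datacenter, len(addresses), addresses)
```

Running this against the output captured on several nodes at about the same time would show whether they agree on which peers are down, or whether each node has its own, different view.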