From: aaron morton <aaron@thelastpickle.com>
Subject: Re: Easy way to overload a single node on purpose?
Date: Fri, 17 Jun 2011 19:20:17 +1200
To: user@cassandra.apache.org

The short answer to the problem you saw is: monitor the disk space. Also monitor client-side logs for errors. Running out of commit log space does not stop the node from doing reads, so it can still be considered up.
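As an illustration only (the commitlog path and threshold below are assumptions, not values from this thread), such a disk-space check can be very small:

    import os

    # Point this at commitlog_directory from cassandra.yaml and pick a
    # threshold that matches your own alerting policy; both values here
    # are assumptions for illustration.
    COMMITLOG_PATH = "/var/lib/cassandra/commitlog"
    MIN_FREE_BYTES = 5 * 1024 ** 3   # alert when less than 5 GB is free

    def commitlog_free_bytes(path=COMMITLOG_PATH):
        st = os.statvfs(path)              # Unix only
        return st.f_bavail * st.f_frsize   # bytes available to non-root users

    if __name__ == "__main__":
        if commitlog_free_bytes() < MIN_FREE_BYTES:
            # wire this into whatever alerting you already run (Nagios, munin, ...)
            raise SystemExit("commitlog volume is low on free space")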
One node's view of its own UP-ness is not as important as the other nodes' (or clients') view of it. For example...

A node will appear UP in the ring view of another node if it is participating in gossip messages and its application state is normal. But a node will appear UP in its own view of the ring most of the time (assuming it is not bootstrapping, leaving, etc. and it has joined the ring). This applies even if its gossip service has been disabled.

To a client, a node will appear down if it is not responding to RPC requests. But it could still be part of the cluster, appear UP to other nodes, and be responding to reads and/or writes.

So to monitor that a node is running in some form you can:

- you should be monitoring the TP stats anyway, so you know the node is in some running state
- check that you can connect as a client to each node and do some simple call: either a read/write, or describe_ring() which will execute locally, or describe_schema_versions() which will call all live nodes. A read/write will only verify that the node can act as a coordinator, not that it can read/write itself (a sketch follows this list).
- monitor the other nodes' view of each node using nodetool ring.
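A rough sketch of such a per-node client check, illustrative only: the node list, keyspace, column family, and key below are assumptions, and pycassa is just one convenient Python Thrift client for 0.7.

    import pycassa

    # Per-node liveness probe. The keyspace/column family/node list are
    # assumptions for illustration -- a small dedicated "health" CF keeps
    # this traffic away from real data.
    NODES = ["node1:9160", "node2:9160", "node3:9160"]
    KEYSPACE = "Monitoring"
    COLUMN_FAMILY = "Health"

    def node_responds(node):
        """True if `node` answers a trivial write+read as a coordinator."""
        try:
            # server_list is pinned to a single node so we exercise that
            # node, not whichever node a pool happens to pick.
            pool = pycassa.ConnectionPool(KEYSPACE, server_list=[node], timeout=2)
            cf = pycassa.ColumnFamily(pool, COLUMN_FAMILY)
            cf.insert("ping", {"ts": "1"})   # node can coordinate a write...
            cf.get("ping")                   # ...and a read
            pool.dispose()
            return True
        except Exception:
            # TimedOutException, UnavailableException, connection errors,
            # etc. all count as "not OK" for this check.
            return False

    if __name__ == "__main__":
        for node in NODES:
            print("%s %s" % (node, "OK" if node_responds(node) else "DOWN"))

This only proves the node can answer as a coordinator; describe_ring() or describe_schema_versions() are the no-data alternatives mentioned above.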
Now that I've written that I'm not 100% sold on it, but it will do for now :)

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 17 Jun 2011, at 10:25, Suan Aik Yeo wrote:

> > Having a ping column can work if every key is replicated to every node. It would tell you the cluster is working, sort of. Once the number of nodes is greater than the RF, it tells you a subset of the nodes works.
>
> The way our check works is that each node checks itself, so in this context we're not concerned about whether the cluster is "up", but that each individual node is "up".
>
> So the symptoms I saw, the node actually going "down" etc, were probably due to many different events happening at the time, and will be very hard to recreate?
>
> On Thu, Jun 16, 2011 at 6:16 AM, aaron morton <aaron@thelastpickle.com> wrote:
> >     DEBUG 14:36:55,546 ... timed out
>
> Is logged when the coordinator times out waiting for the replicas to respond; the timeout setting is rpc_timeout in the yaml file. This results in the client getting a TimedOutException.
>
> AFAIK there is no global "everything is good / bad" flag to check, e.g. AFAIK a node will not mark itself down if it runs out of disk space. So you need to monitor the free disk space and alert on that.
>
> Having a ping column can work if every key is replicated to every node. It would tell you the cluster is working, sort of. Once the number of nodes is greater than the RF, it tells you a subset of the nodes works.
>
> If you google around you'll find discussions about monitoring with munin, ganglia, Cloudkick and OpsCenter.
>
> If you install mx4j you can access the JMX metrics via HTTP.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16 Jun 2011, at 10:38, Suan Aik Yeo wrote:
>
> > Here's a weird one... what's the best way to get a Cassandra node into a "half-crashed" state?
> >
> > We have a 3-node cluster running 0.7.5. A few days ago this happened organically to node1 - the partition the commitlog was on was 100% full and there was a "No space left on device" error, and after a while, although the cluster and node1 was still up, to the other nodes it was down, and messages like:
> >     DEBUG 14:36:55,546 ... timed out
> > started to show up in its debug logs.
> >
> > We have a tool to indicate to the load balancer that a Cassandra node is down, but it didn't detect it that time. Now I'm having trouble purposefully getting the node back to that state, so that I can try other monitoring methods. I've tried to fill up the commitlog partition with other files, and although I get the "No space left on device" error, the node still doesn't go down and show the other symptoms it showed before.
> >
> > Also, if anyone could recommend a good way for a node itself to detect that it's in such a state I'd be interested in that too. Currently what we're doing is making a "describe_cluster_name()" thrift call, but that still worked when the node was "down". I'm thinking of something like reading/writing to a fixed value in a keyspace as a check... Unfortunately Java-based solutions are out of the question.
> >
> > Thanks,
> > Suan
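For the mx4j option mentioned in the quoted reply above, a rough sketch of polling its HTTP adaptor from a monitoring script. The port (8081 is a common default for Cassandra's mx4j integration) and the MBean name are assumptions to adapt to your own setup:

    import urllib.parse
    import urllib.request

    # Poll the mx4j HTTP adaptor on each node. Port and MBean name are
    # assumptions based on common defaults; adjust to your configuration.
    NODES = ["node1", "node2", "node3"]
    MX4J_PORT = 8081
    MBEAN = "org.apache.cassandra.db:type=StorageService"

    def mx4j_reachable(host, port=MX4J_PORT, mbean=MBEAN, timeout=2):
        """True if the node's mx4j adaptor serves the StorageService MBean page."""
        url = "http://%s:%d/mbean?objectname=%s" % (
            host, port, urllib.parse.quote(mbean))
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                # resp.read() would return XML; attributes such as
                # UnreachableNodes or OperationMode could be parsed from it
                # for a deeper check than plain reachability.
                return resp.status == 200
        except OSError:
            return False

    if __name__ == "__main__":
        for host in NODES:
            state = "mx4j OK" if mx4j_reachable(host) else "mx4j unreachable"
            print("%s %s" % (host, state))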