Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id AC4AAD266 for ; Mon, 11 Mar 2013 13:52:47 +0000 (UTC) Received: (qmail 3988 invoked by uid 500); 11 Mar 2013 13:52:44 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 3956 invoked by uid 500); 11 Mar 2013 13:52:44 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 3944 invoked by uid 99); 11 Mar 2013 13:52:44 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Mar 2013 13:52:44 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a42.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 11 Mar 2013 13:52:36 +0000 Received: from homiemail-a42.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a42.g.dreamhost.com (Postfix) with ESMTP id 2F33468C05F for ; Mon, 11 Mar 2013 06:52:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=thelastpickle.com; h=from :content-type:message-id:mime-version:subject:date:references:to :in-reply-to; s=thelastpickle.com; bh=3a3IT/+mYjEeg7YIFRRiVAj4cf g=; b=tMeZz9mQcCK6j9bRr+84+NBEdMKdl50vToRM5SX36Z93+B87GdeL5liajQ xGgHw0y+3geXyQQV/8r7lz58dr/RJJX4fr1TzcOZ2FBrYWENEIRkdSJ0RXioqmkP UfmmZCaJSOKyD31bLbaDzaUotv/Ud7b4MbSPLA8RrXFRwpZ84= Received: from [192.168.168.122] (c-98-234-52-29.hsd1.ca.comcast.net [98.234.52.29]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a42.g.dreamhost.com (Postfix) with ESMTPSA id EC4B968C058 for ; Mon, 11 Mar 2013 06:52:14 -0700 (PDT) From: aaron morton Content-Type: multipart/alternative; boundary="Apple-Mail=_77F74001-D800-4A70-A373-8545397F71E8" Message-Id: Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: Nodetool drain automatically shutting down node? Date: Mon, 11 Mar 2013 06:52:13 -0700 References: To: user@cassandra.apache.org In-Reply-To: X-Mailer: Apple Mail (2.1499) X-Virus-Checked: Checked by ClamAV on apache.org --Apple-Mail=_77F74001-D800-4A70-A373-8545397F71E8 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=iso-8859-1 Drain stops listening for connections from client and other nodes, and = flushes all the data to disk. The purpose is to get everything into = SSTables, so we do not want to process any more writes.=20 The error is logged at DEBUG as it's not important, just means a thread = (the processed gossip) was cancelled.=20 Cheers =20 ----------------- Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 8/03/2013, at 7:58 PM, Andrew Bialecki = wrote: > If it's helps, here's the log with debug log statements. Possibly = issue with that exception? >=20 > INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:32,402 = StorageService.java (line 774) DRAINING: starting drain process > INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:32,403 = CassandraDaemon.java (line 218) Stop listening to thrift clients > INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:32,404 = Gossiper.java (line 1133) Announcing shutdown > DEBUG [GossipTasks:1] 2013-03-09 03:54:33,328 = DebuggableThreadPoolExecutor.java (line 190) Task cancelled > java.util.concurrent.CancellationException > at = java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:220) > at java.util.concurrent.FutureTask.get(FutureTask.java:83) > at = org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.extractThrowa= ble(DebuggableThreadPoolExecutor.java:182) > at = org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.logExceptions= AfterExecute(DebuggableThreadPoolExecutor.java:146) > at = org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.afte= rExecute(DebuggableScheduledThreadPoolExecutor.java:50) > at = java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.= java:888) > at = java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java= :908) > at java.lang.Thread.run(Thread.java:662) > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,406 = StorageService.java (line 776) DRAINING: shutting down MessageService > INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,406 = MessagingService.java (line 534) Waiting for messaging service to = quiesce > INFO [ACCEPT-ip-10-116-111-143.ec2.internal/10.116.111.143] = 2013-03-09 03:54:33,407 MessagingService.java (line 690) = MessagingService shutting down server thread. > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,408 = StorageService.java (line 776) DRAINING: waiting for streaming > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,408 = StorageService.java (line 776) DRAINING: clearing mutation stage > DEBUG [Thread-5] 2013-03-09 03:54:33,408 Gossiper.java (line 221) = Reseting version for /10.83.55.44 > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,409 = StorageService.java (line 776) DRAINING: flushing column families > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,409 = ColumnFamilyStore.java (line 713) forceFlush requested but everything is = clean in Counter1 > DEBUG [Thread-6] 2013-03-09 03:54:33,410 Gossiper.java (line 221) = Reseting version for /10.80.187.124 > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,410 = ColumnFamilyStore.java (line 713) forceFlush requested but everything is = clean in Super1 > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,410 = ColumnFamilyStore.java (line 713) forceFlush requested but everything is = clean in SuperCounter1 > DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,410 = ColumnFamilyStore.java (line 713) forceFlush requested but everything is = clean in Standard1 > INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 03:54:33,510 = StorageService.java (line 774) DRAINED >=20 > On Fri, Mar 8, 2013 at 10:36 PM, Andrew Bialecki = wrote: > Hey all, >=20 > We're getting ready to upgrade our cluster to 1.2.2 from 1.1.5 and = we're testing the upgrade process on our dev cluster. We turned off all = client access to the cluster and then ran "nodetool drain" on the first = instance with the intention of running "nodetool snapshot" once it = finished. However, after running the drain, didn't see any errors, but = the Cassandra process was no longer running. Is that expected? =46rom = everything I've read it doesn't seem like it, but maybe I'm mistaken. >=20 > Here's the relevant portion of the log from that node (notice it says = it's shutting down the server thread in there): >=20 > INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 03:26:48,288 = StorageService.java (line 774) DRAINING: starting drain process > INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 03:26:48,288 = CassandraDaemon.java (line 218) Stop listening to thrift clients > INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 03:26:48,315 = Gossiper.java (line 1133) Announcing shutdown > INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 03:26:49,318 = MessagingService.java (line 534) Waiting for messaging service to = quiesce > INFO [ACCEPT-ip-10-116-111-143.ec2.internal/10.116.111.143] = 2013-03-09 03:26:49,319 MessagingService.java (line 690) = MessagingService shutting down server thread. > INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 03:26:49,338 = ColumnFamilyStore.java (line 659) Enqueuing flush of = Memtable-Counter1@177255852(14810190/60139556 serialized/live bytes, = 243550 ops) > INFO [FlushWriter:7] 2013-03-09 03:26:49,338 Memtable.java (line 264) = Writing Memtable-Counter1@177255852(14810190/60139556 serialized/live = bytes, 243550 ops) > INFO [FlushWriter:7] 2013-03-09 03:26:49,899 Memtable.java (line 305) = Completed flushing = /var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1-Counter1-he-104-Data.= db (15204741 bytes) for commitlog position = ReplayPosition(segmentId=3D1362797442799, position=3D27621115) > INFO [CompactionExecutor:11] 2013-03-09 03:26:49,900 = CompactionTask.java (line 109) Compacting = [SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace= 1-Counter1-he-102-Data.db'), = SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1= -Counter1-he-103-Data.db'), = SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1= -Counter1-he-104-Data.db'), = SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1= -Counter1-he-101-Data.db')] > INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 03:26:50,193 = StorageService.java (line 774) DRAINED >=20 > Thanks in advanced for any help. >=20 > Cheers, > Andrew >=20 --Apple-Mail=_77F74001-D800-4A70-A373-8545397F71E8 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=iso-8859-1 Drain = stops listening for connections from client and other nodes, and flushes = all the data to disk. The purpose is to get everything into SSTables, so = we do not want to process any more writes. 

The = error is logged at DEBUG as it's not important, just means a thread (the = processed gossip) was = cancelled. 

Cheers
 
http://www.thelastpickle.com

On 8/03/2013, at 7:58 PM, Andrew Bialecki <andrew.bialecki@gmail.com>= ; wrote:

If it's helps, here's the log with debug log statements. = Possibly issue with that exception?

INFO [RMI TCP = Connection(2)-10.116.111.143] 2013-03-09 03:54:32,402 = StorageService.java (line 774) DRAINING: starting drain process
 INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 = 03:54:32,403 CassandraDaemon.java (line 218) Stop listening to thrift = clients
 INFO [RMI TCP Connection(2)-10.116.111.143] = 2013-03-09 03:54:32,404 Gossiper.java (line 1133) Announcing = shutdown
DEBUG [GossipTasks:1] 2013-03-09 03:54:33,328 = DebuggableThreadPoolExecutor.java (line 190) Task = cancelled
java.util.concurrent.CancellationException
<= span class=3D"Apple-tab-span" style=3D"white-space:pre"> = at = java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:220)
= at = java.util.concurrent.FutureTask.get(FutureTask.java:83)
at = org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.extractThrowa= ble(DebuggableThreadPoolExecutor.java:182)
= at = org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.logExceptions= AfterExecute(DebuggableThreadPoolExecutor.java:146)
at = org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor.afte= rExecute(DebuggableScheduledThreadPoolExecutor.java:50)
= at = java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.= java:888)
at = java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java= :908)
= at java.lang.Thread.run(Thread.java:662)
DEBUG [RMI TCP = Connection(2)-10.116.111.143] 2013-03-09 03:54:33,406 = StorageService.java (line 776) DRAINING: shutting down = MessageService
 INFO [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 = 03:54:33,406 MessagingService.java (line 534) Waiting for messaging = service to quiesce
 INFO = [ACCEPT-ip-10-116-111-143.ec2.internal/10.116.111.143] 2013-03-09 = 03:54:33,407 MessagingService.java (line 690) MessagingService shutting = down server thread.
DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 = 03:54:33,408 StorageService.java (line 776) DRAINING: waiting for = streaming
DEBUG [RMI TCP Connection(2)-10.116.111.143] = 2013-03-09 03:54:33,408 StorageService.java (line 776) DRAINING: = clearing mutation stage
DEBUG [Thread-5] 2013-03-09 03:54:33,408 Gossiper.java (line 221) = Reseting version for /10.83.55.44
DEBUG [RMI TCP = Connection(2)-10.116.111.143] 2013-03-09 03:54:33,409 = StorageService.java (line 776) DRAINING: flushing column families
DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 = 03:54:33,409 ColumnFamilyStore.java (line 713) forceFlush requested but = everything is clean in Counter1
DEBUG [Thread-6] 2013-03-09 = 03:54:33,410 Gossiper.java (line 221) Reseting version for /10.80.187.124
DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 = 03:54:33,410 ColumnFamilyStore.java (line 713) forceFlush requested but = everything is clean in Super1
DEBUG [RMI TCP = Connection(2)-10.116.111.143] 2013-03-09 03:54:33,410 = ColumnFamilyStore.java (line 713) forceFlush requested but everything is = clean in SuperCounter1
DEBUG [RMI TCP Connection(2)-10.116.111.143] 2013-03-09 = 03:54:33,410 ColumnFamilyStore.java (line 713) forceFlush requested but = everything is clean in Standard1
 INFO [RMI TCP = Connection(2)-10.116.111.143] 2013-03-09 03:54:33,510 = StorageService.java (line 774) DRAINED

On Fri, Mar 8, 2013 at 10:36 = PM, Andrew Bialecki <andrew.bialecki@gmail.com> wrote:
Hey = all,

We're getting ready to upgrade our cluster to = 1.2.2 from 1.1.5 and we're testing the upgrade process on our dev = cluster. We turned off all client access to the cluster and then ran = "nodetool drain" on the first instance with the intention of running = "nodetool snapshot" once it finished. However, after running the drain, = didn't see any errors, but the Cassandra process was no longer running. = Is that expected? =46rom everything I've read it doesn't seem like it, = but maybe I'm mistaken.

Here's the relevant portion of the log from that = node (notice it says it's shutting down the server thread in = there):

INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 = 03:26:48,288 StorageService.java (line 774) DRAINING: starting drain = process
 INFO [RMI TCP = Connection(38)-10.116.111.143] 2013-03-09 03:26:48,288 = CassandraDaemon.java (line 218) Stop listening to thrift clients
 INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 = 03:26:48,315 Gossiper.java (line 1133) Announcing = shutdown
 INFO [RMI TCP = Connection(38)-10.116.111.143] 2013-03-09 03:26:49,318 = MessagingService.java (line 534) Waiting for messaging service to = quiesce
 INFO [ACCEPT-ip-10-116-111-143.ec2.internal/10.116.111.143] = 2013-03-09 03:26:49,319 MessagingService.java (line 690) = MessagingService shutting down server thread.
 INFO [RMI TCP Connection(38)-10.116.111.143] = 2013-03-09 03:26:49,338 ColumnFamilyStore.java (line 659) Enqueuing = flush of Memtable-Counter1@177255852(14810190/60139556 serialized/live = bytes, 243550 ops)
 INFO [FlushWriter:7] 2013-03-09 03:26:49,338 = Memtable.java (line 264) Writing = Memtable-Counter1@177255852(14810190/60139556 serialized/live bytes, = 243550 ops)
 INFO [FlushWriter:7] 2013-03-09 = 03:26:49,899 Memtable.java (line 305) Completed flushing = /var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1-Counter1-he-104-Data.= db (15204741 bytes) for commitlog position = ReplayPosition(segmentId=3D1362797442799, position=3D27621115)
 INFO [CompactionExecutor:11] 2013-03-09 = 03:26:49,900 CompactionTask.java (line 109) Compacting = [SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace= 1-Counter1-he-102-Data.db'), = SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1= -Counter1-he-103-Data.db'), = SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1= -Counter1-he-104-Data.db'), = SSTableReader(path=3D'/var/lib/cassandra/data/Keyspace1/Counter1/Keyspace1= -Counter1-he-101-Data.db')]
 INFO [RMI TCP Connection(38)-10.116.111.143] 2013-03-09 = 03:26:50,193 StorageService.java (line 774) = DRAINED

Thanks in advanced for any = help.

Cheers,
Andrew


= --Apple-Mail=_77F74001-D800-4A70-A373-8545397F71E8--