Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E71DAD284 for ; Mon, 10 Sep 2012 08:11:50 +0000 (UTC) Received: (qmail 41452 invoked by uid 500); 10 Sep 2012 08:11:47 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 41294 invoked by uid 500); 10 Sep 2012 08:11:46 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 41259 invoked by uid 99); 10 Sep 2012 08:11:45 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 10 Sep 2012 08:11:45 +0000 X-ASF-Spam-Status: No, hits=2.4 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,FSL_RCVD_USER,HK_RANDOM_ENVFROM,HTML_MESSAGE,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_LOW,SPF_PASS,WEIRD_PORT X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of teddyyyy123@gmail.com designates 209.85.216.172 as permitted sender) Received: from [209.85.216.172] (HELO mail-qc0-f172.google.com) (209.85.216.172) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 10 Sep 2012 08:11:37 +0000 Received: by qcac10 with SMTP id c10so1049851qca.31 for ; Mon, 10 Sep 2012 01:11:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=DS+7i97MVyDcVEeLQ+qeeMDEyj8B9q8HO1Yhyjv54Gk=; b=A8ly0ISupsms90IYOCVLDt0MtqFmetV9Zc/AZmWUuv5rxU3xM8KdcOme0lQ7mfH7Ou 45R0qpdal1AWIt3KK1bnCZn8AT1Hv1rQuQA3WRqXZBaRABUWby/n6v17soriHNDGKy5q +ONIJGO7JTAjeeexRZ+PHjQLr0+ms06hTJU8mECpba2Fw3ERbZEIxrWTnU9LjkVZTFoC OBpKI9BV1kH6UOiqQxkjYU3EHElXWFsEwa23ZsNYr7z58W8514akMrjYXbBLDlC2aGz+ n7zos1gGHIIrO+I4imT32tJyMoOiIXcR5bwjjuECWx/lI60NB7n0AAiB7NcmaSLkcFT6 JrbA== Received: by 10.224.176.132 with SMTP id be4mr6552041qab.45.1347264676693; Mon, 10 Sep 2012 01:11:16 -0700 (PDT) MIME-Version: 1.0 Received: by 10.49.13.97 with HTTP; Mon, 10 Sep 2012 01:10:56 -0700 (PDT) From: Yang Date: Mon, 10 Sep 2012 01:10:56 -0700 Message-ID: Subject: replace_token code? To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=20cf303b430f7906fb04c95480c9 --20cf303b430f7906fb04c95480c9 Content-Type: text/plain; charset=ISO-8859-1 it looks that by specifying replace_token, the old owner is not removed from gossip (which I had thought it would do). Then it's understandable that the old owner would resurface later and we get some warning saying that the same token is owned by both. I ran an example with a 2-node cluster, with RF=2. host 10.72.201.80 was running for a while and had some data, then i shut it down, and booted up 10.190.221.204 with replace_token of the old token owned by the previous host. the following log sequence shows that the new host does acquire the token, but it does not at the same time remove 80 forcefully from gossip. instead, a few seconds later, it believed that .80 became live again. I don't have much understanding of the Gossip protocol, but roughly know that it's probability-based, looks we need an "assertive"/"NOW" membership control message for replace_token. thanks yang WARN [main] 2012-09-10 08:00:21,855 TokenMetadata.java (line 160) Token 166594924822352415786406422619018814804 changing ownership from / 10.72.201.80 to /10.190.221.204 INFO [main] 2012-09-10 08:00:21,855 StorageService.java (line 753) JOINING: Starting to bootstrap... INFO [CompactionExecutor:2] 2012-09-10 08:00:21,875 CompactionTask.java (line 109) Compacting [SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-1-Data.db'), SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-3-Data.db'), SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-4-Data.db'), SSTableReader(path='/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-2-Data.db')] INFO [CompactionExecutor:2] 2012-09-10 08:00:21,979 CompactionTask.java (line 221) Compacted to [/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-5-Data.db,]. 499 to 394 (~78% of original) bytes for 3 keys at 0.003997MB/s. Time: 94ms. INFO [Thread-4] 2012-09-10 08:00:22,070 StreamInSession.java (line 214) Finished streaming session 1 from /10.72.102.61 INFO [main] 2012-09-10 08:00:22,073 ColumnFamilyStore.java (line 643) Enqueuing flush of Memtable-LocationInfo@30624226(77/96 serialized/live bytes, 2 ops) INFO [FlushWriter:2] 2012-09-10 08:00:22,074 Memtable.java (line 266) Writing Memtable-LocationInfo@30624226(77/96 serialized/live bytes, 2 ops) INFO [FlushWriter:2] 2012-09-10 08:00:22,082 Memtable.java (line 307) Completed flushing /mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-6-Data.db (163 bytes) for commitlog position ReplayPosition(segmentId=6360937126265, position=3854) INFO [main] 2012-09-10 08:00:22,087 StorageService.java (line 1098) Node / 10.190.221.204 state jump to normal INFO [main] 2012-09-10 08:00:22,088 StorageService.java (line 666) Bootstrap/Replace/Move completed! Now serving reads. INFO [main] 2012-09-10 08:00:22,089 Mx4jTool.java (line 72) Will not load MX4J, mx4j-tools.jar is not in the classpath INFO [main] 2012-09-10 08:00:22,119 CassandraDaemon.java (line 124) Binding thrift service to /10.190.221.204:9160 INFO [main] 2012-09-10 08:00:22,122 CassandraDaemon.java (line 133) Using TFastFramedTransport with a max frame size of 15728640 bytes. INFO [main] 2012-09-10 08:00:22,125 CassandraDaemon.java (line 160) Using synchronous/threadpool thrift server on /10.190.221.204 : 9160 INFO [Thread-5] 2012-09-10 08:00:22,125 CassandraDaemon.java (line 212) Listening for thrift clients... INFO [RMI TCP Connection(4)-127.0.0.1] 2012-09-10 08:00:59,014 ColumnFamilyStore.java (line 643) Enqueuing flush of Memtable-Versions@11775580(128/160 serialized/live bytes, 3 ops) INFO [FlushWriter:2] 2012-09-10 08:00:59,016 Memtable.java (line 266) Writing Memtable-Versions@11775580(128/160 serialized/live bytes, 3 ops) INFO [FlushWriter:2] 2012-09-10 08:00:59,024 Memtable.java (line 307) Completed flushing /mnt/cassandra/data/system/Versions/system-Versions-hd-1-Data.db (256 bytes) for commitlog position ReplayPosition(segmentId=6360937126265, position=3854) INFO [GossipStage:1] 2012-09-10 08:01:23,036 Gossiper.java (line 850) Node /10.72.201.80 is now part of the cluster INFO [GossipStage:1] 2012-09-10 08:01:23,037 Gossiper.java (line 816) InetAddress /10.72.201.80 is now UP INFO [GossipStage:1] 2012-09-10 08:01:23,038 StorageService.java (line 1126) Nodes /10.72.201.80 and /10.190.221.204 have the same token 166594924822352415786406422619018814804. Ignoring /10.72.201.80 INFO [GossipTasks:1] 2012-09-10 08:01:32,967 Gossiper.java (line 830) InetAddress /10.72.201.80 is now dead. INFO [GossipTasks:1] 2012-09-10 08:01:53,976 Gossiper.java (line 644) FatClient /10.72.201.80 has been silent for 30000ms, removing from gossip --20cf303b430f7906fb04c95480c9 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
it looks that by specifying replace_token, the old owner is not remove= d from gossip (which I had thought it would do).
Then it's un= derstandable that the old owner would resurface later and we get some warni= ng saying that the same token is owned by both.


I ran an example with a 2-node cluster, = with RF=3D2. =A0host=A010.72.201.80 was running for a while and had some da= ta, then i shut it down, and=A0
booted up=A010.190.221.204=A0with= replace_token of the old token owned by the previous host.
the following log sequence shows that the new host does acquire the to= ken, but it does not at the same time remove 80 forcefully from gossip.
instead, a few seconds later, it believed that .80 became live again= .
I don't have much understanding of the Gossip protocol, but roughl= y know that it's probability-based, looks we need an "assertive&qu= ot;/"NOW"=A0
membership control message for replace_tok= en.




thanks
yang


=A0WARN [main] 2012-09-= 10 08:00:21,855 TokenMetadata.java (line 160) Token 16659492482235241578640= 6422619018814804 changing ownership from /1= 0.72.201.80 to /10.190.221.204
=A0INFO [main] 2012-09-10 08:00:21,855 StorageService.java (line 753) = JOINING: Starting to bootstrap...
=A0INFO [CompactionExecutor:2] = 2012-09-10 08:00:21,875 CompactionTask.java (line 109) Compacting [SSTableR= eader(path=3D'/mnt/cassandra/data/system/LocationInfo/system-LocationIn= fo-hd-1-Data.db'), SSTableReader(path=3D'/mnt/cassandra/data/system= /LocationInfo/system-LocationInfo-hd-3-Data.db'), SSTableReader(path=3D= '/mnt/cassandra/data/system/LocationInfo/system-LocationInfo-hd-4-Data.= db'), SSTableReader(path=3D'/mnt/cassandra/data/system/LocationInfo= /system-LocationInfo-hd-2-Data.db')]
=A0INFO [CompactionExecutor:2] 2012-09-10 08:00:21,979 CompactionTask.= java (line 221) Compacted to [/mnt/cassandra/data/system/LocationInfo/syste= m-LocationInfo-hd-5-Data.db,]. =A0499 to 394 (~78% of original) bytes for 3= keys at 0.003997MB/s. =A0Time: 94ms.
=A0INFO [Thread-4] 2012-09-10 08:00:22,070 StreamInSession.java (line = 214) Finished streaming session 1 from /10.= 72.102.61
=A0INFO [main] 2012-09-10 08:00:22,073 ColumnFamily= Store.java (line 643) Enqueuing flush of Memtable-LocationInfo@30624226(77/= 96 serialized/live bytes, 2 ops)
=A0INFO [FlushWriter:2] 2012-09-10 08:00:22,074 Memtable.java (line 26= 6) Writing Memtable-LocationInfo@30624226(77/96 serialized/live bytes, 2 op= s)
=A0INFO [FlushWriter:2] 2012-09-10 08:00:22,082 Memtable.java = (line 307) Completed flushing /mnt/cassandra/data/system/LocationInfo/syste= m-LocationInfo-hd-6-Data.db (163 bytes) for commitlog position ReplayPositi= on(segmentId=3D6360937126265, position=3D3854)
=A0INFO [main] 2012-09-10 08:00:22,087 StorageService.java (line 1098)= Node /10.190.221.204 state jump to n= ormal
=A0INFO [main] 2012-09-10 08:00:22,088 StorageService.java = (line 666) Bootstrap/Replace/Move completed! Now serving reads.
=A0INFO [main] 2012-09-10 08:00:22,089 Mx4jTool.java (line 72) Will no= t load MX4J, mx4j-tools.jar is not in the classpath
=A0INFO [main= ] 2012-09-10 08:00:22,119 CassandraDaemon.java (line 124) Binding thrift se= rvice to /10.190.221.204:9160
=A0INFO [main] 2012-09-10 08:00:22,122 CassandraDaemon.java (line 133)= Using TFastFramedTransport with a max frame size of 15728640 bytes.
<= div>=A0INFO [main] 2012-09-10 08:00:22,125 CassandraDaemon.java (line 160) = Using synchronous/threadpool thrift server on /10.190.221.204 : 9160
=A0INFO [Thread-5] 2012-09-10 08:00:22,125 CassandraDaemon.java (line = 212) Listening for thrift clients...
=A0INFO [RMI TCP Connection(= 4)-127.0.0.1] 2012-09-10 08:00:59,014 ColumnFamilyStore.java (line 643) Enq= ueuing flush of Memtable-Versions@11775580(128/160 serialized/live bytes, 3= ops)
=A0INFO [FlushWriter:2] 2012-09-10 08:00:59,016 Memtable.java (line 26= 6) Writing Memtable-Versions@11775580(128/160 serialized/live bytes, 3 ops)=
=A0INFO [FlushWriter:2] 2012-09-10 08:00:59,024 Memtable.java (l= ine 307) Completed flushing /mnt/cassandra/data/system/Versions/system-Vers= ions-hd-1-Data.db (256 bytes) for commitlog position ReplayPosition(segment= Id=3D6360937126265, position=3D3854)
=A0INFO [GossipStage:1] 2012-09-10 08:01:23,036 Gossiper.java (li= ne 850) Node /10.72.201.80 is now part = of the cluster
=A0INFO [GossipStage:1] 2012-09-10 08:01:23,037 Go= ssiper.java (line 816) InetAddress /10.72.2= 01.80 is now UP
=A0INFO [GossipStage:1] 2012-09-10 08:01:23,038 StorageService.java (l= ine 1126) Nodes /10.72.201.80 and /10.190.221.204 have the same token 166594= 924822352415786406422619018814804. =A0Ignoring /10.72.201.80
=A0INFO [GossipTasks:1] 2012-09-10 08:01:32,967 Gossiper.java (line 83= 0) InetAddress /10.72.201.80 is now dea= d.
=A0INFO [GossipTasks:1] 2012-09-10 08:01:53,976 Gossiper.java = (line 644) FatClient /10.72.201.80 has = been silent for 30000ms, removing from gossip


--20cf303b430f7906fb04c95480c9--