Subject: Nodetool doesn't shows two nodes
From: Boris Solovyov
To: user
Reply-To: user@cassandra.apache.org
Date: Tue, 12 Feb 2013 14:56:02 -0500

I've configured a 2-node cluster in EC2, with key settings as follows:

cluster_name: 'TS'
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "ec2-23-21-11-193.compute-1.amazonaws.com, ec2-107-22-114-19.compute-1.amazonaws.com"
listen_address: 10.145.232.190
broadcast_address: ec2-23-21-11-193.compute-1.amazonaws.com
rpc_address: 0.0.0.0
endpoint_snitch: Ec2MultiRegionSnitch

On the other node, it is similar, but of course the listen and broadcast addresses are different. Now, when I start Cassandra, I see in the logs:

INFO 19:35:32,348 JOINING: waiting for ring information

And then after 30 seconds, it logs a series of messages like this:

JOINING: schema complete, ready to bootstrap
JOINING: getting bootstrap token
Enqueuing flush of Memtable...
JOINING: sleeping 30000 ms for pending range setup
JOINING: Starting to bootstrap...
Bootstrap completed! for the tokens [....]

Finally, after some more memtable flushing,

INFO 19:36:32,710 Node /107.22.114.19 state jump to normal
INFO 19:36:32,722 Startup completed! Now serving reads.

Now, I start the other node, and I see basically the same thing in the logs.

Running nodetool status, I see what looks like two single-node clusters!

[root@ip-10-147-171-160 ~]# nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load   Tokens  Owns    Host ID                               Rack
UN  107.22.114.19  21 KB  256     100.0%  f7a24bd2-8cb9-499d-806c-d9e548f34b8d  1a

[root@ip-10-145-232-190 ~]# nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load   Tokens  Owns    Host ID                               Rack
UN  23.21.11.193   21 KB  256     100.0%  9d70f022-03cf-488a-807d-22e991761483  1a

It looks to me like the nodes didn't communicate with each other as I thought they would, and timed out waiting for gossip to tell them which nodes are in the ring (I'm new to Cassandra, so this is only my guess, but the 30-second timeouts certainly look suspicious). I checked with telnet, and from each node I can connect to port 7000 on the other node (on both the internal and public IP). I feel like I have made a beginner's mistake. Does anyone have a suggestion for where to look next?

- Boris
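
For anyone who wants to repeat the port 7000 check described above without telnet, here is a minimal Python sketch. The function names can_connect and check_peers are hypothetical, and the example addresses in the trailing comment are assumptions: the public IP comes from the nodetool output, while the internal IP is only inferred from the shell prompt hostname. A successful connect shows only that the port is reachable at the TCP level; it does not by itself show that gossip state is being exchanged.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the context manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_peers(peers, port: int = 7000, timeout: float = 3.0) -> dict:
    """Map each peer address to whether the given port accepted a connection."""
    return {host: can_connect(host, port, timeout) for host in peers}


# Example call from the node with listen_address 10.145.232.190
# (assumed peer addresses; adjust to the real internal and public IPs):
#   check_peers(["10.147.171.160", "107.22.114.19"])
```

Running the same call from each node against the other node's internal and public addresses mirrors the manual telnet test.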