Hi Boris.

"I feel like I have made a beginner's mistake"
That's a horrible feeling :D. I'll try to help ;)

"cluster_name: 'TS'"
Are you sure you used the same name for both nodes?

"I can connect to port 7000"
You can check all the required ports at http://www.datastax.com/docs/1.2/install/install_ami and open them in your security group once and for all, so you won't have to wonder about this anymore.

"listen_address: 10.145.232.190"
"INFO 19:36:32,710 Node /107.22.114.19 state jump to normal"
You have "10.145.232.190" defined as the listen address, but your logs say that 107.22.114.19 joined the ring, and your second IP seems to be 23.21.11.193... When you stop an EC2 server, its internal IP may change, so I recommend that you reboot your instances rather than stopping and starting them. Anyway, you should use instance stores and not EBS; instance-store instances can't be stopped, so you won't have this issue anymore. Don't trust the hostname ip-10-145-232-190, which is configured at first start in /etc/hostname and may no longer match the current internal IP.

"endpoint_snitch: Ec2MultiRegionSnitch"
Maybe you should use endpoint_snitch: Ec2Snitch, since all your servers are in the same zone. You will have to use private IPs everywhere and comment out the broadcast_address if you do so.
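
To make that concrete, here is a rough sketch of the relevant cassandra.yaml lines with Ec2Snitch on the node whose private IP is 10.145.232.190. I am assuming that IP is still the current internal one (check it first), and the seeds entry would need to be a private IP as well.

```yaml
# Sketch only, not a complete cassandra.yaml.
cluster_name: 'TS'
endpoint_snitch: Ec2Snitch
# Private (internal) IP of this node; assumed to still be current.
listen_address: 10.145.232.190
# Left commented out when using Ec2Snitch with private IPs everywhere.
# broadcast_address:
```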


The first node has to start with auto_bootstrap: false, while the 2nd one can use auto_bootstrap: true. The seed list must contain your first node only; a bootstrapping node mustn't be defined as a seed.
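
As a sketch of what I mean on the 2nd node (FIRST_NODE_IP is a placeholder I made up, replace it with the address of your first node: its private IP with Ec2Snitch, its public IP with Ec2MultiRegionSnitch):

```yaml
# Sketch for the 2nd (bootstrapping) node only.
# FIRST_NODE_IP is a hypothetical placeholder, not a real value.
auto_bootstrap: true
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # Only the first node is listed; this node is not its own seed.
          - seeds: "FIRST_NODE_IP"
```

On the first node, the same seeds value would be used, with auto_bootstrap: false.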

"my guess... certainly 30-second timeouts look suspicious"
This is not a timeout but a sleep; it is a normal wait when adding a node.

Since you're a new user, I guess you have no data. If you want to try some configs, you can always "reset" your Cassandra node by stopping Cassandra and then removing .../cassandra/* (commitlog, data and saved_caches).

Good luck with this.

Alain


2013/2/12 Boris Solovyov <boris.solovyov@gmail.com>
I've configured 2-node cluster in EC2, key settings as follows:

cluster_name: 'TS'
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
listen_address: 10.145.232.190
rpc_address: 0.0.0.0
endpoint_snitch: Ec2MultiRegionSnitch

On other node, it is similar, but of course the listen and broadcast address are different. Now, when I start Cassandra, I see in the logs

INFO 19:35:32,348 JOINING: waiting for ring information

And then after 30 seconds, it says a bunch of things like this:

JOINING: schema complete, ready to bootstrap
JOINING: getting bootstrap token
Enqueuing flush of Memtable...
JOINING: sleeping 30000 ms for pending range setup
JOINING: Starting to bootstrap...
Bootstrap completed! for the tokens [....]

Finally, after some more memtable flushing,

INFO 19:36:32,710 Node /107.22.114.19 state jump to normal
INFO 19:36:32,722 Startup completed! Now serving reads.

Now, I start the other node, and I see basically the same thing in the logs.

Running nodetool status, I see what looks like two single-node clusters!

[root@ip-10-147-171-160 ~]# nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address           Load       Tokens  Owns   Host ID                               Rack
UN  107.22.114.19     21 KB      256     100.0%  f7a24bd2-8cb9-499d-806c-d9e548f34b8d  1a

[root@ip-10-145-232-190 ~]# nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address           Load       Tokens  Owns   Host ID                               Rack
UN  23.21.11.193      21 KB      256     100.0%  9d70f022-03cf-488a-807d-22e991761483  1a

It looks to me like nodes didn't communicate with each other like I thought they would, and timed out waiting for gossip to tell them which nodes are in the ring (I'm new to Cassandra, but this is my guess... certainly 30-second timeouts look suspicious). I checked with telnet, and from each node I can connect to port 7000 on the other node (both on internal and public IP). I feel like I have made a beginner's mistake. Anyone has a suggestion where to look next?

- Boris