incubator-cassandra-user mailing list archives

From Jedd Rashbrooke <jedd.rashbro...@imagini.net>
Subject Re: Dazed and confused with Cassandra on EC2 ...
Date Fri, 17 Sep 2010 15:52:39 GMT
 Hi Dave,

 Thank you for your response.

 I can clarify a couple of things here:

> 2. You grew from 2 nodes to 4, but the original 2 nodes have 200GB and the 2
> new ones have 40 GB.  What's the recommended practice for rebalancing (i.e.,
> when should you do it), what's the actual procedure, and what's the expected
> impact of it?

 + Is it likely to cause a problem in the short term if I don't (i.e.,
 if I just wait for 'normal activity' to somehow even out the
 distribution of data)?
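
 For reference while weighing up the rebalancing question, here is a
 minimal sketch of how evenly spaced tokens could be computed for a
 four-node ring. It assumes the cluster uses RandomPartitioner (token
 space 0 to 2**127), which is not stated in this thread; the function
 name balanced_tokens is made up for illustration.

```python
# Assumption: RandomPartitioner, whose token space runs from 0 to 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced tokens, one per node, around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer arithmetic keeps the tokens exact; floats would lose precision.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(balanced_tokens(4)):
        print(f"node {index}: {token}")
```

 Each node would then be moved to its target token one at a time
 (nodetool has a move command for this); the exact procedure and its
 impact should be checked against the documentation for the version in
 use.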

> 3. Cassandra nodes "disappear".  (I'm not quite clear what this means.)

 Nodetool reports the node as down.  I'm seeing lots of 'machine-x is DOWN'
 messages in the logs.  Flapping, actually.  I don't have any swap configured
 (which I've read somewhere might induce flapping).
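
 To put a number on the flapping, something like the following sketch
 could count UP/DOWN state changes per node from the logs. The log line
 shape ("<node> is UP" / "<node> is DOWN") and the names STATE_LINE and
 count_transitions are assumptions for illustration; the pattern would
 need adjusting to whatever the real log lines look like.

```python
import re
from collections import Counter

# Hypothetical log line shape; adjust the pattern to the real log format.
STATE_LINE = re.compile(r"(?P<node>\S+) is (?P<state>UP|DOWN)\b")


def count_transitions(lines):
    """Count how many times each node changes state (UP <-> DOWN)."""
    last_state = {}
    flips = Counter()
    for line in lines:
        match = STATE_LINE.search(line)
        if not match:
            continue  # not a state-change line
        node, state = match.group("node"), match.group("state")
        # Only count a flip when the state differs from the last one seen.
        if node in last_state and last_state[node] != state:
            flips[node] += 1
        last_state[node] = state
    return flips


if __name__ == "__main__":
    import sys
    # Usage: python count_flaps.py < system.log
    for node, count in count_transitions(sys.stdin).most_common():
        print(f"{node}: {count} state changes")
```

 A node with many state changes over a short window is flapping rather
 than simply down.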

 The machine also feels like it goes on a hiatus - separately, but typically
 observed at the same time.  tail -f on the Cassandra logs stalls for several
 minutes, and pending ssh sessions to the box also stall until 'something'
 happens that releases the machine from its slumber.  Typically that something
 is a message in the logs that a compaction of a hinted handoff has completed.

 As I say, nmon/top show minimal network & disk activity, and just one
 of the four cores flatlining during this time.  The machine *should* be
 more responsive.

 Actually:   http://pastebin.com/AeM2VgL3

 All the machines referenced in there are ones that are in the cluster now.


> 4. You took a machine offline without decommissioning it from the cluster.
>  Now the machine is gone, but the other nodes (in Gossip logs) report that
> they are still looking for it.  How do you stop nodes from looking for a
> removed node?

 I was attempting to drain the thing first, but that was stalling, so I stopped
 Cassandra then stopped the box.  The storage and config were on EBS
 (persistent disk) so they came back - it's just that the IP address of the
 machine changed.  I typically use my own assigned hostnames (cass-01,
 cass-02, etc., say) but for proper resolution I use the EC2 'internal
 hostnames'.  These were updated on all four Cassandra boxes, the other
 three instances of Cassandra were stopped, and then all four were
 brought back up.


 You say you have similar EC2-related thoughts .. have you done much on
 the EC2 hardware so far?  Are you seeing the same kind of thing?

 cheers,
 Jedd.
