cassandra-user mailing list archives

From Stan Lemon <sle...@salesforce.com>
Subject Re: A tale of a node that never joins...
Date Wed, 19 Nov 2014 16:22:01 GMT
We are currently using Cassandra 2.0.11.

Thanks,
Stan


> Hello Stan
>
>  Which version of Cassandra are you using? There are some known streaming
> failures that can prevent a node from finishing joining.
>
>  Regards
>
> On Wed, Nov 19, 2014 at 3:57 PM, Stan Lemon <slemon@salesforce.com> wrote:
>
> > Hello,
> > I'm working on a two data center cluster with 12 nodes in each data
> > center. I recently wanted to add a thirteenth node to one of the data
> > centers to try and validate some load improvements to our hardware
> > configuration. I added the node following DataStax directions
> > (http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html)
> > and the node appeared to bootstrap correctly and start joining.
> >
> > I monitored the load and watched it increase, periodically checking iotop
> > to make sure there was still a pulse. Eventually the load topped out at
> > roughly 85% of the average of the other nodes, while iotop showed lots of
> > activity. After a few hours iotop stopped showing activity and the node's
> > load had gone down a small amount, ~50-100 MB. Average load on the other
> > nodes is about 550 GB.
> >
> > The first time I tried this I let the process run through the weekend,
> > periodically checking on it. Something happened Monday morning which
> > caused Cassandra to die, so I restarted the process. The load immediately
> > began growing, eventually doubling that 85% marker and settling in around
> > 935 GB, way more than any other node. When it reached this point it did
> > the same thing though: it basically stalled out.
> >
> > The whole time nodetool status just showed "UJ".
> >
> > Finally I aborted, cleared the node's data directory, and started over,
> > but again experienced the same stall-out at the 85% mark. The node took
> > no time at all to get to that point; it was only a few hours. It has now
> > been sitting at 85% for roughly 20 hours and iotop shows no activity.
> >
> > I am wondering a few things...
> > 1. What's going on?
> > 2. How do I get more information about what is happening with the join
> > process?
> > 3. Has anyone seen this before?
> >
> > Thanks for your help,
> > Stan
> >
> >
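
On question 2 in the quoted message: `nodetool netstats` run on the joining
node reports streaming activity, and `nodetool status` reports the `UJ`
state mentioned above. As a rough sketch only, the snippet below picks
joining nodes out of status text so the check can be repeated from a script.
The sample text, addresses, loads, and host IDs are made up for
illustration, and the regular expression assumes the usual column layout
(state, address, load, unit), which may differ between versions.

```python
import re

# Hypothetical sample resembling `nodetool status` output; every value
# here is invented for illustration and is not from the thread.
SAMPLE = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID    Rack
UN  10.0.0.1   550.1 GB   256     8.3%   aaaa-1111  RAC1
UJ  10.0.0.13  467.5 GB   256     ?      bbbb-2222  RAC1
"""

# Assumed row layout: two-letter state, address, load value, load unit.
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)\s+([\d.]+)\s+(KB|MB|GB|TB|bytes)\b")


def parse_status(text):
    """Return (state, address, load, unit) tuples for each node row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            state, addr, size, unit = match.groups()
            rows.append((state, addr, float(size), unit))
    return rows


def joining_nodes(text):
    """Return the addresses of nodes whose state is Up/Joining (UJ)."""
    return [addr for state, addr, _, _ in parse_status(text) if state == "UJ"]
```

Feeding it the captured output of `nodetool status` at intervals would show
whether the node ever leaves `UJ`; it says nothing about why a stream
stalled, so the node's system log is still the place to look for that.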
