cassandra-user mailing list archives

From Omri Bahumi <om...@everything.me>
Subject Re: Cassandra 2.1.2 node stuck on joining the cluster
Date Mon, 08 Dec 2014 19:13:58 GMT
Is there any chance that something along the network path is causing
connectivity issues?
What does the network connectivity between this node and the other nodes
look like?

Can you try transferring a big file between the two servers? Perhaps
you have an MTU issue that causes TCP path MTU discovery to fail.
Can you also send large pings between the servers? Try pinging from
both sides with large packets (5000 and 10000 bytes).
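
To make the ping test a bit more precise, here is a rough sketch of how the
probe sizes could be worked out. It assumes Linux with iputils ping (where
`-M do` sets the Don't Fragment bit), plain IPv4 with no IP options, and
candidate MTUs of 1500 and 9000; the helper names are made up for
illustration, and 192.168.200.16 is simply the peer address from your
netstats output.

```python
import shlex

IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host: str, mtu: int, count: int = 3) -> str:
    """Build a Linux (iputils) ping command line with fragmentation prohibited."""
    size = icmp_payload_for_mtu(mtu)
    # -M do: set Don't Fragment; -s: payload size in bytes; -c: number of probes
    args = ["ping", "-M", "do", "-s", str(size), "-c", str(count), host]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Candidate MTUs are assumptions: standard Ethernet and jumbo frames.
    for mtu in (1500, 9000):
        print(ping_command("192.168.200.16", mtu))
```

Run the printed commands from both sides. If the largest size for a given MTU
fails while smaller sizes get through, something on the path probably has a
smaller MTU than the endpoints expect.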

On Mon, Dec 8, 2014 at 3:22 PM, Krzysztof Zarzycki <k.zarzycki@gmail.com> wrote:
> Hi Cassandra users,
>
> I'm trying, but failing, to join a new (well, old, but wiped out and
> decommissioned) node to an existing cluster.
>
> Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I
> start a third node with 2.1.2; it gets to the joining state and bootstraps,
> i.e. it streams some data as shown by nodetool netstats, but after some
> time it gets stuck. From that point on nothing gets streamed and the new
> node stays in the joining state. I restarted the node multiple times; each
> time it streamed more data, but then got stuck again.
>
> Other facts:
>
> I don't see any errors in the log on any of the nodes.
> The connectivity seems fine: I can ping and netcat to port 7000 in all
> directions.
> I have ~200 GB of load per running node, replication factor 2, 16 tokens.
> The load of the new node has reached around 300 GB now.
>
> The bootstrapping process stops in the middle of streaming some table,
> always after sending exactly 10MB of some SSTable, e.g.:
>
> $ nodetool netstats | grep -P -v "bytes\(100"
> Mode: NORMAL
> Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
>     /192.168.200.16
>         Sending 516 files, 124933333900 bytes total
>             /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db
>             10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
> Read Repair Statistics:
> Attempted: 2016371
> Mismatch (Blocking): 0
> Mismatch (Background): 168721
> Pool Name    Active   Pending   Completed
> Commands        n/a         0    55802918
> Responses       n/a         0      425963
>
>
> I've been trying to join this node for several days and I don't know what
> else to do with it... I'll be grateful for any help!
>
>
> Cheers,
>
> Krzysztof Zarzycki
>
>
