incubator-cassandra-user mailing list archives

From Keith Wright <kwri...@nanigans.com>
Subject Re: Auto-Bootstrap not Auto-Bootstrapping?
Date Thu, 06 Feb 2014 21:42:06 GMT
Is it a seed node? My understanding is that seed nodes do not bootstrap.
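
A minimal sketch of the check being asked about, assuming the node's own address and the comma-separated `seeds` string from its cassandra.yaml are already at hand. The helper name and the addresses are hypothetical and only illustrate the idea; this is not a Cassandra API.

```python
def is_own_seed(node_address: str, seeds: str) -> bool:
    """Return True if node_address appears in the comma-separated seeds string."""
    # cassandra.yaml stores seeds as one comma-delimited string, e.g. "10.0.0.1,10.0.0.2"
    seed_list = [s.strip() for s in seeds.split(",") if s.strip()]
    return node_address in seed_list


# Hypothetical addresses: the rebuilt node reuses an IP that is in the seed list.
print(is_own_seed("10.0.0.1", "10.0.0.1, 10.0.0.2"))
print(is_own_seed("10.0.0.3", "10.0.0.1, 10.0.0.2"))
```

If the first case matches the rebuilt node, it would be worth taking it out of its own seed list before starting it again.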

On Feb 6, 2014 4:23 PM, Thunder Stumpges <thunder.stumpges@gmail.com> wrote:
Hi all,

We recently needed to reconfigure the disks for our 3-node Cassandra 2.0.4 cluster
and rebuild the server at the same time. Upon adding the newly rebuilt server to the cluster,
it immediately started serving read requests with no data! Then, because its latency was so
"good", the vast majority of requests were pushed onto that server. We are using 3 nodes with
RF=3. Why wouldn't the node stream in the needed data before serving? My impression was that
the auto_bootstrap setting was true by default (we have not set it anywhere) and that a new
node entering the cluster would stream in data for its tokens (virtual nodes) prior to serving
requests.

Does this have to do with re-using the same name/ip as the old server which also happens to
be in the seed list on our clients and in cassandra.yaml ?

Our admin did the following steps during this process:

- Stop one of the 3 servers. It then appeared as DOWN to the rest of the cluster.
- Rebuild the system, reconfigure disks (name and ip are same as the server that came down)
  - NOTE: there was NO data left from before on this machine, it is a new bare-metal install
- nodetool removenode <old_host_id> (from one of the other remaining nodes)
  - wait for completion ~15 min
- Start cassandra on new node, wait for it to come up
- nodetool repair (on new node)

Immediately when it came up it was as if we'd lost 1/3 of our data because so many read requests
were hitting this new empty node. There does appear to be streaming data coming into the new
node, but it is still serving many empty responses.

Another curious thing is that I set all of our reads to QUORUM ahead of time, hoping that if this
happened again (the first time caught us out), the quorum reads would prevent
the inconsistent results. This does not appear to have helped.
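
For reference, a rough sketch of the quorum arithmetic for this setup. The function names are illustrative only and not part of any Cassandra API. With RF=3, a QUORUM read consults 2 replicas, so one of them can be the empty node. Whether the read is guaranteed to overlap a replica that holds the data depends on the write consistency level, which is not stated above, so both cases are shown.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum for the given replication factor."""
    return replication_factor // 2 + 1


def reads_overlap_writes(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > rf


rf = 3
q = quorum(rf)
print(q)                                # 2
print(reads_overlap_writes(rf, q, q))   # True: QUORUM reads with QUORUM writes
print(reads_overlap_writes(rf, q, 1))   # False: QUORUM reads with writes acknowledged by one replica
```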

Any insight as to what the heck went wrong here would be greatly appreciated.

Thanks,
Thunder

