incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <>
Subject Decommissioned nodes not leaving and Hinted Handoff flood
Date Tue, 09 Jul 2013 12:47:18 GMT


I removed 4 nodes with "nodetool decommission". 2 of them left with no
issue, while the other 2 nodes remained in the "leaving" state even after
streaming their data.
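
A minimal sketch for spotting nodes stuck in this state. It assumes a Cassandra version where "nodetool status" is available and prints a two-letter status/state code in the first column (for example UL for Up/Leaving) followed by the node address. The helper name leaving_nodes is hypothetical.

```shell
# Print the address of every node whose state is "Leaving".
# Reads `nodetool status` output on stdin. The first column is assumed to be
# a two-letter code: status (U or D) followed by state (N, L, J or M).
leaving_nodes() {
    awk '$1 ~ /^[UD]L$/ { print $2 }'
}

# Usage against a live cluster (assumes nodetool is on PATH):
#   nodetool status | leaving_nodes
```

Running this a few times after streaming has finished shows whether a node is still listed as leaving.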

The only thing specific to these 2 nodes is that they had a lot of hints
pending. These were hints for a node that couldn't come back and that I had
removed earlier (because of the heavy load induced by Hinted Handoff while it
was coming back, which caused a lot of latency in our app. This node didn't
manage to come back after 10 minutes, so I removed it).

So I faced 3 bugs (or problems):

1 - At first, one of my nodes went down for 5 min, and when it came back it
got flooded by Hinted Handoff so hard that it could not handle the real-time
queries properly. I haven't found a way to prioritize app queries over
Hinted Handoff.
2 - Nodes keep hints for a node that has been removed.
3 - Nodes with 500MB to 3GB of hints stored for a removed node can't be
decommissioned; they get stuck after streaming their data.

As solutions for these 3 issues, I did the following:

Solution to 1 - I removed this down node (nodetool removenode)
Solution to 2 - Stop the node and remove the system hints
Solution to 3 - Stop the node and use removenode instead of decommission
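
The workarounds above could be sketched roughly as follows. This is only an illustration under assumptions, not the exact commands that were run: the service name, the use of sudo, the hints directory path, and the helper names run, clear_hints_on_stopped_node and remove_stopped_node are all hypothetical. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# On the affected node: stop Cassandra, then delete the stored hint files.
# $1 is the directory holding the system hints (assumed to be something
# like <data_dir>/system/hints; check your own data_file_directories).
clear_hints_on_stopped_node() {
    hints_dir="$1"
    run sudo service cassandra stop
    run find "$hints_dir" -type f -delete
}

# From another live node: remove the stopped node from the ring.
# $1 is the Host ID of the stopped node, as shown by `nodetool status`.
remove_stopped_node() {
    host_id="$1"
    run nodetool removenode "$host_id"
}
```

Calling remove_stopped_node with a Host ID prints the nodetool command first, so it can be reviewed before setting DRY_RUN=0.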

Now I have no more issues, yet I felt I had to report this. Maybe my
experience can help users get out of tricky situations and help committers
detect some issues, especially with hinted handoff.

