cassandra-commits mailing list archives

From "Robert Coli (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4162) nodetool disablegossip does not prevent gossip delivery of writes via already-initiated hinted handoff
Date Wed, 18 Apr 2012 17:08:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256705#comment-13256705 ]

Robert Coli commented on CASSANDRA-4162:
----------------------------------------

> Restarting with -Dcassandra.join_ring=false will do that.

It will also incur a sizable startup penalty, one far more severe in Cassandra
than in most other databases. I can only speak for myself, but I don't want to pay a startup
penalty (which in the real world can be, say, a half hour of clock time!) if I don't have to.
I think most operators who use "disablegossip" and "disablethrift" have the goal of removing
a node from the cluster while keeping it running, in order to avoid this startup penalty.

While I now understand that "dead" has a very specific meaning in Cassandra, one that relates
only to Gossip state, I think it is unambiguous that, given the typical semantic meaning of
"dead" and "alive", people do not expect a "dead" node to be accepting writes. As explicated
in "The Princess Bride," there is a significant difference between "mostly dead" and "all
dead."

"
Miracle Max: Whoo-hoo-hoo, look who knows so much. It just so happens that your friend here
is only MOSTLY dead. There's a big difference between mostly dead and all dead. Mostly dead
is slightly alive. With all dead, well, with all dead there's usually only one thing you can
do. 

Inigo Montoya: What's that? 

Miracle Max: Go through his clothes and look for loose change.
"

My goal with this ticket is to establish the best practice for an operator who wants to make
sure his node is not receiving traffic, but is still up and capable of compacting or rejoining
the cluster without paying the startup penalty. So far, it seems that the best solution is to use
iptables to firewall off port 7000.

It is difficult to understand the purpose of "disablethrift" and "disablegossip" if the combination
of the two does not render the node "all dead." I believe most operators will expect them
to render a node "all dead." At the very minimum, it seems inappropriate to state in the help
that nodetool disablegossip renders a node "dead" when in fact it renders it "mostly dead."
                
> nodetool disablegossip does not prevent gossip delivery of writes via already-initiated hinted handoff
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4162
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4162
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.9
>         Environment: reported on IRC, believe it was a Linux environment, nick "rhone", cassandra 1.0.8
>            Reporter: Robert Coli
>            Priority: Minor
>              Labels: gossip
>
> This ticket derives from #cassandra, where aaron_morton and I assisted a user who had run "disablethrift"
> and "disablegossip" and was confused as to why he was seeing writes to his node.
> Aaron and I went through a series of debugging questions, and the user verified that there was
> traffic on the gossip port. His node was showing as down from the perspective of other nodes,
> and nodetool also showed that gossip was not active.
> Aaron read the code and had the user turn debug logging on. The user saw Hinted Handoff
> messages being delivered, and Aaron confirmed in the code that a hinted handoff delivery session
> only checks gossip state when it first starts. As a result, it will continue to deliver hints
> and disregard gossip state on the target node.
> Per the nodetool docs:
> "
> disablegossip          - Disable gossip (effectively marking the node dead)
> "
> I believe most people will be using disablegossip and disablethrift for operational reasons,
> and propose that they do not expect HH delivery to continue, via gossip, when they have run
> "disablegossip".
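
To illustrate the behavior described above, here is a minimal sketch of a hint delivery session that checks the target's state only when it starts, next to one that re-checks before every hint. This is a simplified model and not Cassandra's actual code: the Target class, its "alive" flag, and both function names are hypothetical, and the "alive" flag merely stands in for whatever gossip state the delivering node consults.

```python
class Target:
    """Hypothetical stand-in for a node receiving hints (not a Cassandra class)."""

    def __init__(self, disable_after):
        self.alive = True              # models "gossip enabled / seen as up"
        self.received = []
        self.disable_after = disable_after

    def receive(self, hint):
        self.received.append(hint)
        # Model an operator running "disablegossip" partway through the session.
        if len(self.received) == self.disable_after:
            self.alive = False


def deliver_checking_once(target, hints):
    """Reported behavior: state is checked only when the session starts."""
    if not target.alive:
        return
    for hint in hints:
        target.receive(hint)           # delivered even after the target is marked down


def deliver_checking_each(target, hints):
    """Alternative: re-check state before every hint and stop once it is down."""
    for hint in hints:
        if not target.alive:
            break
        target.receive(hint)


hints = ["h1", "h2", "h3", "h4", "h5"]

once = Target(disable_after=2)
deliver_checking_once(once, hints)

each = Target(disable_after=2)
deliver_checking_each(each, hints)

print(len(once.received), len(each.received))
```

In this model the first session keeps delivering after the target is marked down, while the second stops at the next hint. Whether a per-hint check is the right fix for Cassandra itself is a separate question that this sketch does not settle.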

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
