cassandra-commits mailing list archives

From "Sandeep Tata (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-195) Improve bootstrap algorithm
Date Tue, 18 Aug 2009 18:04:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744619#action_12744619 ]

Sandeep Tata edited comment on CASSANDRA-195 at 8/18/09 11:03 AM:
------------------------------------------------------------------

>>I don't see where this fixes the serve-reads-from-the-old-nodes-until-bootstrap-complete
problem, can you give a high-level summary of how that works? 

The tokenMetadata_.update calls now require a bootstrap flag. If this flag is true, the new
node is added to a separate set bootstrapNodes and does not affect calculations for sending
reads and *writes*. 

The node only surfaces in the ring once bootstrap is completed. The new node will *not* have
the writes that arrived during bootstrap -- a consequence of the changes that accommodate
the replication=1 case. We will need writes to see a different ring (different result for
getStorageEndPoints) than the reads -- that's not in this patch.
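
To make the summary above concrete, here is a minimal sketch of the idea, not the code in the attached patches. The class name TokenMetadataSketch and the methods routableEndPoints and isBootstrapping are hypothetical; only the bootstrap flag on update and the separate bootstrapNodes set come from the description above. Tokens and endpoints are plain strings to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; the real TokenMetadata class differs.
class TokenMetadataSketch {
    // token -> endpoint for nodes that are visible in the ring
    private final Map<String, String> tokenToEndPoint = new HashMap<>();
    // endpoints that are still bootstrapping; ignored when routing
    private final Set<String> bootstrapNodes = new HashSet<>();

    // With bootstrap == true the node is only remembered as bootstrapping.
    // With bootstrap == false it surfaces in the ring.
    void update(String token, String endPoint, boolean bootstrap) {
        if (bootstrap) {
            bootstrapNodes.add(endPoint);
            return;
        }
        bootstrapNodes.remove(endPoint);
        tokenToEndPoint.put(token, endPoint);
    }

    // Endpoints considered when sending reads and writes.
    Set<String> routableEndPoints() {
        return new HashSet<>(tokenToEndPoint.values());
    }

    boolean isBootstrapping(String endPoint) {
        return bootstrapNodes.contains(endPoint);
    }
}
```

In this sketch a bootstrapping node never shows up in routableEndPoints, which is also why it misses the writes that arrive during bootstrap.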

>>maybe I don't understand how this works -- to me that looks like if we send any update
about a node, everyone will clear the bootstrap flag on it, unless we explicitly set bootstrap
to true. shouldn't we only update bootstrap status when it's explicitly set to true or false?

Hmm, I thought the gossiper always sent the full endpoint state, so it should always include
bootstrap status. If it isn't sent, it means the node is not bootstrapping. See Gossiper.GossipDigestAckVerbHandler.

Even otherwise, if oldToken == newToken, the only action taken is deliverHints. The bootstrap
status is not cleared.
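
A rough sketch of that last point, under assumptions: the class NodeStateSketch, its fields, and the onGossip method are hypothetical names for illustration and are not taken from the patch. Only the rule itself (same token means deliverHints and nothing else) comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the oldToken == newToken case.
class NodeStateSketch {
    String token;
    boolean bootstrapping;
    // Records the actions taken, so the behaviour is easy to inspect.
    final List<String> actions = new ArrayList<>();

    NodeStateSketch(String token, boolean bootstrapping) {
        this.token = token;
        this.bootstrapping = bootstrapping;
    }

    void onGossip(String newToken, boolean bootstrapFlag) {
        if (newToken.equals(token)) {
            // Same token: only deliver hints, leave bootstrap status alone.
            actions.add("deliverHints");
            return;
        }
        // Token changed: take over the new token and the reported status.
        token = newToken;
        bootstrapping = bootstrapFlag;
        actions.add("updateToken");
    }
}
```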

      was (Author: sandeep_tata):
    >>I don't see where this fixes the serve-reads-from-the-old-nodes-until-bootstrap-complete
problem, can you give a high-level summary of how that works? 

The tokenMetadata_.update calls now require a bootstrap flag. If this flag is true, the new
node is added to a separate set bootstrapNodes and does not affect calculations for sending
reads and *writes*. 

The node only surfaces in the ring once bootstrap is completed. The new node will *not* have
the writes that arrived during bootstrap -- a consequence of the changes that accommodate
the replication=1 case. We will need writes to see a different ring (different result for
getStorageEndPoints) than the reads -- that's not in this patch.

>>maybe I don't understand how this works -- to me that looks like if we send any update
about a node, everyone will clear the bootstrap flag on it, unless we explicitly set bootstrap
to true. shouldn't we only update bootstrap status when it's explicitly set to true or false?

Hmm, I thought the gossiper always sent the full endpoint state, so it should always include
bootstrap status. If it isn't sent, it means the node is not bootstrapping. See:

Gossiper.makeRandomGossipDigest:
      EndPointState epState = endPointStateMap_.get(localEndPoint_);

Even otherwise, if oldToken == newToken, the only action taken is deliverHints. The bootstrap
status is not cleared.
  
> Improve bootstrap algorithm
> ---------------------------
>
>                 Key: CASSANDRA-195
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-195
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: all
>            Reporter: Sandeep Tata
>            Assignee: Sandeep Tata
>             Fix For: 0.5
>
>         Attachments: 195-v1.patch, 195-v2.patch, 195-v3-delta1.patch, 195-v3.patch, 195-v4.patch
>
>
> When you add a node to an existing cluster and the map gets updated, the new node may
respond to read requests by saying it doesn't have any of the data until it gets the data
from the node(s) that previously owned this range (the load-balancing code, when working properly,
can take care of this). While this behaviour is compatible with eventual consistency, it would
be much friendlier for the new node not to "surface" in the EndPoint maps for reads until
it has transferred the data over from the old nodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
