cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-195) Improve bootstrap algorithm
Date Wed, 12 Aug 2009 20:17:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742556#action_12742556
] 

Jonathan Ellis commented on CASSANDRA-195:
------------------------------------------

Thanks a lot for the patch, and the cleanup -- that makes it a lot easier to follow!

overall this is going in the right direction.  some comments:

command line stuff:

the Java Way to do this would be to define -Dbootstrap.  I'm pretty sure bin/cassandra correctly
propagates args to the jvm like that.  (if it does not, we should open a ticket for that.)
 then rather than passing a bootstrap variable around, just have the right place in the code
query the environment.

+            if (StorageService.instance().isBootstrapMode())
+            {
+               /* Don't service reads! */
+               return;
+            }

This should throw some kind of error -- either the other nodes are buggy, sending reads to
a bootstrapping node, or your ops team has screwed up and is allowing client-level traffic
to the node.  neither should fail silently.

Similarly, we should add a verifyNotBootstrapping (better name?) call to CassandraServer instead
of adding && !bootstrapping to all the StorageProxy calls.  (There may be a place
we can do this centrally by overriding the right method from the thrift-generated Cassandra
interface.)

+            cf.addColumn(new Column(BOOTSTRAP, BasicUtilities.shortToByteArray(isBootstrap?(short)1:(short)0)));
+            return new StorageMetadata(token, generation,isBootstrap);

not to be anal, but watch the spacing on the ?: operator and method arguments (this comment
applies to other patch lines too)

+        /* Stored value overrides passed in value */
+        IColumn bootstrapColumn = cf.getColumn(BOOTSTRAP);
+        boolean readBootstrap = (BasicUtilities.byteArrayToShort(bootstrapColumn.value())==1)?true:false;
+        if (!isBootstrap && readBootstrap)
+        {
+            logger_.warn("Probably failed a previous bootstrap! Starting in bootstrap mode.");
+        }

I don't think storing the mode in ST and trying to second-guess the op is a good idea.  If
the op says -b, then we bootstrap.  Otherwise we don't.

+    protected boolean localOnly;

I think this means "send messages out to other nodes to bootstrap me" if true, otherwise,
bootstrap some other node.  Is that right?  It seems like those two operations should be in
difference classes, not a single class doing two different things based on an if statement.

+        if (StorageService.instance().isBootstrapMode())
+        {
+            logger_.error("Cannot bootstrap another node: I'm in bootstrap mode myself!");
+            return;
+        }

If node Z asks node A to bootstrap it, it should be Z's responsibility to make sure that A
is not in bootstrap mode.  (note that bootstrap mode only changes from True to False, never
the other way.)  so I would replace this with an assert !StorageService.instance().isBootstrapMode()

+    public static synchronized SSTableReader renameAndOpen(String dataFileName) throws IOException

fits better in SSTW -- Reader shouldn't be renaming things just as a conceptual purity thing.
:)


> Improve bootstrap algorithm
> ---------------------------
>
>                 Key: CASSANDRA-195
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-195
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: all
>            Reporter: Sandeep Tata
>            Assignee: Sandeep Tata
>             Fix For: 0.5
>
>         Attachments: 195-v1.patch, 195-v2.patch
>
>
> When you add a node to an existing cluster and the map gets updated, the new node may
respond to read requests by saying it doesn't have any of the data until it gets the data
from the node(s) the previously owned this range (the load-balancing code, when working properly
can take care of this). While this behaviour is compatible with eventual consistency, it would
be much friendlier for the new node not to "surface" in the EndPoint maps for reads until
it has transferred the data over from the old nodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message