cassandra-commits mailing list archives

From "Jon Haddad (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8494) incremental bootstrap
Date Tue, 16 Dec 2014 21:46:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249016#comment-14249016
] 

Jon Haddad commented on CASSANDRA-8494:
---------------------------------------

Well, it depends.  The issue is written on the assumption that we want to be able to increase
node density, and that currently bootstrapping a 20TB node is problematic.  If we're not going
to push node density, it might not be an issue, but I suspect sticking to "no more than 1TB
per node" is going to fly less and less over time.  

> incremental bootstrap
> ---------------------
>
>                 Key: CASSANDRA-8494
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8494
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jon Haddad
>            Priority: Minor
>
> Current bootstrapping involves (to my knowledge) picking tokens and streaming data before
> the node is available for requests.  This can be problematic with "fat nodes", since it may
> require 20TB of data to be streamed over before the machine can do anything useful.
> As a potential approach to mitigate the huge window of time before a node is available,
> I suggest modifying the bootstrap process to only acquire a single initial token before being
> marked UP.  This would likely be controlled by a configuration parameter such as
> "incremental_bootstrap".
> After the node is bootstrapped with this one token, it could go into UP state, and could
> then acquire additional tokens (one or a handful at a time), which would be streamed over
> while the node is active and serving requests.  The benefit here is that with the default
> 256 tokens a node could become an active part of the cluster with less than 1% of its final
> data streamed over.
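
The "less than 1%" figure in the description can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration, not Cassandra code: it assumes every token ends up owning an equal share of the node's final data (real vnode ranges vary in size), and the function names are hypothetical.

```python
from fractions import Fraction

NUM_TOKENS = 256  # default token count cited in the issue description


def fraction_streamed(tokens_acquired: int, num_tokens: int = NUM_TOKENS) -> Fraction:
    """Share of the node's final data held after acquiring some of its tokens.

    Assumption: each token owns an equal slice of the final data set.
    """
    if not 0 <= tokens_acquired <= num_tokens:
        raise ValueError("tokens_acquired must be between 0 and num_tokens")
    return Fraction(tokens_acquired, num_tokens)


def data_before_up_tb(final_size_tb: float, initial_tokens: int = 1,
                      num_tokens: int = NUM_TOKENS) -> float:
    """Terabytes streamed before the node could be marked UP (hypothetical model)."""
    return final_size_tb * float(fraction_streamed(initial_tokens, num_tokens))


if __name__ == "__main__":
    share = fraction_streamed(1)
    print(f"share after first token: {share} ({float(share):.4%})")
    print(f"data streamed before UP for a 20TB node: {data_before_up_tb(20)} TB")
```

Under that equal-share assumption, one token out of 256 is 1/256 of the data, so a 20TB node would need well under 0.1TB streamed before it could start serving requests.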



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
