cassandra-user mailing list archives

From Edward Capriolo <>
Subject Re: Scaling a cassandra cluster with auto_bootstrap set to false
Date Thu, 13 Jun 2013 21:20:38 GMT
CL.ONE requests for rows which do not exist are very fast.

On Thu, Jun 13, 2013 at 3:47 PM, Robert Coli <> wrote:

> On Thu, Jun 13, 2013 at 10:47 AM, Markus Klems <>
> wrote:
> > One scaling strategy seems interesting but we don't
> > fully understand what is going on, yet. The strategy works like this:
> > add new nodes to a Cassandra cluster with "auto_bootstrap = false" to
> > avoid streaming to the new nodes.
> If you set auto_bootstrap to false, new nodes take over responsibility
> for a range of the ring but do not receive the data for the range from
> the old nodes. If you read the new node at CL.ONE, you will get the
> answer that data you wrote to the old node does not exist, because the
> new node did not receive it as part of bootstrap. This is probably not
> what you expect.
> > We were a bit surprised that this
> > strategy improved performance considerably and that it worked much
> > better than other strategies that we tried before, both in terms of
> > scaling speed and performance impact during scaling.
> CL.ONE requests for rows which do not exist are very fast.
> > Would it be necessary (in a production environment) to stream the old
> > SSTables from the other four nodes at some point in time?
> Bootstrapping is necessary for consistency and durability, yes. If you
> were to:
> 1) start new node without bootstrapping it
> 2) run "cleanup" compaction on the old node
> You would permanently delete the copy of the data that is no longer
> "supposed" to live on the old node. With an RF of 1, that data would be
> permanently gone. With an RF of >1 you have other copies, but if you
> never bootstrap while adding new nodes you are relatively likely to
> not be able to access those copies over time.
> =Rob
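
The behaviour Rob describes can be illustrated with a toy model. This is only a sketch under stated assumptions: the Node and Ring classes, the integer tokens, and the single-replica (RF=1) placement are hypothetical names and simplifications made up for illustration, not Cassandra's implementation or API. It shows a new node joining without bootstrap, a CL.ONE-style read missing data that still sits on the old node, and a cleanup on the old node then removing the only copy.

```python
from bisect import bisect_left


class Node:
    """Hypothetical stand-in for a Cassandra node (illustration only)."""

    def __init__(self, name, token):
        self.name = name
        self.token = token
        self.data = {}  # key -> value stored locally on this node


class Ring:
    """Toy token ring with RF=1: each key lives on exactly one owner."""

    def __init__(self):
        self.nodes = []  # kept sorted by token

    def owner(self, key):
        # The owner is the first node whose token is >= key, wrapping around.
        tokens = [n.token for n in self.nodes]
        return self.nodes[bisect_left(tokens, key) % len(self.nodes)]

    def add_node(self, node, bootstrap):
        self.nodes.append(node)
        self.nodes.sort(key=lambda n: n.token)
        if bootstrap:
            # Simplified "streaming": copy every key the new node now owns.
            for other in self.nodes:
                if other is node:
                    continue
                for key, value in other.data.items():
                    if self.owner(key) is node:
                        node.data[key] = value

    def write(self, key, value):
        self.owner(key).data[key] = value

    def read_one(self, key):
        # Ask only the current owner, like a CL.ONE read with RF=1.
        return self.owner(key).data.get(key)

    def cleanup(self, node):
        # Drop keys the node no longer owns, like a "cleanup" compaction.
        for key in list(node.data):
            if self.owner(key) is not node:
                del node.data[key]


# Scenario 1: join without bootstrap, then clean up the old node.
ring = Ring()
old = Node("old", 100)
ring.add_node(old, bootstrap=False)
ring.write(10, "x")
ring.write(60, "y")

new = Node("new", 50)
ring.add_node(new, bootstrap=False)   # takes over keys <= 50, receives no data
miss = ring.read_one(10)              # None: the new owner never got the row
still_on_old = 10 in old.data         # True: the row is still on the old node
ring.cleanup(old)
lost = 10 not in old.data and 10 not in new.data  # True: last copy removed

# Scenario 2: the same steps with bootstrap enabled.
ring2 = Ring()
old2 = Node("old", 100)
ring2.add_node(old2, bootstrap=False)
ring2.write(10, "x")
new2 = Node("new", 50)
ring2.add_node(new2, bootstrap=True)
hit = ring2.read_one(10)              # "x": the row was streamed to the new owner
```

In the first scenario the read of key 10 is fast precisely because the new owner has nothing to look up, which matches the performance improvement Markus observed. The second scenario shows why the read succeeds once the new node has received its range.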
