cassandra-user mailing list archives

From Anubhav Kale <Anubhav.K...@microsoft.com>
Subject RE: Nodetool rebuild question
Date Wed, 05 Oct 2016 20:56:47 GMT
Thanks.

We always set RF to 0 and then “removenode” all nodes in the DC that we want to decommission,
so I highly doubt that is the problem. Also, the SSTable count on a given node averages ~2000
(we have 140 nodes in one ring, and two rings overall).

From: Jeff Jirsa [mailto:jeff.jirsa@crowdstrike.com]
Sent: Wednesday, October 5, 2016 1:44 PM
To: user@cassandra.apache.org
Subject: Re: Nodetool rebuild question

Both of your statements are true.

During your decom, you likely streamed LOTS of sstables to the remaining nodes (especially
true if you didn’t drop the replication factor to 0 for the DC you decommissioned). Since
those tens of thousands of sstables take a while to compact, if you then rebuild (or bootstrap)
before compaction is done, you’ll get a LOT of extra sstables.
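
A minimal sketch of checking whether compaction has caught up before starting a rebuild or
bootstrap. It assumes the text printed by `nodetool compactionstats` contains a line of the
form "pending tasks: N"; the function names and the threshold parameter are hypothetical and
only for illustration.

```python
import re


def pending_compactions(compactionstats_output: str) -> int:
    """Return the pending task count parsed from compactionstats text.

    Assumption: the output contains a line starting with 'pending tasks: N'.
    """
    match = re.search(r"^pending tasks:\s*(\d+)", compactionstats_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'pending tasks' line found in output")
    return int(match.group(1))


def safe_to_rebuild(compactionstats_output: str, threshold: int = 0) -> bool:
    """True when the pending compaction count is at or below the threshold."""
    return pending_compactions(compactionstats_output) <= threshold
```

The text passed in could come from running `nodetool compactionstats` on each node that will
act as a streaming source, for example via `subprocess.run(..., capture_output=True, text=True)`.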

This is one of the reasons that people with large clusters don’t use vnodes – if you needed
to bootstrap ~100 more nodes into a cluster, you’d have to wait potentially a day or more
per node to compact away the leftovers before bootstrapping the next, which is prohibitive
at scale.


- Jeff

From: Anubhav Kale <Anubhav.Kale@microsoft.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, October 5, 2016 at 1:34 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Nodetool rebuild question

Hello,

As part of rebuild, I noticed that the destination node gets -tmp- files from other nodes.
Are the following statements correct?


1. The files are written to disk without going through memtables.

2. Regular compactors eventually compact them to bring down # SSTables to a reasonable
number.

We have noticed that the destination node created more than 40K *Data* files in the first hour
of streaming alone. We have not seen this pattern before, so we are trying to understand what could
have changed. (We do use vnodes, and we haven’t added nodes recently, but we have decommissioned
a DC.)
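
One way to track the *Data* file count described above is to walk the node's data directory
and count SSTable data files. This is a rough sketch, assuming data files end in "-Data.db",
that in-flight streamed files carry "-tmp-" in their names (as observed in this thread), and
that "snapshots" and "backups" subdirectories should be skipped. The function name and the
"live"/"tmp" labels are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Subdirectories whose files are copies, not live SSTables (assumption).
SKIPPED_DIRS = {"snapshots", "backups"}


def count_data_files(data_dir: str) -> Counter:
    """Count *-Data.db files under data_dir, split into 'live' and 'tmp'."""
    counts = Counter()
    for path in Path(data_dir).rglob("*-Data.db"):
        # Ignore anything that sits below a snapshots/ or backups/ directory.
        if SKIPPED_DIRS.intersection(path.parts):
            continue
        kind = "tmp" if "-tmp-" in path.name else "live"
        counts[kind] += 1
    return counts
```

Calling this periodically during a rebuild and comparing successive counts gives a rough view
of how fast streamed files arrive relative to how fast compaction merges them.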

Thanks much!