cassandra-user mailing list archives

From sai krishnam raju potturi <pskraj...@gmail.com>
Subject Re: Re : Nodetool Cleanup on multiple nodes in parallel
Date Fri, 09 Oct 2015 14:07:55 GMT
Thanks Jonathan. I see an advantage in doing it one AZ or rack at a time.

On Thu, Oct 8, 2015 at 6:41 PM, Jonathan Haddad <jon@jonhaddad.com> wrote:

> My hunch is the bigger your cluster the less impact it will have, as each
> node takes part in smaller and smaller % of total queries.  Considering
> that compaction is always happening, I'd wager if you've got a big cluster
> (as you say you do) you'll probably be ok running several cleanups at a
> time.
>
> I'd say start one, see how your perf is impacted (if at all) and go from
> there.
>
> If you're running a proper snitch you could probably do an entire rack /
> AZ at a time.
>
>
> On Thu, Oct 8, 2015 at 3:08 PM sai krishnam raju potturi <
> pskraju88@gmail.com> wrote:
>
>> We plan to do it during non-peak hours, when customer traffic is lower.
>> That works out to about 10 nodes a day, which is concerning as we have
>> other data centers to be expanded eventually.
>>
>> Cleanup is similar to compaction, which is CPU intensive, so it will
>> affect reads if this data center is serving traffic. Is running cleanup
>> in parallel advisable?
>>
>> On Thu, Oct 8, 2015, 17:53 Jonathan Haddad <jon@jonhaddad.com> wrote:
>>
>>> Unless you're close to running out of disk space, what's the harm in it
>>> taking a while?  How big is your DC?  At 45 min per node, you can do 32
>>> nodes a day.  Diverting traffic away from a DC just to run cleanup feels
>>> like overkill to me.
>>>
>>>
>>>
>>> On Thu, Oct 8, 2015 at 2:39 PM sai krishnam raju potturi <
>>> pskraju88@gmail.com> wrote:
>>>
>>>> Hi,
>>>>    Our Cassandra cluster currently uses DSE 4.6. The underlying
>>>> Cassandra version is 2.0.14.
>>>>
>>>> We are planning on adding multiple nodes to one of our datacenters.
>>>> This requires "nodetool cleanup". The "nodetool cleanup" operation
>>>> takes around 45 mins for each node.
>>>>
>>>> Datastax documentation recommends running "nodetool cleanup" for one
>>>> node at a time. That would be really long, owing to the size of our
>>>> datacenter.
>>>>
>>>> If we were to divert the read and write traffic away from a particular
>>>> datacenter, could we run "cleanup" on multiple nodes in parallel for
>>>> that datacenter??
>>>>
>>>>
>>>> http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
>>>>
>>>>
>>>> thanks
>>>> Sai
>>>>
>>>
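
The rack-at-a-time approach discussed above can be sketched roughly as follows. This is only an illustration, not a tested operational tool: the host addresses and rack names are hypothetical, the helper names (group_by_rack, cleanup_command, estimate_minutes, run_rack) are made up for this example, and it assumes "nodetool -h <host> cleanup" can reach each node's JMX port from wherever the script runs. The 45 minute figure is the per-node time quoted in the thread; the estimate treats every cleanup as taking that long, which real nodes will not do exactly.

```python
import math
import subprocess
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Per-node cleanup time quoted in the thread; real timings will vary.
CLEANUP_MINUTES_PER_NODE = 45


def group_by_rack(nodes):
    """Turn (host, rack) pairs into {rack: [hosts]}, preserving input order."""
    racks = defaultdict(list)
    for host, rack in nodes:
        racks[rack].append(host)
    return dict(racks)


def cleanup_command(host):
    """Build the nodetool invocation for one node (assumes remote JMX access)."""
    return ["nodetool", "-h", host, "cleanup"]


def estimate_minutes(nodes, parallel_per_rack):
    """Rough wall-clock estimate: racks run one after another, and the hosts
    in a rack are processed in waves of `parallel_per_rack`."""
    total = 0
    for hosts in group_by_rack(nodes).values():
        waves = math.ceil(len(hosts) / parallel_per_rack)
        total += waves * CLEANUP_MINUTES_PER_NODE
    return total


def run_rack(hosts, parallel, dry_run=True):
    """Run cleanup on one rack, at most `parallel` nodes at a time.

    With dry_run=True (the default) nothing is executed; the commands that
    would be run are returned so they can be reviewed first.
    """
    commands = [cleanup_command(h) for h in hosts]
    if dry_run:
        return commands
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # check=True raises if nodetool exits non-zero on any node.
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))
    return commands


if __name__ == "__main__":
    # Hypothetical topology, for illustration only.
    nodes = [("10.0.0.1", "rack1"), ("10.0.0.2", "rack1"), ("10.0.0.3", "rack2")]
    for rack, hosts in group_by_rack(nodes).items():
        print(rack, run_rack(hosts, parallel=2))
    print("estimated minutes:", estimate_minutes(nodes, parallel_per_rack=2))
```

Following the advice in the thread, it seems sensible to start with a single node (parallel=1), check the effect on read latency, and only then raise the parallelism or move to a whole rack at a time.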
