cassandra-dev mailing list archives

From Anubhav Kale <Anubhav.K...@microsoft.com.INVALID>
Subject RE: RemoveNode Behavior Question
Date Wed, 22 Feb 2017 18:51:50 GMT
But I don't understand how the replica count is getting restored here. The node that invoked
removenode only owns partial ranges.

-----Original Message-----
From: Brandon Williams [mailto:driftx@gmail.com] 
Sent: Wednesday, February 22, 2017 10:49 AM
To: dev@cassandra.apache.org
Subject: Re: RemoveNode Behavior Question

Every topology operation tries to respect/restore the RF except for assassinate.
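
To illustrate what "restoring the RF" means after a node is removed, here is a
toy model in Python. It is not Cassandra's code: it assumes one token per node
and a simplified, SimpleStrategy-like placement where the replicas of a range
are the next RF nodes clockwise on the ring. The names `replicas` and
`gained_ranges` are hypothetical helpers. Under these assumptions, removing a
node leaves each affected range one replica short, and a different surviving
node picks up each of those ranges.

```python
from bisect import bisect_left

def replicas(ring, token, rf):
    """Return the rf nodes found walking clockwise from token.

    ring is a sorted list of (token, node) pairs, one token per node
    (a simplifying assumption; real clusters may use vnodes).
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(min(rf, len(ring)))]

def gained_ranges(ring, removed, rf):
    """Map each surviving node to the range-end tokens it newly replicates."""
    after = [(t, n) for t, n in ring if n != removed]
    gained = {}
    for token, _ in ring:
        old = set(replicas(ring, token, rf))
        new = set(replicas(after, token, rf))
        # Any node in the new replica set but not the old one must
        # acquire the data for this range from a surviving replica.
        for node in sorted(new - old):
            gained.setdefault(node, []).append(token)
    return gained

ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D"), (100, "E")]
print(gained_ranges(ring, removed="C", rf=3))
# {'D': [0], 'E': [25], 'A': [50]}
```

In this model three different nodes each gain one range, so the replica count
is restored collectively rather than by any single node holding all of the
removed node's data.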

On Wed, Feb 22, 2017 at 12:45 PM, Anubhav Kale <Anubhav.Kale@microsoft.com.invalid> wrote:

> Hello,
>
> Recently, I started noticing an interesting pattern. When I execute
> "removenode", a subset of the nodes that now own the tokens experience
> a CPU spike and disk activity, and sometimes the SSTable count on those
> nodes shoots up.
>
> After looking through the code, it appears to me that the function below
> forces data to be streamed from some of the new nodes to the node where
> "removenode" was invoked. Is my understanding correct?
>
> https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
>
> Our nodes don't run very hot, but it appears this streaming causes
> them to have issues. If I understand the code correctly, the node
> that initiated removenode may still not get all the data for the moved
> ranges. So, what is the rationale behind trying to build a "partial replica"?
>
> Maybe I am not following this correctly, so I am hoping someone can explain.
>
> Thanks!
>
>