incubator-cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: Load balancing issue with virtual nodes
Date Mon, 28 Apr 2014 21:30:11 GMT
Hello all

 An update on the issue.

After completely wiping the sstable/commitlog/saved_caches folders and
restarting the cluster from scratch, we still see odd figures. After the
restart, nodetool status does not show an exact 50% balance of data
ownership for each node:


Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load      Tokens  Owns (effective)  Host ID                               Rack
UN  host1    48.57 KB  256     51.6%             d00de0d1-836f-4658-af64-3a12c00f47d6  rack1
UN  host2    48.57 KB  256     48.4%             e9d2505b-7ba7-414c-8b17-af3bbe79ed9c  rack1


As you can see, the percentages are very close to 50% but not exactly 50%.

 What can explain this? Could it be a network connection issue during the
initial token shuffle phase?
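For what it's worth, a near-but-not-exact split is what random vnode assignment produces by construction: each node picks num_tokens random positions on the ring, so effective ownership is only statistically balanced. Below is a minimal Python sketch of that effect (a hypothetical illustration, not Cassandra's actual code; the host names and seed are made up):

```python
# Hypothetical sketch (not Cassandra source): assign random vnode tokens
# to two nodes on the Murmur3 ring and compute each node's effective
# ownership, which lands near but rarely exactly at 50%.
import random

NUM_TOKENS = 256                       # matches num_tokens in cassandra.yaml
RING_MIN, RING_MAX = -2**63, 2**63 - 1
RING_SIZE = RING_MAX - RING_MIN + 1

random.seed(42)  # fixed seed, purely for reproducibility of this sketch

# Give each node NUM_TOKENS random token positions on the ring.
token_owner = {}
for node in ("host1", "host2"):
    for _ in range(NUM_TOKENS):
        token_owner[random.randint(RING_MIN, RING_MAX)] = node

# A token owns the range from the previous token (exclusive) up to itself
# (inclusive); the first token's range wraps around from the last token.
ring = sorted(token_owner)
owned = {"host1": 0, "host2": 0}
prev = ring[-1] - RING_SIZE
for tok in ring:
    owned[token_owner[tok]] += tok - prev
    prev = tok

for node in ("host1", "host2"):
    print(f"{node}: {100.0 * owned[node] / RING_SIZE:.1f}%")
```

With 256 vnodes per node the ownership fraction typically deviates from 50% by a percent or two, which is in the same ballpark as the 51.6% / 48.4% figures above.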

P.S.: both host1 and host2 are supposed to have exactly the same hardware.

Regards

 Duy Hai DOAN


On Thu, Apr 24, 2014 at 11:20 PM, Batranut Bogdan <batranub@yahoo.com> wrote:

> I don't know about hector but the datastax java driver needs just one ip
> from the cluster and it will discover the rest of the nodes. Then by
> default it will do a round robin when sending requests. So if Hector does
> the same, the pattern will again appear.
> Did you look at the size of the dirs?
> That documentation is for C* 0.8, so it is old, but depending on your boxes
> you might hit a CPU bottleneck. You might want to google for the write path
> in Cassandra. According to that, there is not much to do when writes come
> in...
>   On Friday, April 25, 2014 12:00 AM, DuyHai Doan <doanduyhai@gmail.com>
> wrote:
>  I did some experiments.
>
>  Let's say we have node1 and node2
>
> First, I configured Hector with node1 & node2 as hosts and I saw that only
> node1 has high CPU load
>
> To eliminate the "client connection" issue, I re-tested with only node2
> provided as host for Hector. Same pattern: CPU load is above 50% on node1
> and below 10% on node2.
>
> It means that node2 is acting as coordinator and forwarding many
> write/read requests to node1.
>
>  Why did I look at CPU load and not iostat et al.?
>
>  Because I have a very write-intensive workload with a read-only-once
> pattern. I've read here (
> http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
> that heavy writes in C* are more CPU-bound, but that info may be outdated
> and no longer true.
>
>  Regards
>
>  Duy Hai DOAN
>
>
On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <michael@pbandjelly.org> wrote:
>
> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>
>   Client used = Hector 1.1-4
>   Default Load Balancing connection policy
>   Both nodes' addresses are provided to Hector, so according to its
> connection policy the client should alternate between both
> nodes
>
>
> OK, so is only one connection being established to one node for one bulk
> write operation? Or are multiple connections being made to both nodes and
> writes performed on both?
>
> --
> Michael
>
