incubator-cassandra-user mailing list archives

From Philippe <watche...@gmail.com>
Subject Re: Network traffic patterns
Date Thu, 17 Nov 2011 08:30:29 GMT
Hi Todd,
Yes, all the machines have equal hardware. There is nearly no CPU usage and
there are no memory issues. Repairs run in tens of minutes, so I don't
understand why replication would be backed up.

Any other ideas?
On 17 Nov 2011 at 02:33, "Todd Burruss" <bburruss@expedia.com> wrote:

> Are all of your machines equal hardware?  Since those machines are sending
> data somewhere, maybe they are behind in replicating and are continuously
> catching up?
>
> Use a tool like tcpdump to find out where the data is going.
>
> From: Philippe <watcherfr@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tue, 15 Nov 2011 13:22:38 -0800
> To: user <user@cassandra.apache.org>
> Subject: Re: Network traffic patterns
>
> Sorry about the previous message, I've enabled keyboard shortcuts on
> gmail...*sigh*...
>
> Hello,
> I'm trying to understand the network usage I am seeing in my cluster. Can
> anyone shed some light?
> It's an RF=3, 12-node, Cassandra 0.8.6 cluster. Repair is performed on
> each node once a week, on a rolling schedule.
> The nodes are p13, p14, p15...p24 and are consecutive in that order on the
> ring. Each node runs only a Cassandra database. I am hitting the cluster from
> another server (p4).
>
> p4 is doing this with 20 threads in parallel:
>
>    1. read a lot of data (some columns for hundreds to tens of thousands
>    of keys, split into 512-key multigets)
>    2. process the data
>    3. write back a byte array to Cassandra (average size is 400 bytes)
>    4. go back to 1
>
> According to my munin graphs, network usage is about as follows. I am not
> surprised at the bias towards p13-p15 as p4 is getting & storing data
> mainly for keys located on one of those nodes.
>
>    - p4 : 1.5Mb/s in and out
>    - p13-p15 : 15Mb/s in and 80Mb/s out
>    - p16-p24 : 45Mb/s in and 5Mb/s out
>
> What I don't understand is why p4 is only seeing 1.5Mb/s while I see
> 80Mb/s out on p13-p15.
>
> The way I understand this:
>
>    - p4 makes a multiget to the cluster, electing to use any node in the
>    cluster as coordinator (IN traffic to describe the query)
>    - coordinator node replays the query on all 3 replicas (so 3 servers
>    each get the IN traffic, mostly p13-p15)
>    - each server replies to coordinator
>    - coordinator chooses matching values and sends back data to p4
>
> So if p13-p15 are outputting 80Mb/s, why am I not seeing 80Mb/s coming into
> p4, which is on the receiving end?
>
> Thanks
>
> 2011/11/15 Philippe <watcherfr@gmail.com>
>
>> Hello,
>> I'm trying to understand the network usage I am seeing in my cluster. Can
>> anyone shed some light?
>> It's an RF=3, 12-node, Cassandra 0.8.6 cluster. The nodes are
>> p13, p14, p15...p24 and are consecutive in that order on the ring.
>> Each node runs only a Cassandra database. I am hitting the cluster from
>> another server (p4).
>>
>> The pattern on p4 is to
>>
>>    1. read a lot of data (some columns for hundreds to tens of thousands
>>    of keys, split into 512-key multigets)
>>    2. process the data
>>    3. write back a byte array to Cassandra (average size is 400 bytes)
>>
>>
>> p4 reads as
>>
>
>
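
The bandwidth figures quoted above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it takes the three approximate per-node rates from the message at face value (units kept as written, Mb/s), assumes each node in a group runs at exactly the quoted rate, and uses hypothetical names (`REPORTED`, `group_totals`, `cluster_totals`, `unexplained_by_client`) that do not come from the thread.

```python
# Approximate rates read off the munin graphs in the quoted message.
# Units are kept as written there (Mb/s); the values are rough readings.
# Each entry: (node count, inbound rate per node, outbound rate per node).
REPORTED = {
    "p4": (1, 1.5, 1.5),
    "p13-p15": (3, 15.0, 80.0),
    "p16-p24": (9, 45.0, 5.0),
}


def group_totals(groups):
    """Return {group: (total_in, total_out)}, scaling per-node rates by node count."""
    return {name: (n * r_in, n * r_out) for name, (n, r_in, r_out) in groups.items()}


def cluster_totals(groups):
    """Sum inbound and outbound traffic over every group, client included."""
    per_group = group_totals(groups).values()
    total_in = sum(t_in for t_in, _ in per_group)
    total_out = sum(t_out for _, t_out in per_group)
    return total_in, total_out


def unexplained_by_client(groups, senders="p13-p15", client="p4"):
    """Outbound traffic from `senders` that the client cannot be receiving."""
    totals = group_totals(groups)
    return totals[senders][1] - totals[client][0]


if __name__ == "__main__":
    for name, (t_in, t_out) in group_totals(REPORTED).items():
        print(f"{name}: {t_in} in, {t_out} out")
    print("cluster totals (in, out):", cluster_totals(REPORTED))
    print("p13-p15 output not going to p4:", unexplained_by_client(REPORTED))
```

With these inputs, p13-p15 together send about 240 Mb/s while p4 receives only 1.5 Mb/s, so at least roughly 238.5 Mb/s of that output must be going to hosts other than p4. The summed totals (451.5 in versus 286.5 out) also do not balance, which suggests the quoted figures are rough or that some traffic is not captured by them. Neither number says where the data goes; a capture with a tool such as tcpdump, as suggested earlier in the thread, would be needed for that.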
