Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of watcherfr@gmail.com designates
 74.125.82.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAHwsXYm_cupdOfmB2n1ghZQ8S2jNDh4xzKtZn=RKt-jj5=DC5w@mail.gmail.com>
References: 
 <CAHwsXYm_cupdOfmB2n1ghZQ8S2jNDh4xzKtZn=RKt-jj5=DC5w@mail.gmail.com>
Date: Tue, 15 Nov 2011 22:22:38 +0100
Message-ID: 
 <CAHwsXYm9E+7R9DQ=-Umqn8pMJyb3zdk0gV60gT0iwAUbdvDS8w@mail.gmail.com>
Subject: Re: Network traffic patterns
From: Philippe <watcherfr@gmail.com>
To: user <user@cassandra.apache.org>
Content-Type: multipart/alternative; boundary=f46d0442834635060f04b1cc96eb

--f46d0442834635060f04b1cc96eb
Content-Type: text/plain; charset=ISO-8859-1

Sorry about the previous message, I've enabled keyboard shortcuts on
gmail...*sigh*...

Hello,
I'm trying to understand the network usage I am seeing in my cluster. Can
anyone shed some light?
It's an RF=3, 12-node Cassandra 0.8.6 cluster. Repair is performed on each
node once a week, on a rolling schedule.
The nodes are p13, p14, p15...p24 and are consecutive in that order on the
ring. Each node runs only Cassandra. I am hitting the cluster from
another server (p4).

p4 runs the following loop in 20 parallel threads:

   1. read a lot of data (some columns for hundreds to tens of thousands of
   keys, split into 512-key multigets)
   2. process the data
   3. write a byte array back to Cassandra (average size is 400 bytes)
   4. go back to 1
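
For reference, the loop above can be sketched roughly as follows. This is
only an illustration, not my actual client code: multiget, process and
write are hypothetical callables standing in for the real Cassandra calls,
and only the 512-key batch size comes from the description above.

```python
BATCH_SIZE = 512  # keys per multiget, as described above


def chunk(keys, size=BATCH_SIZE):
    """Split a list of keys into consecutive batches of at most `size`."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def run_once(keys, multiget, process, write):
    """One pass of the loop: read in batches, process, write back.

    multiget, process and write are hypothetical placeholders for the
    real client calls.
    """
    rows = {}
    for batch in chunk(keys):
        # Step 1: one multiget per 512-key batch.
        rows.update(multiget(batch))
    # Step 2: process everything that was read.
    result = process(rows)
    # Step 3: write the resulting byte array back.
    write(result)
    return result
```

Each of the 20 threads would call run_once repeatedly with its own set of
keys.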

According to my munin graphs, network usage is roughly as follows. I am not
surprised at the bias towards p13-p15, as p4 is reading and writing data
mainly for keys located on one of those nodes.

   - p4 : 1.5Mb/s in and out
   - p13-p15 : 15Mb/s in and 80Mb/s out
   - p16-p24 : 45Mb/s in and 5Mb/s out
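
As a quick sanity check, the figures above can be summed across the whole
setup. This is a sketch under two assumptions that the graphs do not
confirm: each figure is per node within its group, and all traffic stays
among these 13 machines.

```python
# (number of nodes, Mb/s in, Mb/s out), copied from the list above.
# Assumption: each rate applies to every node in the group.
GROUPS = {
    "p4": (1, 1.5, 1.5),
    "p13-p15": (3, 15, 80),
    "p16-p24": (9, 45, 5),
}


def totals(groups):
    """Return (total Mb/s in, total Mb/s out) summed over all nodes."""
    total_in = sum(n * rate_in for n, rate_in, _ in groups.values())
    total_out = sum(n * rate_out for n, _, rate_out in groups.values())
    return total_in, total_out


total_in, total_out = totals(GROUPS)
print(f"in: {total_in} Mb/s, out: {total_out} Mb/s")
```

On a closed network the two totals should match. With these rough numbers
they come to 451.5 Mb/s in and 286.5 Mb/s out, so the figures are
approximate at best.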

What I don't understand is why p4 is only seeing 1.5Mb/s while I see 80Mb/s
out on p13-p15.

The way I understand this:

   - p4 sends a multiget to the cluster, using any node in the cluster as
   coordinator (IN traffic on that node to describe the query)
   - the coordinator node replays the query on all 3 replicas (so 3 servers
   each get the IN traffic, mostly p13-p15)
   - each replica replies to the coordinator
   - the coordinator chooses the matching values and sends the data back
   to p4
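
To make that model concrete, here is a sketch of the bytes moved by one
read under the four steps above. It only encodes my understanding, not
necessarily what Cassandra does: it assumes the coordinator is not itself
a replica and that every replica returns the full data, and it ignores
consistency level and read repair settings. The function and field names
are made up for illustration.

```python
def read_traffic(query_bytes, reply_bytes, replicas=3):
    """Bytes per read under the four-step model described above.

    Assumptions: the coordinator is not one of the replicas, and each
    replica sends back a full reply of reply_bytes.
    """
    return {
        # p4 sends the query and receives one reply.
        "client_out": query_bytes,
        "client_in": reply_bytes,
        # The coordinator receives the query plus one reply per replica,
        # and sends the query to each replica plus one reply to p4.
        "coordinator_in": query_bytes + replicas * reply_bytes,
        "coordinator_out": replicas * query_bytes + reply_bytes,
        # Each replica receives the query and sends one reply.
        "replica_in": query_bytes,
        "replica_out": reply_bytes,
    }


print(read_traffic(query_bytes=100, reply_bytes=1000))
```

Under this model p4 receives as much as a single replica sends for every
read, which is why I expected its inbound traffic to be of the same order
as the outbound traffic on p13-p15.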

So if p13-p15 are each outputting 80Mb/s, why am I not seeing 80Mb/s coming
into p4, which is on the receiving end?

Thanks

2011/11/15 Philippe <watcherfr@gmail.com>

> Hello,
> I'm trying to understand the network usage I am seeing in my cluster, can
> anyone shed some light?
> It's an RF=3, 12-node, cassandra 0.8.6 cluster. The nodes are
> p13,p14,p15...p24 and are consecutive in that order on the ring.
> Each node is only a cassandra database. I am hitting the cluster from
> another server (p4).
>
> The pattern on p4 is the pattern is to
>
>    1. read a lot of data (some columns for hundreds to tens of thousands
>    of keys, split into 512-key multigets)
>    2. process the data
>    3. write back a byte array to cassandra (average size is 400 bytes)
>
>
> p4 reads as
>

--f46d0442834635060f04b1cc96eb--