Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: neutral (nike.apache.org: 98.139.253.105 is neither permitted
 nor denied by domain of jatyler@yahoo-inc.com)
From: Jason Tyler <jatyler@yahoo-inc.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
CC: Francois Richard <frichard@yahoo-inc.com>
Subject: nodetool move seems slow
Thread-Topic: nodetool move seems slow
Thread-Index: AQHPgDzJAfSAmz82EkCI/jHqjToEiw==
Date: Wed, 4 Jun 2014 21:34:37 +0000
Message-ID: <CFB4D6B2.1CCA2%jatyler@yahoo-inc.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/alternative;
	boundary="_000_CFB4D6B21CCA2jatyleryahooinccom_"
MIME-Version: 1.0

--_000_CFB4D6B21CCA2jatyleryahooinccom_
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable

Hello,

We have a 5-node cluster running Cassandra 1.2.16, with a significant amount
of data:


Address        Rack        Status State   Load            Owns                Token
                                                                              6783174585269344219
10.198.xx.xx1  rack1       Up     Normal  2.59 TB         60.00%              -9223372036854775808
10.198.xx.xx2  rack1       Up     Normal  1.49 TB         40.00%              -5534023222112865485
10.198.xx.xx3  rack1       Up     Normal  2.18 TB         53.23%              -1844674407370955162
10.198.xx.xx4  rack1       Up     Normal  2.86 TB         80.00%              5534023222112865484
10.198.xx.xx5  rack1       Up     Moving  2.32 TB         66.77%              6783174585269344219


The first three nodes (.xx1 - .xx3 above) were at the desired tokens, so I
issued a move on .xx4:

nodetool move 1844674407370955161
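
For reference, the target token above comes from spacing five tokens evenly
across the ring. A minimal sketch of that arithmetic, assuming the cluster uses
Murmur3Partitioner (the signed 64-bit tokens in the ring output suggest it does);
the function name balanced_tokens is only illustrative:

```python
# Assumption: Murmur3Partitioner, whose token range is [-2**63, 2**63 - 1].
def balanced_tokens(node_count):
    """Return evenly spaced tokens for node_count nodes."""
    step = 2**64 // node_count  # width of each node's slice of the ring
    return [-2**63 + i * step for i in range(node_count)]


for index, token in enumerate(balanced_tokens(5), start=1):
    print(f"node {index}: {token}")
```

The first three values match the tokens already held by .xx1 - .xx3, and the
fourth is the token passed to nodetool move.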


That was about 40hrs ago!


When I do nodetool netstats, I do see apparent progress:


jatyler@xx4:~$ nodetool netstats
Mode: MOVING
Not sending any streams.
Streaming from: /10.198.xx.xx2
   SyncCore: /var/cassandra/data/SyncCore/file-ic-31475-Data.db sections=1 progress=0/77699597 - 0%
...
   SyncCore: /var/cassandra/data/SyncCore/anotherFile-ic-32252-Data.db sections=1 progress=0/1254063427 - 0%
Read Repair Statistics:
Attempted: 8047367
Mismatch (Blocking): 97327
Mismatch (Background): 74369
Pool Name                    Active   Pending      Completed
Commands                        n/a         0      472255111
Responses                       n/a         1      749751322


I wrote 'apparent progress' because it reports "MOVING" and the Pending
Commands/Responses are changing over time.  However, I haven't seen the
individual .db files' progress go above 0%.
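
One way to watch this over time, rather than eyeballing each file, would be to
sum the per-file progress counters from the netstats output. A rough sketch,
assuming the "progress=received/total" line format shown above; the names
stream_totals and sample are only illustrative:

```python
import re

# Matches the per-file counters in `nodetool netstats` lines of the form
# "... sections=1 progress=0/77699597 - 0%".
PROGRESS_RE = re.compile(r"progress=(\d+)/(\d+)")


def stream_totals(netstats_output):
    """Return (bytes_received, bytes_expected) summed over all listed files."""
    received = expected = 0
    for match in PROGRESS_RE.finditer(netstats_output):
        received += int(match.group(1))
        expected += int(match.group(2))
    return received, expected


# Sample input copied from the two file lines quoted in this message.
sample = """
   SyncCore: /var/cassandra/data/SyncCore/file-ic-31475-Data.db sections=1 progress=0/77699597 - 0%
   SyncCore: /var/cassandra/data/SyncCore/anotherFile-ic-32252-Data.db sections=1 progress=0/1254063427 - 0%
"""
received, expected = stream_totals(sample)
print(f"{received} of {expected} bytes received")
```

Running this against successive netstats snapshots would show whether the
received total ever moves off zero.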

Meanwhile, the system appears to have plenty of unused disk bandwidth,
according to 'iostat -x -m 1':


Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    56.00 1338.00  171.00    57.59     0.89    79.36     0.57    0.38   0.17  25.30

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.77    1.82    2.35    0.20    0.00   72.86

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  785.00    0.00    33.80     0.00    88.17     0.27    0.35   0.18  14.10

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          20.16    2.05    2.22    0.20    0.00   75.37


Is 40 hours too long for this move?  Should I be seeing individual .db
files report more progress?  Should I start with the first box (even though
the token appears correct)?


Any thoughts would be greatly appreciated.

THX


Cheers,

~Jason
*******

--_000_CFB4D6B21CCA2jatyleryahooinccom_--