Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
From: Akmal Abbasov <akmal.abbasov@icloud.com>
Content-type: multipart/alternative;
 boundary="Apple-Mail=_E75F5258-5A06-4952-A6AD-9CA5E8BBA6E8"
Message-id: <DA2C80FC-104C-499F-AF4D-B8A7C3EF9B5E@icloud.com>
MIME-version: 1.0 (Mac OS X Mail 8.2 \(2104\))
Subject: Re: High iowait in idle hbase cluster
Date: Wed, 02 Sep 2015 21:11:05 +0200
References: <7AFF456D-B058-497F-B378-D7DA20B93263@icloud.com>
 <CALte62xBuQJvJw7yhFs_NJD=Ziwky0qP4dEYwpCeBugxaYtLZg@mail.gmail.com>
 <ED4EBE40-10B1-4F38-8647-E75A9481D823@icloud.com>
 <CALte62xJRRJtpkyYsryY8U=bn5PcSWMT6XBYVQrXBD-32jn4OQ@mail.gmail.com>
To: user@hadoop.apache.org
In-reply-to: 
 <CALte62xJRRJtpkyYsryY8U=bn5PcSWMT6XBYVQrXBD-32jn4OQ@mail.gmail.com>


--Apple-Mail=_E75F5258-5A06-4952-A6AD-9CA5E8BBA6E8
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8

Hi Ted,
I've checked when the addresses were changed, and this strange behaviour
started weeks before that.

Yes, 10.10.8.55 is a region server and 10.10.8.54 is an HBase master.
Any thoughts?

Thanks

> On 02 Sep 2015, at 18:45, Ted Yu <yuzhihong@gmail.com> wrote:
>
> bq. change the ip addresses of the cluster nodes
>
> Did this happen recently? If high iowait was observed after the change
> (you can look at ganglia graph), there is a chance that the change was
> related.
>
> BTW I assume 10.10.8.55 is where your region server resides.
>
> Cheers
>
> On Wed, Sep 2, 2015 at 9:39 AM, Akmal Abbasov <akmal.abbasov@icloud.com>
> wrote:
> Hi Ted,
> sorry, I forgot to mention
>
>> release of hbase / hadoop you're using
>
> hbase: hbase-0.98.7-hadoop2, hadoop: hadoop-2.5.1
>
>> were region servers doing compaction?
>
> I ran major compactions manually earlier today, but judging by
> compactionQueueSize they have already completed.
>
>> have you checked region server logs?
> The datanode logs are full of messages like this:
> 2015-09-02 16:37:06,950 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.54:32959, bytes: 19673, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1225374853_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848217327be, blockid: BP-329084760-10.32.0.180-1387281790961:blk_1075277914_1540222, duration: 7881815
>
> P.S. We had to change the IP addresses of the cluster nodes. Is that
> relevant?
>
> Thanks.
>
>> On 02 Sep 2015, at 18:20, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>> Please provide some more information:
>>
>> release of hbase / hadoop you're using
>> were region servers doing compaction?
>> have you checked region server logs?
>>
>> Thanks
>>
>> On Wed, Sep 2, 2015 at 9:11 AM, Akmal Abbasov <akmal.abbasov@icloud.com>
>> wrote:
>> Hi,
>> I'm seeing strange behaviour in an hbase cluster. It is almost idle,
>> with fewer than 5 puts and gets.
>> But the data in hdfs keeps growing, and the region servers have very
>> high iowait (>100, on a 2-core CPU).
>> iotop shows that the datanode process is reading and writing all the
>> time.
>> Any suggestions?
>>
>> Thanks.
>>
>
>
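
A rough way to see which clients account for the datanode traffic is to tally the clienttrace lines by operation and endpoint. The Python sketch below is not from the thread: the regular expression and the `summarize` helper are assumptions based only on the single log line quoted above, so other Hadoop versions may need a different pattern.

```python
import re
from collections import Counter

# Assumed pattern, derived from the one clienttrace line quoted in this thread.
CLIENTTRACE = re.compile(
    r"src: /(?P<src>[\d.]+):\d+, dest: /(?P<dest>[\d.]+):\d+, "
    r"bytes: (?P<bytes>\d+), op: (?P<op>\w+), cliID: (?P<client>[^,]+),"
)


def summarize(lines):
    """Return total bytes per (op, src, dest, client) over clienttrace lines."""
    totals = Counter()
    for line in lines:
        match = CLIENTTRACE.search(line)
        if match is None:
            continue  # skip anything that is not a clienttrace line
        key = (match["op"], match["src"], match["dest"], match["client"])
        totals[key] += int(match["bytes"])
    return totals


# The log line from this thread, joined back into a single line.
sample = (
    "2015-09-02 16:37:06,950 INFO "
    "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
    "src: /10.10.8.55:50010, dest: /10.10.8.54:32959, bytes: 19673, "
    "op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1225374853_1, offset: 0, "
    "srvID: ee7d0634-89a3-4ada-a8ad-7848217327be, blockid: "
    "BP-329084760-10.32.0.180-1387281790961:blk_1075277914_1540222, "
    "duration: 7881815"
)

# Largest consumers first.
for key, total in summarize([sample]).most_common():
    print(key, total)
```

Passing an open datanode log file to `summarize` instead of the sample list would show whether most of the bytes go to the master at 10.10.8.54 or to other clients.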


--Apple-Mail=_E75F5258-5A06-4952-A6AD-9CA5E8BBA6E8--