hadoop-common-user mailing list archives

From Akmal Abbasov <akmal.abba...@icloud.com>
Subject Re: High iowait in idle hbase cluster
Date Thu, 03 Sep 2015 12:23:56 GMT
Hi Ted,
No, short-circuit read is not configured.
The datanode logs on 10.10.8.55 are full of messages like the following:
2015-09-03 12:03:56,324 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 77, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1,
offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349331_1612273,
duration: 276448307
2015-09-03 12:03:56,494 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 538, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1,
offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349334_1612276,
duration: 60550244
2015-09-03 12:03:59,561 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 455, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1,
offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075351814_1614757,
duration: 755613819
There are more than 100,000 of them for today alone. The situation on the other region servers is similar.
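
To see which clients and operations account for these entries, the clienttrace lines can be tallied per destination. The following is a minimal Python sketch, not part of the original exchange: the function name `summarize` and the regular expression are illustrative assumptions derived only from the log format shown above, and the sample text is copied from the first two entries.

```python
import re
from collections import Counter

# Matches one clienttrace entry. \s* and DOTALL let an entry span
# several wrapped lines, as in the log excerpt above.
ENTRY = re.compile(
    r"clienttrace:\s*src:\s*/(?P<src>[\d.]+):\d+,\s*"
    r"dest:\s*/(?P<dest>[\d.]+):\d+,\s*"
    r"bytes:\s*(?P<bytes>\d+),\s*op:\s*(?P<op>\w+),"
    r".*?duration:\s*(?P<duration>\d+)",
    re.DOTALL,
)


def summarize(log_text):
    """Count entries and sum bytes per (destination IP, operation)."""
    reads = Counter()
    total_bytes = Counter()
    for m in ENTRY.finditer(log_text):
        key = (m["dest"], m["op"])
        reads[key] += 1
        total_bytes[key] += int(m["bytes"])
    return reads, total_bytes


# Two entries taken from the datanode log quoted above.
SAMPLE = """\
2015-09-03 12:03:56,324 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 77, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1,
offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349331_1612273,
duration: 276448307
2015-09-03 12:03:56,494 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 538, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1,
offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349334_1612276,
duration: 60550244
"""

if __name__ == "__main__":
    reads, total_bytes = summarize(SAMPLE)
    for key, count in reads.most_common():
        print(key, count, "entries,", total_bytes[key], "bytes")
```

Run against a full datanode log, a tally like this would show whether the reads all come from one client and how many bytes they move in total.
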
Node 10.10.8.53 is the hbase-master node, and the process on that port is also hbase-master.
So if there is no load on the cluster, why is so much IO happening?
Any thoughts?
Thanks.

> On 02 Sep 2015, at 21:57, Ted Yu <yuzhihong@gmail.com> wrote:
> 
> I assume you have enabled short-circuit read.
> 
> Can you capture region server stack trace(s) and pastebin them ?
> 
> Thanks
> 
> On Wed, Sep 2, 2015 at 12:11 PM, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
> Hi Ted,
> I’ve checked the time when the addresses were changed, and this strange behaviour started weeks before it.
> 
> Yes, 10.10.8.55 is a region server and 10.10.8.54 is an hbase master.
> Any thoughts?
> 
> Thanks
> 
>> On 02 Sep 2015, at 18:45, Ted Yu <yuzhihong@gmail.com> wrote:
>> 
>> bq. change the ip addresses of the cluster nodes
>> 
>> Did this happen recently ? If high iowait was observed after the change (you can look at ganglia graph), there is a chance that the change was related.
>> 
>> BTW I assume 10.10.8.55 is where your region server resides.
>> 
>> Cheers
>> 
>> On Wed, Sep 2, 2015 at 9:39 AM, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
>> Hi Ted,
>> Sorry, I forgot to mention:
>> 
>>> release of hbase / hadoop you're using
>> 
>> hbase hbase-0.98.7-hadoop2, hadoop hadoop-2.5.1
>> 
>>> were region servers doing compaction ?
>> 
>> I ran major compactions manually earlier today, but judging by compactionQueueSize they seem to have already completed.
>> 
>>> have you checked region server logs ?
>> The datanode logs are full of messages like this:
>> 2015-09-02 16:37:06,950 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace:
>> src: /10.10.8.55:50010, dest: /10.10.8.54:32959, bytes: 19673, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1225374853_1,
>> offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848217327be, blockid: BP-329084760-10.32.0.180-1387281790961:blk_1075277914_1540222,
>> duration: 7881815
>> 
>> P.S. We had to change the ip addresses of the cluster nodes. Is that relevant?
>> 
>> Thanks.
>> 
>>> On 02 Sep 2015, at 18:20, Ted Yu <yuzhihong@gmail.com> wrote:
>>> 
>>> Please provide some more information:
>>> 
>>> release of hbase / hadoop you're using
>>> were region servers doing compaction ?
>>> have you checked region server logs ?
>>> 
>>> Thanks
>>> 
>>> On Wed, Sep 2, 2015 at 9:11 AM, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
>>> Hi,
>>> I’m seeing strange behaviour in an hbase cluster. It is almost idle, with only <5 puts and gets.
>>> But the data in hdfs is increasing, and the region servers have very high iowait (>100, on a 2-core CPU).
>>> iotop shows that the datanode process is reading and writing all the time.
>>> Any suggestions?
>>> 
>>> Thanks.
>>> 
>> 
>> 
> 
> 

