From: Akmal Abbasov <akmal.abbasov@icloud.com>
Subject: Re: High iowait in idle hbase cluster
Date: Mon, 07 Sep 2015 14:15:35 +0200
To: user@hadoop.apache.org

While looking into this problem, I found that I have large dncp_block_verification.log.curr and dncp_block_verification.log.prev files.
They are 294G each on the node that has high iowait, even when the cluster was almost idle.
The other nodes have 0 for dncp_block_verification.log.curr, and <15G for dncp_block_verification.log.prev.
So it looks like https://issues.apache.org/jira/browse/HDFS-6114.
Thanks.
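
P.S. For anyone else hitting this: a rough way to check the block scanner log sizes on each datanode (the data directory path below is an assumption; substitute your dfs.datanode.data.dir):

  # data dir path is an assumption; substitute your dfs.datanode.data.dir
  du -sh /data/hdfs/dn/current/BP-*/dncp_block_verification.log.*

From what I understand of HDFS-6114, the usual workaround is to stop the affected datanode, remove both files, and start it again; the scanner recreates them. Treat this as a sketch, not a verified recipe:

  # workaround sketch per HDFS-6114: stop the DN before touching the files
  hadoop-daemon.sh stop datanode
  rm /data/hdfs/dn/current/BP-*/dncp_block_verification.log.*
  hadoop-daemon.sh start datanode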

On 04 Sep 2015, at 11:56, Adrien Mogenet <adrien.mogenet@contentsquare.com> wrote:

What is your disk configuration? JBOD? If RAID, possibly a dysfunctional RAID controller, or a constantly-rebuilding array.

Do you have any idea which files the read blocks belong to?

On 4 September 2015 at 11:02, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
Hi Adrien,
For the last 24 hours all RS have been up and running. There were no region transitions.
The overall cluster iowait has decreased, but 2 RS still have very high iowait, while there is no load on the cluster.
My assumption about the high number of HDFS_READ/HDFS_WRITE entries in the RS logs turned out to be wrong, since all RS have an almost identical number of HDFS_READ/HDFS_WRITE operations, while only 2 of them have high iowait.
According to iotop, the process doing most of the IO is the datanode, and it is reading constantly.
Why would the datanode need to read from disk constantly?
Any ideas?
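
A possible way to narrow this down further (a sketch; assumes jps, iotop and lsof are installed and you run them as root or the hdfs user):

  DN_PID=$(jps | awk '/DataNode/ {print $1}')   # pid of the datanode JVM
  iotop -b -o -p "$DN_PID" -n 5                 # confirm this pid is really doing the reads
  lsof -p "$DN_PID" | grep blk_                 # block files currently open

That should at least show whether it is scanning its own block files or serving some remote client.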

Thanks.

On 03 Sep 2015, at 18:57, Adrien Mogenet <adrien.mogenet@contentsquare.com> wrote:

Is the uptime of the RS "normal"? No quick, cluster-wide reboot that could have led to a region-reallocation storm?

On 3 September 2015 at 18:42, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
Hi Adrien,
I’ve run hdfs fsck and hbase hbck; hdfs is healthy and hbase is consistent.
I’m using the default replication factor, so it is 3.
There are some under-replicated blocks.
The HBase master (node 10.10.8.55) is constantly reading from the regionservers. Today alone it has sent >150,000 HDFS_READ requests to each regionserver so far, while the hbase cluster is almost idle.
What could cause this kind of behaviour?
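
For reference, the count above can be reproduced with a rough one-liner over the datanode clienttrace lines (the log path is an assumption; the format is quoted further down in this thread):

  # log path is an assumption; field 8 = dest host:port in the clienttrace format
  grep 'op: HDFS_READ' /var/log/hadoop/hadoop-hdfs-datanode-*.log | grep "$(date +%F)" | awk '{print $8}' | sort | uniq -c | sort -rn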

p.s. each node in the cluster has 2 cores and 4 GB RAM, just in case.

Thanks.


On 03 Sep 2015, at 17:46, Adrien Mogenet <adrien.mogenet@contentsquare.com> wrote:

Is your HDFS healthy (fsck /)?

Same for hbase hbck?

What's your replication level?

Can you see constant network use as well?

Anything that might be triggered by the HBase master? (something like a virtually dead RS due to a ZK race condition, etc.)

Your balancer run from 3 weeks ago shouldn't have any effect if you successfully ran a major compaction yesterday.
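
In concrete terms, something like this should cover those checks (dfs.replication is the standard key name; dstat is optional):

  hdfs fsck / | tail -n 30                # HDFS health summary
  hbase hbck                              # HBase consistency check
  hdfs getconf -confKey dfs.replication   # effective replication factor
  dstat -n 5                              # rough look at network throughput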

On 3 September 2015 at 16:32, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
I started the HDFS balancer, but then stopped it immediately after learning that it is not a good idea.
But that was around 3 weeks ago; is it possible that it is influencing the cluster behaviour I’m seeing now?
Thanks.

On 03 Sep 2015, at 14:23, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:

Hi Ted,
No, there is no short-circuit read configured.
The datanode logs on 10.10.8.55 are full of the following messages:
2015-09-03 12:03:56,324 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 77, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349331_1612273, duration: 276448307
2015-09-03 12:03:56,494 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 538, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075349334_1612276, duration: 60550244
2015-09-03 12:03:59,561 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.53:58622, bytes: 455, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-483065515_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848214397be, blockid: BP-439084760-10.32.0.180-1387281790961:blk_1075351814_1614757, duration: 755613819
There are >100,000 of them just for today. The situation on the other regionservers is similar.
Node 10.10.8.53 is the hbase-master node, and the process on that port is also the hbase-master.
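
As a sanity check, this is roughly how a client port such as 58622 (from the log lines above) can be mapped to a process; the ephemeral port changes per connection, so pick one from a fresh log line:

  netstat -tnp | grep ':58622'   # pid/program owning the connection (needs root)
  jps                            # map that pid to HMaster / HRegionServer / DataNode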
So if there is no load on the cluster, why is there so much IO happening?
Any thoughts?
Thanks.

On 02 Sep 2015, at 21:57, Ted Yu <yuzhihong@gmail.com> wrote:

I assume you have enabled short-circuit read.

Can you capture region server stack trace(s) and pastebin them?
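
For example, something along these lines (assumes the JDK's jps/jstack are on the PATH and you run them as the user owning the region server process):

  RS_PID=$(jps | awk '/HRegionServer/ {print $1}')
  jstack -l "$RS_PID" > rs-$(hostname)-$(date +%s).jstack   # repeat a few times, a few seconds apart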

Thanks

On Wed, Sep 2, 2015 at 12:11 PM, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
Hi Ted,
I’ve checked the time when the addresses were changed, and this strange behaviour started weeks before it.

Yes, 10.10.8.55 is a region server and 10.10.8.54 is the hbase master.
Any thoughts?

Thanks

On 02 Sep 2015, at 18:45, Ted Yu <yuzhihong@gmail.com> wrote:

bq. change the ip addresses of the cluster nodes

Did this happen recently? If high iowait was observed after the change (you can look at the ganglia graph), there is a chance that the change was related.

BTW I assume 10.10.8.55 is where your region server resides.

Cheers

On Wed, Sep 2, 2015 at 9:39 AM, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
Hi Ted,
Sorry, I forgot to mention:

release of hbase / hadoop you're using
hbase hbase-0.98.7-hadoop2, hadoop hadoop-2.5.1

were region servers doing compaction?
I’ve run major compactions manually earlier today, but it seems they have already completed, judging by the compactionQueueSize.
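
For completeness, one way to double-check that the queue is really empty is the region server metrics endpoint (60030 is assumed to be the default info port for 0.98; adjust if you changed it):

  curl -s http://localhost:60030/jmx | grep -i compactionQueue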

have you checked region server logs?
The datanode log is full of this kind of message:
2015-09-02 16:37:06,950 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.10.8.55:50010, dest: /10.10.8.54:32959, bytes: 19673, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_1225374853_1, offset: 0, srvID: ee7d0634-89a3-4ada-a8ad-7848217327be, blockid: BP-329084760-10.32.0.180-1387281790961:blk_1075277914_1540222, duration: 7881815

p.s. we had to change the ip addresses of the cluster nodes, is it relevant?

Thanks.

On 02 Sep 2015, at 18:20, Ted Yu <yuzhihong@gmail.com> wrote:

Please provide some more information:

release of hbase / hadoop you're using
were region servers doing compaction?
have you checked region server logs?

Thanks

On Wed, Sep 2, 2015 at 9:11 AM, Akmal Abbasov <akmal.abbasov@icloud.com> wrote:
Hi,
I’m seeing strange behaviour in the hbase cluster. It is almost idle, only <5 puts and gets.
But the data in hdfs is increasing, and the region servers have very high iowait (>100 on a 2-core CPU).
iotop shows that the datanode process is reading and writing all the time.
Any suggestions?
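
For context, a minimal way to reproduce these measurements on one node (iostat comes from the sysstat package):

  iostat -x 5 3      # %iowait and per-device utilisation, 3 samples 5s apart
  iotop -b -o -n 3   # which processes are actually doing the IO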

Thanks.

--
Adrien Mogenet
Head of Backend/Infrastructure
adrien.mogenet@contentsquare.com
(+33)6.59.16.64.22
http://www.contentsquare.com
50, avenue Montaigne - 75008 Paris
