Subject: Re: Application errors with one disk on datanode getting filled up to 100%
From: Mayank <mail2mayank@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 14 Jun 2013 16:39:02 +0530

No, as of this moment we have no idea about the reasons for that behavior.

On Fri, Jun 14, 2013 at 4:04 PM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:

> Thanks, Mayank. Any clue as to why only one disk was getting all the
> writes?
>
> Rahul
>
>
> On Thu, Jun 13, 2013 at 11:47 AM, Mayank wrote:
>
>> So we did a manual rebalance (following the instructions at:
>> http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the_disk.3F)
>> and also reserved 30 GB of space for non-DFS usage via
>> dfs.datanode.du.reserved, and restarted our apps.
>>
>> Things have been going fine till now.
>>
>> Keeping fingers crossed :)
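A minimal hdfs-site.xml sketch of the reservation step described above, for
reference; the byte value is an assumption (30 GB per volume expressed in
bytes), and the property applies to each configured data volume:

    <property>
        <name>dfs.datanode.du.reserved</name>
        <!-- assumption: 30 GB per volume, in bytes (30 * 1024^3) -->
        <value>32212254720</value>
        <description>Reserved space in bytes per volume for non-DFS use.</description>
    </property>

With four volumes per node, as in the cluster described later in the thread,
this keeps roughly 120 GB per datanode out of DFS's hands.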
>>
>>
>> On Wed, Jun 12, 2013 at 12:58 PM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:
>>
>>> I have a few points to make; they may not be very helpful for the said
>>> problem.
>>>
>>> + The "All datanodes are bad" exception does not, by itself, point to a
>>> problem of disk space being full.
>>> + hadoop.tmp.dir acts as the base location for other Hadoop-related
>>> properties; I am not sure whether any particular directory is created
>>> specifically.
>>> + Only one disk getting filled looks strange. The other disks are made
>>> part of the cluster when the NN is formatted.
>>>
>>> It would be interesting to know the reason for this.
>>> Please keep us posted.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>> On Mon, Jun 10, 2013 at 3:39 PM, Nitin Pawar wrote:
>>>
>>>> From the snapshot, you have around 3 TB left for writing data.
>>>>
>>>> Can you check each individual datanode's storage health?
>>>> You said you have 80 servers writing to HDFS in parallel; I am not
>>>> sure whether that could be an issue.
>>>> As suggested in past threads, you can do a rebalance of the blocks,
>>>> but that will take some time to finish and will not solve your issue
>>>> right away.
>>>>
>>>> You can wait for others to reply. I am sure there will be far better
>>>> solutions from experts for this.
>>>>
>>>>
>>>> On Mon, Jun 10, 2013 at 3:18 PM, Mayank wrote:
>>>>
>>>>> No, it's not a map-reduce job. We have a Java app running on around
>>>>> 80 machines which writes to HDFS. The error that I'd mentioned is
>>>>> being thrown by the application, and yes, we have the replication
>>>>> factor set to 3. Following is the status of HDFS:
>>>>>
>>>>> Configured Capacity               : 16.15 TB
>>>>> DFS Used                          : 11.84 TB
>>>>> Non DFS Used                      : 872.66 GB
>>>>> DFS Remaining                     : 3.46 TB
>>>>> DFS Used%                         : 73.3 %
>>>>> DFS Remaining%                    : 21.42 %
>>>>> Live Nodes                        : 10
>>>>> Dead Nodes                        : 0
>>>>> Decommissioning Nodes             : 0
>>>>> Number of Under-Replicated Blocks : 0
>>>>>
>>>>>
>>>>> On Mon, Jun 10, 2013 at 3:11 PM, Nitin Pawar wrote:
>>>>>
>>>>>> When you say the application errors out, does that mean your
>>>>>> mapreduce job is erroring? In that case, apart from HDFS space, you
>>>>>> will need to look at the mapred tmp directory space as well.
>>>>>>
>>>>>> You have roughly 400 GB * 4 disks * 10 nodes = 16 TB of raw disk,
>>>>>> and let's assume a replication factor of 3, so at most you can hold
>>>>>> about 5 TB of data (16 TB / 3 ~= 5.3 TB).
>>>>>> I am also assuming you are not scheduling your program to run on the
>>>>>> entire 5 TB with just 10 nodes.
>>>>>>
>>>>>> I suspect your cluster's mapred tmp space is getting filled while
>>>>>> the job is running.
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 10, 2013 at 3:06 PM, Mayank wrote:
>>>>>>
>>>>>>> We are running a Hadoop cluster with 10 datanodes and a namenode.
>>>>>>> Each datanode is set up with 4 disks (/data1, /data2, /data3,
>>>>>>> /data4), with each disk having a capacity of 414 GB.
>>>>>>>
>>>>>>> hdfs-site.xml has the following property set:
>>>>>>>
>>>>>>> <property>
>>>>>>>     <name>dfs.data.dir</name>
>>>>>>>     <value>/data1/hadoopfs,/data2/hadoopfs,/data3/hadoopfs,/data4/hadoopfs</value>
>>>>>>>     <description>Data dirs for DFS.</description>
>>>>>>> </property>
>>>>>>>
>>>>>>> Now we are facing an issue wherein we find /data1 getting filled up
>>>>>>> quickly, and many times we see its usage running at 100% with just
>>>>>>> a few megabytes of free space. This issue is visible on 7 out of 10
>>>>>>> datanodes at present.
>>>>>>>
>>>>>>> We have some Java applications which write to HDFS, and many times
>>>>>>> we see the following errors in our application logs:
>>>>>>>
>>>>>>> java.io.IOException: All datanodes xxx.xxx.xxx.xxx:50010 are bad. Aborting...
>>>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3093)
>>>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200(DFSClient.java:2586)
>>>>>>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2790)
>>>>>>>
>>>>>>> I went through some old discussions, and it looks like manual
>>>>>>> rebalancing is what is required in this case; we should also have
>>>>>>> dfs.datanode.du.reserved set up.
>>>>>>>
>>>>>>> However, I'd like to understand whether this issue, with one disk
>>>>>>> getting filled up to 100%, can result in the error which we are
>>>>>>> seeing in our application.
>>>>>>>
>>>>>>> Also, are there any other performance implications due to some of
>>>>>>> the disks running at 100% usage on a datanode?
>>>>>>>
>>>>>>> --
>>>>>>> Mayank Joshi
>>>>>>>
>>>>>>> Skype: mail2mayank
>>>>>>> Mb.: +91 8690625808
>>>>>>>
>>>>>>> Blog: http://www.techynfreesouls.co.nr
>>>>>>> PhotoStream: http://picasaweb.google.com/mail2mayank
>>>>>>>
>>>>>>> Today is tomorrow I was so worried about yesterday ...
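For context on where the exception above is raised: it comes from the HDFS
client's write pipeline (the DataStreamer thread visible in the trace). Below
is a hypothetical minimal sketch of the kind of write such an application
performs; the class name, path, and payload are illustrative, not taken from
the actual app:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsWriteSketch {
        public static void main(String[] args) throws Exception {
            // Assumes core-site.xml/hdfs-site.xml on the classpath point at the cluster.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // "All datanodes ... are bad. Aborting..." surfaces on writes like
            // this when the DataStreamer cannot rebuild a pipeline to any
            // remaining datanode, e.g. after the targeted volumes fill up.
            FSDataOutputStream out = fs.create(new Path("/tmp/example.txt"));
            try {
                out.writeUTF("hello hdfs");
            } finally {
                out.close(); // close() may also throw if the pipeline already failed
            }
        }
    }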
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Nitin Pawar
>>>>>
>>>>>
>>>>> --
>>>>> Mayank Joshi
>>>>>
>>>>> Skype: mail2mayank
>>>>> Mb.: +91 8690625808
>>>>>
>>>>> Blog: http://www.techynfreesouls.co.nr
>>>>> PhotoStream: http://picasaweb.google.com/mail2mayank
>>>>>
>>>>> Today is tomorrow I was so worried about yesterday ...
>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>
>>
>> --
>> Mayank Joshi
>>
>> Skype: mail2mayank
>> Mb.: +91 8690625808
>>
>> Blog: http://www.techynfreesouls.co.nr
>> PhotoStream: http://picasaweb.google.com/mail2mayank
>>
>> Today is tomorrow I was so worried about yesterday ...

--
Mayank Joshi

Skype: mail2mayank
Mb.: +91 8690625808

Blog: http://www.techynfreesouls.co.nr
PhotoStream: http://picasaweb.google.com/mail2mayank

Today is tomorrow I was so worried about yesterday ...