hadoop-common-user mailing list archives

From Siddharth Tiwari <siddharth.tiw...@live.com>
Subject RE: Huge disk IO on only one disk
Date Mon, 03 Mar 2014 05:50:09 GMT
Hi Brahma,
No I haven't. I have put a comma-separated list of disks in dfs.datanode.data.dir, and disk5 for hadoop.tmp.dir. My question is: should we set up hadoop.tmp.dir at all? If yes, what should the standards around it be?
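
(For reference, hadoop.tmp.dir is a single core-site.xml property that many default paths derive from; a sketch of pointing it at one disk, with the /disk5 path assumed purely for illustration:)

```xml
<!-- core-site.xml: hadoop.tmp.dir is the base path that several defaults
     (e.g. dfs/data, nm-local-dir) are built under when not set explicitly;
     /disk5/hadoop/tmp is an assumed mount point for illustration -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/disk5/hadoop/tmp</value>
</property>
```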

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."

"Maybe other people will try to limit me but I don't limit myself"


From: brahmareddy.battula@huawei.com
To: user@hadoop.apache.org
Subject: RE: Huge disk IO on only one disk
Date: Mon, 3 Mar 2014 05:14:34 +0000

It seems you started the cluster with default values for the following two properties and configured only hadoop.tmp.dir:

dfs.datanode.data.dir ---> file://${hadoop.tmp.dir}/dfs/data (default value)

>>>> Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all the named directories, typically on different devices.

yarn.nodemanager.local-dirs ---> ${hadoop.tmp.dir}/nm-local-dir (default value)

>>>> Used to store localized files (intermediate files).

Please configure the above two properties with multiple directories.
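
Both properties accept a comma-separated list of directories, so the change might look like the following sketch (the /data1 ... /data3 mount points are assumed for illustration; use one entry per physical disk):

```xml
<!-- hdfs-site.xml: spread DataNode block storage across disks;
     the /dataN mount points are assumed for illustration -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data1/dfs/data,/data2/dfs/data,/data3/dfs/data</value>
</property>

<!-- yarn-site.xml: spread NodeManager localized/intermediate files too -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data1/nm-local-dir,/data2/nm-local-dir,/data3/nm-local-dir</value>
</property>
```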


 

 

 


Thanks & Regards 

Brahma Reddy Battula

 





From: Siddharth Tiwari [siddharth.tiwari@live.com]

Sent: Monday, March 03, 2014 5:58 AM

To: Users Hadoop

Subject: Huge disk IO on only one disk






Hi Team,



I have 10 disks over which I am running HDFS, and hadoop.tmp.dir is configured on disk5. When I run my jobs I see huge IO on that disk compared to the others. Can you guide me to the standards to follow so that this IO is distributed across the other disks as well?
What should the standard be for setting up the hadoop.tmp.dir parameter?
Any help would be highly appreciated. Below is the IO while I am running a large job.








Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.11        37.65       226.20  313512628 1883809216
sdb               1.47        96.44       152.48  803144582 1269829840
sdc               1.45        93.03       153.10  774765734 1274979080
sdd               1.46        95.06       152.73  791690022 1271944848
sde               1.47        92.70       153.24  772025750 1276195288
sdf               1.55        95.77       153.06  797567654 1274657320
sdg              10.10       364.26      1951.79 3033537062 16254346480
sdi               1.46        94.82       152.98  789646630 1274014936
sdh               1.44        94.09       152.57  783547390 1270598232
sdj               1.44        91.94       153.37  765678470 1277220208
sdk               1.52        97.01       153.02  807928678 1274300360


