hadoop-common-user mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: 0.18.1 datanode pseudo deadlock problem
Date Mon, 12 Jan 2009 19:44:34 GMT
Sagar Naik wrote:
> Hi Raghu,
> 
> 
> The periodic "du" and block report threads thrash the disk (block 
> reports take about 21 minutes on average),
> 
> and I think all the datanode threads are unable to do much and freeze.

yes, that is the known problem we talked about in the earlier mails in 
this thread.

When you have millions of blocks, the default one-hour interval for du 
and block reports is too frequent. Maybe you could increase it to 
something like 6 or 12 hours.
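For example, these intervals are set in hadoop-site.xml. A sketch of what that might look like (property names assumed from the 0.18 codebase; values are in milliseconds and the 6-hour figure is just the suggestion above, not a recommendation):

```xml
<!-- hadoop-site.xml: stretch the periodic scans from the defaults
     (assumed property names; verify against your 0.18 hadoop-default.xml) -->
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>21600000</value> <!-- 6 hours instead of the default 1 hour -->
</property>
<property>
  <name>dfs.du.interval</name>
  <value>21600000</value> <!-- 6 hours between du runs -->
</property>
```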

That still does not fix the block report problem, since the DataNode 
does the scan in-line.

As I mentioned in earlier mails, we should really fix the block report 
problem itself. A simple fix would scan the directories (very slowly, 
unlike du) in a background thread.
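The background-scan idea could be sketched roughly like this. This is a hypothetical standalone sketch, not the actual DataNode code: the `blk_` file-name prefix, the class name, and the per-directory sleep throttle are all assumptions made for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: build the block report in a background thread,
// pausing between directories so the scan does not thrash the disk
// while regular datanode I/O is going on.
public class ThrottledBlockScanner implements Runnable {
    private final File root;
    private final long pauseMillis; // throttle between directories
    private volatile List<String> lastReport = new ArrayList<>();

    public ThrottledBlockScanner(File root, long pauseMillis) {
        this.root = root;
        this.pauseMillis = pauseMillis;
    }

    // Readers always see the last fully-built report, never a partial one.
    public List<String> getLastReport() { return lastReport; }

    @Override
    public void run() {
        List<String> blocks = new ArrayList<>();
        scan(root, blocks);
        lastReport = blocks; // swap in the finished report atomically
    }

    private void scan(File dir, List<String> blocks) {
        File[] entries = dir.listFiles();
        if (entries == null) return; // dir vanished or unreadable
        for (File f : entries) {
            if (f.isDirectory()) {
                scan(f, blocks);
                try {
                    // Throttle: sleep after each subdirectory instead of
                    // hammering the disk with one tight in-line scan.
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            } else if (f.getName().startsWith("blk_")) {
                blocks.add(f.getName()); // assumed block-file naming
            }
        }
    }
}
```

The key design point is that the in-line scan becomes an asynchronous one: heartbeats and client requests keep being served from the previously completed report while the slow scan runs in the background.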

Even after fixing block reports, you should be aware that an excessive 
number of blocks does impact performance. No system can guarantee 
performance when overloaded. What we want is for Hadoop to degrade 
gracefully rather than have DNs get killed.

Raghu.
