hadoop-common-dev mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: Very high CPU usage on data nodes because of FSDataset.checkDataDir() on every connect
Date Wed, 28 Mar 2007 18:27:11 GMT
hairong Kuang wrote:
> I agree that it is too expensive to call checkDir for every I/O operation.
> But checkDir is useful in the case where a disk suddenly becomes read-only.
> We have seen this happen before. But we definitely should revisit it,
> since a datanode can now manage multiple data directories and each
> data directory maintains multiple levels.
> Hairong

For hard disks and native filesystems, any one of many things can go
wrong. We cannot realistically check them all, and I don't think this
particular failure is any more probable than the others. It should be
handled just like any other issue we handle at the datanode. DFS is designed
to deal with flaky datanodes.
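
To make the trade-off concrete, here is a minimal sketch (not Hadoop code) of
one way to avoid a per-connect check: run the directory check only after an
I/O operation has already failed. The class LazyDirChecker and its methods
are hypothetical names invented for this illustration, and the check itself
(exists, readable, writable) is an assumption about what a directory check
might cover.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Hadoop: checks a data directory only
// after an I/O failure instead of on every connection.
final class LazyDirChecker {
    private final File dir;
    private int checks = 0; // number of directory checks actually performed

    LazyDirChecker(File dir) {
        this.dir = dir;
    }

    // Assumed health criteria: the path is a directory we can read and write.
    boolean isHealthy() {
        checks++;
        return dir.isDirectory() && dir.canRead() && dir.canWrite();
    }

    int checksPerformed() {
        return checks;
    }

    // An I/O operation that may fail.
    interface IoAction<T> {
        T run() throws IOException;
    }

    // Runs the action; the directory is inspected only if the action fails.
    <T> T perform(IoAction<T> action) throws IOException {
        try {
            return action.run();
        } catch (IOException e) {
            if (!isHealthy()) {
                // The directory itself looks bad: report that to the caller.
                throw new IOException("Data directory unusable: " + dir, e);
            }
            throw e; // directory looks fine; propagate the original failure
        }
    }
}
```

With this shape the successful path pays nothing for the check, and a disk
that has turned read-only is still noticed the first time a write against it
fails.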

