hadoop-common-user mailing list archives

From Gokulakannan M <gok...@huawei.com>
Subject RE: Tasktracker volume failure...
Date Tue, 26 Oct 2010 12:35:50 GMT

Yes, this is my scenario:

I have one tasktracker with 10 dirs (volumes) configured in
mapred.local.dir; each of these is a separately mounted volume, on
physically separate disks. If one of the volumes fails (in my case, one
physical hard disk was removed manually), the tasktracker stops executing
further tasks.
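
For reference, mapred.local.dir takes a comma-separated list, one directory
per disk; my entry looks roughly like this (the mount points below are
illustrative, not my actual paths):

    <!-- mapred-site.xml: one entry per physical disk.
         Mount points here are illustrative only. -->
    <property>
      <name>mapred.local.dir</name>
      <value>/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local</value>
    </property>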


I remember that a similar scenario is handled in the datanode: when one of
its volumes fails, it marks that volume as bad and proceeds with the rest
(ref: HDFS-457).
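
Roughly, the pattern there is something like this (a sketch only; the class
and method names are made up, this is not the actual datanode code):

    import java.io.File;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Sketch of the "mark bad volume and proceed" idea from HDFS-457.
    class VolumeSet {
        private final List<File> volumes = new ArrayList<File>();

        VolumeSet(List<File> configured) {
            volumes.addAll(configured);
        }

        // Drop any volume that is no longer a writable directory.
        synchronized void removeFailedVolumes() {
            for (Iterator<File> it = volumes.iterator(); it.hasNext();) {
                File v = it.next();
                if (!(v.isDirectory() && v.canWrite())) {
                    it.remove();   // mark bad: stop using this volume
                }
            }
        }

        // Fail only when *every* volume is gone, not on the first failure.
        synchronized File anyUsableVolume() {
            removeFailedVolumes();
            if (volumes.isEmpty()) {
                throw new IllegalStateException("all configured volumes failed");
            }
            return volumes.get(0);
        }
    }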

Is a similar fault-tolerance feature available for the tasktracker? Only
one of the "n" dirs has a problem, but it makes the TT keep retrying that
failed dir instead of executing any tasks.

-----Original Message-----
From: Steve Loughran [mailto:stevel@apache.org] 
Sent: Tuesday, October 26, 2010 3:52 PM
To: common-user@hadoop.apache.org
Subject: Re: Tasktracker volume failure...

On 26/10/10 04:10, Gokulakannan M wrote:
>
> Hi,
>
> I faced a problem: when a volume configured in *mapred.local.dir* fails,
> the tasktracker continuously tries to create the directory
> (checkLocalDirs()) and fails (even the main method periodically throws
> an exception due to the getFreeSpace() call on the failed volume).
> Eventually all the running jobs fail and new jobs cannot be executed.
>

I think you can provide a list of local dirs, in which case the TT would
only fail if there is no free local volume with enough space.
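
Something like this, roughly (a sketch only, not the TT's actual code):
walk the configured dirs, take the first writable one with enough free
space, and fail only when none qualifies:

    import java.io.File;
    import java.io.IOException;

    class LocalDirPicker {
        // Return the first configured dir that is writable and has at
        // least bytesNeeded free; throw only if none qualifies.
        static File pick(String[] localDirs, long bytesNeeded) throws IOException {
            for (String d : localDirs) {
                File dir = new File(d);
                if (dir.isDirectory() && dir.canWrite()
                        && dir.getUsableSpace() >= bytesNeeded) {
                    return dir;
                }
            }
            throw new IOException("no local dir with " + bytesNeeded + " bytes free");
        }
    }

Note that File.getUsableSpace() generally returns 0 for a path it cannot
stat, so a dead disk just falls through to the next entry here rather than
killing the loop.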

