hadoop-common-user mailing list archives

From Ulul <had...@ulul.org>
Subject Re: Reduce fails always
Date Mon, 06 Oct 2014 21:22:04 GMT
Hello

Did you check that you don't have a job.setNumReduceTasks(1); in your job
driver? You should also check the number of slots available on the jobtracker
web interface.
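
For reference, here is a minimal driver sketch (the class name WordCountDriver
and its structure are only an assumption, I don't know what your
chiu-wordcount2.jar actually contains). It leaves the reducer count to the
configuration, so a -D mapred.reduce.tasks=2 passed through ToolRunner takes
effect instead of a hard-coded value:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already carries any -D options parsed by ToolRunner.
        Job job = new Job(getConf(), "word count");
        job.setJarByClass(WordCountDriver.class);
        // Mapper, reducer, combiner and output key/value classes would be set here.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Do NOT hard-code job.setNumReduceTasks(1); let mapred.reduce.tasks decide.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options such as -D mapred.reduce.tasks=2.
        System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
    }
}

Note that generic options are only picked up when they appear before the
positional arguments, e.g.
hadoop jar chiu-wordcount2.jar WordCount -D mapred.reduce.tasks=2 /user/hduser/getty/file1.txt /user/hduser/getty/out10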

Ulul

On 06/10/2014 20:34, Abdul Navaz wrote:
> Hello,
>
> I have 8 datanodes, each with a storage capacity of only 3GB. I am
> running word count on a 1GB text file.
>
> Initially df -h shows 2.8GB free after the HDFS write. When shuffling
> starts, it keeps consuming the disk space of only one node. I think
> it is the reducer. Finally df -h shows 2MB. Why can't it just use the
> disk space of all 4 reducers?
>
>
>
> Thanks & Regards,
>
> Abdul Navaz
> Research Assistant
> University of Houston Main Campus, Houston TX
> Ph: 281-685-0388
>
>
> From: Rohith Sharma K S <rohithsharmaks@huawei.com>
> Reply-To: <user@hadoop.apache.org>
> Date: Monday, October 6, 2014 at 5:52 AM
> To: <user@hadoop.apache.org>
> Subject: RE: Reduce fails always
>
> Hi
>
> How much data is the wordcount job processing?
>
> What is the disk space ("df -h") available on the node where it
> always fails?
>
> The point I didn't understand is why it uses only one datanode's disk space.
>
> >> For reduce tasks, containers can be allocated on any node. I think
> one of the machines in your cluster has very low disk space, so
> whichever task runs on that particular node fails.
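>
> To see how much room is left under those local directories on a given
> node, here is a quick sketch you could run on each slave (the class name
> LocalDirSpaceCheck is made up; it assumes an MRv1 setup where
> intermediate map output and reduce-side merge files go under
> mapred.local.dir):
>
> import java.io.File;
> import org.apache.hadoop.mapred.JobConf;
>
> public class LocalDirSpaceCheck {
>     public static void main(String[] args) {
>         // JobConf also loads mapred-site.xml, where mapred.local.dir is usually set.
>         JobConf conf = new JobConf();
>         String localDirs = conf.get("mapred.local.dir");
>         if (localDirs == null) {
>             System.out.println("mapred.local.dir not set; check mapred-site.xml");
>             return;
>         }
>         // Intermediate task data is written under these directories on the
>         // node running the task, so this is the space that fills up.
>         for (String dir : localDirs.split(",")) {
>             File d = new File(dir.trim());
>             System.out.printf("%s  free: %d MB%n", d, d.getUsableSpace() / (1024L * 1024L));
>         }
>     }
> }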
>
> Thanks & Regards
>
> Rohith Sharma K S
>
> From: Abdul Navaz [mailto:navaz.enc@gmail.com]
> Sent: 06 October 2014 08:21
> To: user@hadoop.apache.org
> Subject: Reduce fails always
>
> Hi All,
>
> I am running a sample word count job on a 9-node cluster and I am
> getting the error message below.
>
> hadoop jar chiu-wordcount2.jar WordCount /user/hduser/getty/file1.txt 
> /user/hduser/getty/out10 -D mapred.reduce.tasks=2
>
> 14/10/05 18:08:45 INFO mapred.JobClient:  map 99% reduce 26%
>
> 14/10/05 18:08:48 INFO mapred.JobClient:  map 99% reduce 28%
>
> 14/10/05 18:08:51 INFO mapred.JobClient:  map 100% reduce 28%
>
> 14/10/05 18:08:57 INFO mapred.JobClient:  map 98% reduce 0%
>
> 14/10/05 18:08:58 INFO mapred.JobClient: Task Id : 
> attempt_201410051754_0003_r_000000_0, Status : FAILED
>
> FSError: java.io.IOException: No space left on device
>
> 14/10/05 18:08:59 WARN mapred.JobClient: Error reading task 
> outputhttp://pcvm1-10.utahddc.geniracks.net:50060/tasklog?plaintext=true&attemptid=attempt_201410051754_0003_r_000000_0&filter=stdout
>
> 14/10/05 18:08:59 WARN mapred.JobClient: Error reading task 
> outputhttp://pcvm1-10.utahddc.geniracks.net:50060/tasklog?plaintext=true&attemptid=attempt_201410051754_0003_r_000000_0&filter=stderr
>
> 14/10/05 18:08:59 INFO mapred.JobClient: Task Id : 
> attempt_201410051754_0003_m_000015_0, Status : FAILED
>
> FSError: java.io.IOException: No space left on device
>
> 14/10/05 18:09:02 INFO mapred.JobClient:  map 99% reduce 0%
>
> 14/10/05 18:09:07 INFO mapred.JobClient:  map 99% reduce 1%
>
> I can see it uses all the disk space on one of the datanodes when shuffling
> starts. As soon as the disk space on that node reaches zero, it throws
> this error and the job aborts. The point I don't understand is why it
> uses only one datanode's disk space. I changed the number of
> reducers to 4 and it still uses only one datanode's disk and throws the above error.
>
> How can I fix this issue?
>
> Thanks & Regards,
>
> Navaz
>

