hadoop-mapreduce-user mailing list archives

From Rohith Sharma K S <rohithsharm...@huawei.com>
Subject RE: Reduce fails always
Date Mon, 06 Oct 2014 10:52:21 GMT
Hi

How much data is the wordcount job processing?
What is the disk space ("df -h") available on the node where it always fails?
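That check can be scripted so it is easy to repeat on every node. This is a generic sketch (the 90% threshold is an assumption, not something from this thread); run it on each machine, e.g. over ssh, and pay particular attention to the mounts holding the mapred.local.dir directories:

```shell
#!/bin/sh
# Print every mounted filesystem whose usage is at or above a threshold
# (default 90%). The threshold is illustrative; pass another value as $1.
THRESHOLD="${1:-90}"
# df -P gives POSIX-stable columns: $5 is Capacity ("42%"), $6 is mount point.
df -P | awk -v t="$THRESHOLD" 'NR > 1 { if ($5 + 0 >= t) print $6 " " $5 }'
```

Running it with a threshold of 0 lists every filesystem; on the failing node the partition used for intermediate data should show close to 100% use while the shuffle is running.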

The point I didn't understand is why it uses only one datanode's disk space.
>>  For reduce tasks, containers can be allocated on any node. I think one of the machines in your cluster has very low disk space, so whichever task runs on that particular node fails.
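One common mitigation, offered here as an assumption rather than a confirmed diagnosis of this cluster: map outputs and reduce-side merge spills go to the local directories configured in mapred.local.dir, not to HDFS, so if that property points at a single small partition on the affected node, it will fill up no matter how many reducers are configured. Spreading it over several disks in mapred-site.xml can help (the paths below are illustrative; substitute real mount points):

```xml
<!-- mapred-site.xml: comma-separated list of local dirs, ideally on
     separate physical disks with enough free space for shuffle data. -->
<property>
  <name>mapred.local.dir</name>
  <value>/data1/mapred/local,/data2/mapred/local</value>
</property>
```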


Thanks & Regards
Rohith Sharma K S


From: Abdul Navaz [mailto:navaz.enc@gmail.com]
Sent: 06 October 2014 08:21
To: user@hadoop.apache.org
Subject: Reduce fails always

Hi All,

I am running a sample word count job on a 9-node cluster and I am getting the error
message below.


hadoop jar chiu-wordcount2.jar WordCount /user/hduser/getty/file1.txt /user/hduser/getty/out10
-D mapred.reduce.tasks=2

14/10/05 18:08:45 INFO mapred.JobClient:  map 99% reduce 26%

14/10/05 18:08:48 INFO mapred.JobClient:  map 99% reduce 28%

14/10/05 18:08:51 INFO mapred.JobClient:  map 100% reduce 28%

14/10/05 18:08:57 INFO mapred.JobClient:  map 98% reduce 0%

14/10/05 18:08:58 INFO mapred.JobClient: Task Id : attempt_201410051754_0003_r_000000_0, Status
: FAILED

FSError: java.io.IOException: No space left on device

14/10/05 18:08:59 WARN mapred.JobClient: Error reading task output http://pcvm1-10.utahddc.geniracks.net:50060/tasklog?plaintext=true&attemptid=attempt_201410051754_0003_r_000000_0&filter=stdout

14/10/05 18:08:59 WARN mapred.JobClient: Error reading task output http://pcvm1-10.utahddc.geniracks.net:50060/tasklog?plaintext=true&attemptid=attempt_201410051754_0003_r_000000_0&filter=stderr

14/10/05 18:08:59 INFO mapred.JobClient: Task Id : attempt_201410051754_0003_m_000015_0, Status
: FAILED

FSError: java.io.IOException: No space left on device

14/10/05 18:09:02 INFO mapred.JobClient:  map 99% reduce 0%

14/10/05 18:09:07 INFO mapred.JobClient:  map 99% reduce 1%


I can see it using up all the disk space on one of the datanodes when shuffling starts. As
soon as the disk space on that node reaches zero, it throws this error and the job aborts. The
point I don't understand is why it uses only one datanode's disk space. I have changed the
number of reducers to 4, and it still uses only one datanode's disk and throws the above error.


How can I fix this issue?


Thanks & Regards,

Navaz


