hadoop-common-user mailing list archives

From Mahmood Naderan <nt_mahm...@yahoo.com>
Subject Re: The reduce copier failed
Date Tue, 25 Mar 2014 10:15:18 GMT
It turned out to be a disk problem rather than a memory problem. I freed up more disk space and that fixed it.


On Saturday, March 22, 2014 8:58 PM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:
I am really stuck at this step. I have tested with a smaller data set and it works. Now I am using
Wikipedia articles (46GB) in 600 chunks (64MB each).

I have set the number of mappers and reducers to 1 to ensure consistency, and I am running on a
local node. Why doesn't the reducer report anything within 600 seconds?

14/03/22 15:00:51 INFO mapred.JobClient:  map 15% reduce 5%
14/03/22 15:18:43 INFO mapred.JobClient:  map 16% reduce 5%
14/03/22 15:46:38 INFO mapred.JobClient: Task Id : attempt_201403212248_0002_m_000118_0, Status
Task attempt_201403212248_0002_m_000118_0 failed to report status for 600 seconds. Killing!
14/03/22 15:48:54 INFO mapred.JobClient:  map 17% reduce 5%
14/03/22 16:06:32 INFO mapred.JobClient:  map 18% reduce 5%
14/03/22 16:07:08 INFO mapred.JobClient:  map 18% reduce 6%
14/03/22 16:24:09 INFO mapred.JobClient:  map 19% reduce 6%
14/03/22 16:41:58 INFO mapred.JobClient:  map 20% reduce 6%
14/03/22 16:55:13 INFO mapred.JobClient: Task Id : attempt_201403212248_0002_r_000000_0, Status
java.io.IOException: Task: attempt_201403212248_0002_r_000000_0 - The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for file:/tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201403212248_0002/attempt_201403212248_0002_r_000000_0/output/map_107.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2690)

attempt_201403212248_0002_r_000000_0: log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapred.Task).
attempt_201403212248_0002_r_000000_0: log4j:WARN Please initialize the log4j system properly.
14/03/22 16:55:15 INFO mapred.JobClient:  map 20% reduce 0%
14/03/22 16:55:34 INFO mapred.JobClient:  map 20% reduce 1%


On Saturday, March 22, 2014 10:27 AM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:
I got the same error again. It says:

The reducer copier failed
could not find any valid local directory for file /tmp/hadoop-hadoop/....map_150.out

Searching the web suggests that I have to clean up the /tmp/hadoop-hadoop folder, but the total
size of this folder is only 800KB across 1100 files. Does that really matter?
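
Editor's note: the size of the folder and the free space on the filesystem that holds it are two different numbers, and the "Could not find any valid local directory" error most likely concerns the second one. The following is a minimal Python sketch for comparing them, not part of the original thread. The helper names `dir_size_bytes` and `free_bytes` are hypothetical; the path is the one quoted in the error above.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file may disappear while a job is running.
                pass
    return total


def free_bytes(path):
    """Free bytes on the filesystem that holds path."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    local_dir = "/tmp/hadoop-hadoop"  # path taken from the error message
    if os.path.isdir(local_dir):
        print("used by folder :", dir_size_bytes(local_dir), "bytes")
        print("free on volume :", free_bytes(local_dir), "bytes")
```

A small folder on a nearly full volume would show a low first number and a low second number at the same time, which is the situation the error message points to.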


On Friday, March 21, 2014 3:52 PM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:
OK, it seems that there was a "free disk space" issue.
I freed up more space and am running the job again.


On Friday, March 21, 2014 11:43 AM, shashwat shriparv <dwivedishashwat@gmail.com> wrote:
Check whether the tmp dir, the HDFS remaining space, or the log directory is filling up while this
job runs.
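
Editor's note: one way to act on this advice is to check free space on the relevant local directories while the job is running. The sketch below is an illustration under stated assumptions, not part of the original thread. The function name `low_space_dirs`, the 5 GB threshold, and the `/var/log/hadoop` path are hypothetical; only `/tmp/hadoop-hadoop` comes from the thread. HDFS remaining capacity is not covered here; on Hadoop 1.x it can be read from `hadoop dfsadmin -report`.

```python
import shutil


def low_space_dirs(paths, min_free_bytes, usage=shutil.disk_usage):
    """Return (path, free_bytes) for each path whose filesystem has less
    free space than min_free_bytes. Paths that cannot be read are skipped."""
    low = []
    for path in paths:
        try:
            free = usage(path).free
        except OSError:
            # Missing or unreadable directory: nothing to report.
            continue
        if free < min_free_bytes:
            low.append((path, free))
    return low


if __name__ == "__main__":
    # Assumed locations; adjust to the actual hadoop.tmp.dir and log dir.
    watched = ["/tmp/hadoop-hadoop", "/var/log/hadoop"]
    threshold = 5 * 1024 ** 3  # hypothetical 5 GB floor
    for path, free in low_space_dirs(watched, threshold):
        print("low space:", path, free, "bytes free")
```

Running this periodically during the job (for example from a shell loop or cron) would show whether free space drops as map outputs accumulate.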

On Fri, Mar 21, 2014 at 12:11 PM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:

that imply a *retry* process? Or I have to be wo


Warm Regards_∞_
Shashwat Shriparv