hadoop-common-user mailing list archives

From Jun Young Kim <juneng...@gmail.com>
Subject Re: so many failures on reducers.
Date Tue, 03 May 2011 02:17:14 GMT
I am sure that the hadoop user is the same as my user :0.

my job takes too long to complete; that is the problem I have now.

there are so many failures at the reduce step, but hadoop tries to complete the job anyway:
it retries
and fails,
retries
and fails,
...
...

finally, I have to wait about 1 or 2 hours to see "SUCCESS" for my job.
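
(For reference, the number of attempts per task before the whole job is
failed is configurable. A minimal sketch, assuming the classic mapred
API; 4 is just the usual default, not a value from this thread:)

    import org.apache.hadoop.mapred.JobConf;

    public class RetryConfig {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Each reduce task is retried up to this many times before
            // the whole job is declared failed (the default is 4).
            conf.setMaxReduceAttempts(4);
            System.out.println(conf.get("mapred.reduce.max.attempts"));
        }
    }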

thanks, James.

Junyoung Kim (juneng603@gmail.com)


On 05/03/2011 10:56 AM, James Seigel wrote:
> Is mapreduce running as the hadoop user?  If so it can’t erase the files in tmp, which might be causing you some hilarity.
>
> :)
>
> J
>
>
> On 2011-05-02, at 7:43 PM, Jun Young Kim wrote:
>
>> Hi,
>>
>> To James.
>>
>> this is the permission of hadoop.tmp.dir.
>>
>> $>  ls -al
>> drwxr-xr-x 6 juneng juneng 4096 May 3 10:37 hadoop-juneng.2
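>>
>> (for reference, a minimal sketch for checking what hadoop.tmp.dir
>> actually resolves to at run time, and whether the current user can
>> write there; the class name is made up for illustration:)
>>
>>     import java.io.File;
>>     import org.apache.hadoop.conf.Configuration;
>>
>>     public class TmpDirCheck {
>>         public static void main(String[] args) {
>>             Configuration conf = new Configuration();
>>             // Effective value after core-site.xml is applied;
>>             // defaults to /tmp/hadoop-${user.name}.
>>             String tmp = conf.get("hadoop.tmp.dir");
>>             File dir = new File(tmp);
>>             System.out.println(tmp + " writable by "
>>                     + System.getProperty("user.name")
>>                     + ": " + dir.canWrite());
>>         }
>>     }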
>>
>>
>> To Harsh.
>> yes, our cluster has 96 reducer slots in total,
>> and my job uses 90 reduce tasks at a time to complete.
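>>
>> (a minimal sketch of how that reduce task count is set, assuming the
>> classic mapred API; 90 is the number from this thread:)
>>
>>     import org.apache.hadoop.mapred.JobConf;
>>
>>     public class ReduceCount {
>>         public static void main(String[] args) {
>>             JobConf conf = new JobConf();
>>             // 90 reduce tasks against the 96 reducer slots above.
>>             conf.setNumReduceTasks(90);
>>         }
>>     }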
>>
>> thanks for all.
>>
>> Junyoung Kim (juneng603@gmail.com)
>>
>>
>> On 05/02/2011 08:32 PM, James Seigel wrote:
>>> What are your permissions on your hadoop.tmp.dir ?
>>>
>>> James
>>>
>>> Sent from my mobile. Please excuse the typos.
>>>
>>> On 2011-05-02, at 1:26 AM, Jun Young Kim<juneng603@gmail.com>   wrote:
>>>
>>>> hi, all.
>>>>
>>>> I got so many failures on a reducing step.
>>>>
>>>> see this error.
>>>>
>>>> java.io.IOException: Failed to delete earlier output of task: attempt_201105021341_0021_r_000001_0
>>>>     at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:157)
>>>>     at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:173)
>>>>     at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:173)
>>>>     at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:133)
>>>>     at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:233)
>>>>     at org.apache.hadoop.mapred.Task.commit(Task.java:962)
>>>>     at org.apache.hadoop.mapred.Task.done(Task.java:824)
>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
>>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
>>>>     at org.apache.hadoop.mapred.C
>>>>
>>>>
>>>> this error started happening after I adopted the MultipleTextOutputFormat class in my job.
>>>> the job produces thousands of different output files on HDFS.
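>>>>
>>>> (for reference, a minimal sketch of the kind of
>>>> MultipleTextOutputFormat subclass involved; the key-to-path mapping
>>>> here is illustrative only, not my actual code:)
>>>>
>>>>     import org.apache.hadoop.io.Text;
>>>>     import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;
>>>>
>>>>     public class KeyBasedOutputFormat
>>>>             extends MultipleTextOutputFormat<Text, Text> {
>>>>         @Override
>>>>         protected String generateFileNameForKeyValue(Text key,
>>>>                 Text value, String name) {
>>>>             // One output directory per key; "name" is the default
>>>>             // part-NNNNN file name. Thousands of distinct keys
>>>>             // mean thousands of output files, as described above.
>>>>             return key.toString() + "/" + name;
>>>>         }
>>>>     }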
>>>>
>>>> can anybody guess the reason?
>>>>
>>>> thanks.
>>>>
>>>> --
>>>> Junyoung Kim (juneng603@gmail.com)
>>>>
