hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: The reduce copier failed
Date Thu, 20 Mar 2014 08:46:48 GMT
At the end it says clearly that the job has failed.
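
In the output quoted below, the "Job complete" line is followed by a "Failed reduce tasks" counter and then by the driver's "Job failed!" exception, so "Job complete" on its own should not be read as success. A minimal sketch of checking captured console output for this, assuming the output has been saved as text; the helper name summarize_job_log and the sample string are hypothetical and are not part of Hadoop or Mahout:

```python
import re

# Matches counter lines such as "Failed reduce tasks=1" in JobClient output.
FAILED_COUNTER = re.compile(r"Failed (map|reduce) tasks=(\d+)")


def summarize_job_log(log_text):
    """Return (job_failed, failed_task_counts) for captured console output.

    Hypothetical helper for illustration only.
    """
    failed_tasks = {}
    for kind, count in FAILED_COUNTER.findall(log_text):
        failed_tasks[kind] = failed_tasks.get(kind, 0) + int(count)
    # "Job complete" is deliberately ignored here: in the quoted output it
    # appears even though the driver then reports a failure.
    job_failed = "Job failed!" in log_text
    return job_failed, failed_tasks


# Three lines taken from the quoted output in this thread.
sample = (
    "14/03/20 03:49:20 INFO mapred.JobClient: Job complete: job_201403191916_0001\n"
    "14/03/20 03:49:20 INFO mapred.JobClient:     Failed reduce tasks=1\n"
    'Exception in thread "main" java.lang.IllegalStateException: Job failed!\n'
)

failed, counts = summarize_job_log(sample)
print(failed, counts)  # True {'reduce': 1}
```

A non-zero failed-task counter alone does not mean the job failed, since a failed attempt can be retried; the sketch treats only the driver's exception as the verdict.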

On Thu, Mar 20, 2014 at 12:49 PM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:
> After multiple messages, it says that the job has been completed. I really
> wonder if the job has been truly completed or failed.
>
> 14/03/20 03:49:04 INFO mapred.JobClient:  map 50% reduce 0%
> 14/03/20 03:49:20 INFO mapred.JobClient: Job complete: job_201403191916_0001
> 14/03/20 03:49:20 INFO mapred.JobClient: Counters: 20
> 14/03/20 03:49:20 INFO mapred.JobClient:   Job Counters
> 14/03/20 03:49:20 INFO mapred.JobClient:     Launched reduce tasks=4
> 14/03/20 03:49:20 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=121826447
> 14/03/20 03:49:20 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> 14/03/20 03:49:20 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 14/03/20 03:49:20 INFO mapred.JobClient:     Launched map tasks=357
> 14/03/20 03:49:20 INFO mapred.JobClient:     Data-local map tasks=357
> 14/03/20 03:49:20 INFO mapred.JobClient:     Failed reduce tasks=1
> 14/03/20 03:49:20 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=27097157
> 14/03/20 03:49:20 INFO mapred.JobClient:   FileSystemCounters
> 14/03/20 03:49:20 INFO mapred.JobClient:     HDFS_BYTES_READ=23648804348
> 14/03/20 03:49:20 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=4320784806
> 14/03/20 03:49:20 INFO mapred.JobClient:   File Input Format Counters
> 14/03/20 03:49:20 INFO mapred.JobClient:     Bytes Read=23648753804
> 14/03/20 03:49:20 INFO mapred.JobClient:   Map-Reduce Framework
> 14/03/20 03:49:20 INFO mapred.JobClient:     Map output materialized bytes=4300573634
> 14/03/20 03:49:20 INFO mapred.JobClient:     Combine output records=0
> 14/03/20 03:49:20 INFO mapred.JobClient:     Map input records=7131117
> 14/03/20 03:49:20 INFO mapred.JobClient:     Spilled Records=903190
> 14/03/20 03:49:20 INFO mapred.JobClient:     Map output bytes=4296978520
> 14/03/20 03:49:20 INFO mapred.JobClient:     Total committed heap usage (bytes)=62965284864
> 14/03/20 03:49:20 INFO mapred.JobClient:     Combine input records=0
> 14/03/20 03:49:20 INFO mapred.JobClient:     Map output records=903190
> 14/03/20 03:49:20 INFO mapred.JobClient:     SPLIT_RAW_BYTES=45981
> Exception in thread "main" java.lang.IllegalStateException: Job failed!
>     at org.apache.mahout.text.wikipedia.WikipediaDatasetCreatorDriver.runJob(WikipediaDatasetCreatorDriver.java:187)
>     at org.apache.mahout.text.wikipedia.WikipediaDatasetCreatorDriver.main(WikipediaDatasetCreatorDriver.java:115)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
>
>
> Regards,
> Mahmood
>
>
> On Thursday, March 20, 2014 3:41 AM, Harsh J <harsh@cloudera.com> wrote:
> While it does mean a retry, if the job eventually fails (because the
> finite number of retries all fail as well), then you have a problem to
> investigate. If the job eventually succeeds, then this may have been a
> transient issue. Worth investigating either way.
>
> On Thu, Mar 20, 2014 at 12:57 AM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:
>> Hi
>> In the middle of a map-reduce job I get
>>
>> map 20% reduce 6%
>> ...
>> The reduce copier failed
>> ....
>> map 20% reduce 0%
>> map 20% reduce 1%
>> map 20% reduce 2%
>> map 20% reduce 3%
>>
>>
>> Does that imply a *retry* process? Or do I need to worry about that
>> message?
>>
>> Regards,
>> Mahmood
>
>
>
>
> --
> Harsh J
>



-- 
Harsh J
