hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4296) Spasm of JobClient failures on successful jobs every once in a while
Date Thu, 16 Oct 2008 17:51:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640242#action_12640242
] 

dhruba borthakur commented on HADOOP-4296:
------------------------------------------

>make JobClient skip out of the loop when it detects the job as complete

I do not understand Vinod's comment, can you pl explain in greater detail? The client is trying
to determine if the job is complete. if the job is complete, of course the client exits. But
the problem if that the JobClient was unable to fetch the status of the job from the JT

> Spasm of JobClient failures on successful jobs every once in a while
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4296
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4296
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.1
>            Reporter: Joydeep Sen Sarma
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4296_jt_delayretire.patch, 4296_jt_delayretire2.patch
>
>
> At very busy times - we get a wave of job client failures all at the same time. the failures
come when the job is about to complete. when we look at the job history files - the jobs are
actually complete. Here's the stack:
> 08/09/27 02:18:00 INFO mapred.JobClient:  map 100% reduce 98%
> 08/09/27 02:18:41 INFO mapred.JobClient:  map 100% reduce 99% 
> java.lang.NullPointerException
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:993)
> 	at com.facebook.hive.common.columnSetLoader.main(columnSetLoader.java:535)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message