hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1920) Job.getCounters() returns null when using a cluster
Date Wed, 07 Jul 2010 22:38:53 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886136#action_12886136 ]

Aaron Kimball commented on MAPREDUCE-1920:

This is indeed the issue. Setting {{mapreduce.jobtracker.retirejobs}} to false allows things
to run correctly.

If I remove that setting, it fails again. I think this indicates that job retirement needs
some sort of delay. Otherwise the job client does not even print the counters to stdout
when the job finishes, which is an unexpected result.

What is the best option going forward? Some that I can think of:
* mapred-default.xml could enable the completed job store for 1 hour by default. Power users
could override this if they need to.
* We could add some code to delay job retiring for some minimum amount of time (10 minutes?).
* If the JobClient is still connected to the JT when the job finishes, the interaction could
be modified to locally cache a copy of the counters before the job is retired. Existing
references to the Job would then have a guaranteed instance of the Counters available.
* At the very least, {{Job.getCounters()}} needs a javadoc comment specifying that it
may return null. I think this is an incompatible change from 0.20. This suggestion is in addition
to any of the above three.
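
To make the third and fourth options concrete, here is a rough sketch of client-side caching combined with a null guard. It is not Hadoop code: {{CachedCounters}} is a hypothetical class, the {{Supplier}} stands in for a call such as {{Job.getCounters()}} that may start returning null once the job is retired, and a plain {{Map<String, Long>}} stands in for the real Counters type. The counter name is illustrative only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Hadoop: remembers the last non-null
// counter snapshot so callers never see null after the job is retired.
final class CachedCounters {
    // Stand-in for something like job::getCounters; may return null.
    private final Supplier<Map<String, Long>> source;
    // Last snapshot seen; empty until the source returns a non-null value.
    private Map<String, Long> lastSeen = Collections.emptyMap();

    CachedCounters(Supplier<Map<String, Long>> source) {
        this.source = source;
    }

    // Polls the source. A non-null result replaces the cached copy;
    // a null result leaves the previous copy in place.
    synchronized Map<String, Long> get() {
        Map<String, Long> fresh = source.get();
        if (fresh != null) {
            lastSeen = Collections.unmodifiableMap(new HashMap<>(fresh));
        }
        return lastSeen;
    }
}

final class CachedCountersDemo {
    public static void main(String[] args) {
        Map<String, Long> live = new HashMap<>();
        live.put("MAP_INPUT_RECORDS", 42L); // illustrative counter name
        boolean[] retired = {false};

        // Simulated tracker: serves counters until the job is "retired".
        CachedCounters counters =
            new CachedCounters(() -> retired[0] ? null : live);

        System.out.println("while running: " + counters.get());
        retired[0] = true; // the source now returns null
        System.out.println("after retire:  " + counters.get());
    }
}
```

The cache only helps if the client polls at least once while the job is still known to the JT; if the first poll happens after retirement, the caller gets an empty map rather than null. That is why this would complement, not replace, a retire delay or the completed job store.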

> Job.getCounters() returns null when using a cluster
> ---------------------------------------------------
>                 Key: MAPREDUCE-1920
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1920
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Aaron Kimball
>            Priority: Critical
> Calling Job.getCounters() after the job has completed (successfully) returns null.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
