hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4729) job history UI not showing all job attempts
Date Thu, 01 Nov 2012 14:41:15 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488729#comment-13488729 ]

Robert Joseph Evans commented on MAPREDUCE-4729:

The patch looks good to me. I like how we short-circuit the reading of the history file because
we know that what we want to read is in a single chunk. However, I am a bit conflicted about
where the parsing of the job history file is happening. I kind of feel that it should go
in the recovery service. In this case we are only doing a partial recovery, not a full recovery.
I can see that in the future we would want to add other things to the partial recovery. For
example, when/if we add a summary of the history file for fast reading, it would be nice to
propagate that to the next file no matter what happens, so that the history server can show a
full view of what has happened in the application. But I am fine with it the way it is, and if
you feel strongly the other way, feel free to check it in as is. +1.
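
For illustration only, the short-circuit read mentioned above could look roughly like the
sketch below. It assumes the AM-start records sit in one contiguous run in the history file,
so the reader can stop as soon as that run ends. The class, enum, and method names are
hypothetical and are not taken from the patch or from Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Hadoop classes.
final class AmInfoScan {
    // Simplified stand-ins for history event types (assumed names).
    enum EventType { JOB_SUBMITTED, AM_STARTED, TASK_STARTED, TASK_FINISHED }

    /**
     * Returns the positions of the contiguous run of AM_STARTED events and
     * stops iterating as soon as that run ends, instead of scanning the
     * remainder of the history.
     */
    static List<Integer> readAmStartIndexes(Iterable<EventType> events) {
        List<Integer> found = new ArrayList<>();
        boolean runBegan = false;
        int index = 0;
        for (EventType type : events) {
            if (type == EventType.AM_STARTED) {
                runBegan = true;
                found.add(index);
            } else if (runBegan) {
                break; // the single chunk is over, so skip the rest of the file
            }
            index++;
        }
        return found;
    }
}
```

Any AM_STARTED event that appears after the run has ended is deliberately ignored, which is
only correct if the single-chunk assumption holds.
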
> job history UI not showing all job attempts
> -------------------------------------------
>                 Key: MAPREDUCE-4729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4729
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>    Affects Versions: 0.23.3
>            Reporter: Thomas Graves
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: MAPREDUCE-4729-20121031.txt
> We are seeing a case where a job runs but the AM is running out of memory in the first
> 3 attempts. The job eventually finishes on the 4th attempt.  When you go to the job history
> UI for that job, it only shows the last attempt.  This is bad since we want to see why the
> first 3 attempts failed.
> The RM web ui shows all 4 attempts.
> Also I tested this locally by running "kill" on the app master and in that case the history
> server UI does show all attempts.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira