hadoop-mapreduce-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-6480) archive-logs tool may miss applications
Date Thu, 17 Sep 2015 00:12:45 GMT
Robert Kanter created MAPREDUCE-6480:

             Summary: archive-logs tool may miss applications
                 Key: MAPREDUCE-6480
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6480
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 2.8.0
            Reporter: Robert Kanter
            Assignee: Robert Kanter

MAPREDUCE-6415 added a tool to archive aggregated logs into HAR files.  It seeds the initial
list of applications to process from the apps that have finished aggregating, according to
the RM.  However, the RM doesn't remember completed applications forever (e.g. after a failover),
so the tool can miss applications that are no longer in the RM.

Instead, we should do the following:
# Seed the initial list of apps based on the aggregated log directories
# Make the RM not consider applications "complete" until their log aggregation has reached
a terminal state (i.e. DISABLED, SUCCEEDED, FAILED, TIME_OUT).  

#2 will allow #1 to assume that any apps not found in the RM are done aggregating.  #2 on
its own should cover most cases, though.
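
A minimal sketch of how the two steps could fit together when deciding which apps to process. This is not the tool's actual code: the class, the method name, and the AggStatus enum are hypothetical stand-ins (the enum only mirrors the state names listed above), and the list of app IDs is assumed to have already been read from the aggregated log directories.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of the archive-logs tool.
class ArchiveLogsSeedSketch {

    // Local stand-in for the RM's log aggregation status (assumed names).
    enum AggStatus { NOT_START, RUNNING, DISABLED, SUCCEEDED, FAILED, TIME_OUT }

    // The terminal states named in the proposal.
    static final Set<AggStatus> TERMINAL = EnumSet.of(
        AggStatus.DISABLED, AggStatus.SUCCEEDED, AggStatus.FAILED, AggStatus.TIME_OUT);

    /**
     * appsFromLogDirs: app IDs seeded from the aggregated log directories (#1).
     * rmStatus: aggregation status for the apps the RM still remembers.
     * An app missing from rmStatus is assumed to be done aggregating, which is
     * only safe if the RM keeps apps until aggregation is terminal (#2).
     */
    static List<String> eligibleApps(List<String> appsFromLogDirs,
                                     Map<String, AggStatus> rmStatus) {
        List<String> eligible = new ArrayList<>();
        for (String appId : appsFromLogDirs) {
            AggStatus status = rmStatus.get(appId);
            if (status == null || TERMINAL.contains(status)) {
                eligible.add(appId);
            }
        }
        return eligible;
    }
}
```

In this sketch an app that the RM still reports as RUNNING is skipped, while an app that has a log directory but is unknown to the RM is treated as ready to archive.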
