hadoop-mapreduce-issues mailing list archives

From "Greg Roelofs (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-2269) Rumen TopologyBuilder ignores hostname info in ReduceAttemptFinishedEvent
Date Sat, 15 Jan 2011 01:51:46 GMT
Rumen TopologyBuilder ignores hostname info in ReduceAttemptFinishedEvent

                 Key: MAPREDUCE-2269
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2269
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tools/rumen
    Affects Versions: 0.22.0
            Reporter: Greg Roelofs
            Priority: Minor

Rumen's TopologyBuilder component attempts to build up a view of a complete cluster over time
by processing many jobs' history files (per discussion with Dick King).  It appears to be
designed to take a greedy approach to this, pulling hostnames and rack info out of any JobHistory
events that have them.

In particular, it pulls split locations out of TaskStartedEvent and hostnames out of TaskAttemptUnsuccessfulCompletionEvent
(used for all task types) and TaskAttemptFinishedEvent (used only for setup and cleanup task
attempts).  It omits hostnames in TaskAttemptStartedEvents produced by map attempts (perhaps
intentionally, given the split info from TaskStartedEvents?) and in ReduceAttemptFinishedEvents
(apparently unintentionally).  The latter resulted in an empty topology and an ArrayIndexOutOfBoundsException
in a reduce-only unit test (TestTaskPerformanceSplitTranscription modified for an upcoming

I'm not sure if this is intended behavior or a bug; feel free to close if the former.  It
seemed like TaskAttemptFinishedEvent might have been mistakenly believed to cover REDUCE_ATTEMPT_FINISHED.
(If so, the fix to TopologyBuilder.java is trivial.)
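
To illustrate the suspected gap, here is a minimal sketch using simplified stand-in classes.  Everything
in it is hypothetical: the event classes only borrow the names of the real JobHistory events, and
TopologyCollector and its handleReduceFinished flag are illustrative, not TopologyBuilder's actual
code.  It shows how a dispatch that lacks a ReduceAttemptFinishedEvent branch ends up with no hosts
for a reduce-only job, and how one extra branch would change that.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class TopologySketch {
    // Hypothetical stand-ins for JobHistory events; NOT the real Rumen/Hadoop classes.
    interface HistoryEvent {}

    static final class TaskAttemptFinishedEvent implements HistoryEvent {
        final String hostname;
        TaskAttemptFinishedEvent(String hostname) { this.hostname = hostname; }
    }

    static final class TaskAttemptUnsuccessfulCompletionEvent implements HistoryEvent {
        final String hostname;
        TaskAttemptUnsuccessfulCompletionEvent(String hostname) { this.hostname = hostname; }
    }

    static final class ReduceAttemptFinishedEvent implements HistoryEvent {
        final String hostname;
        ReduceAttemptFinishedEvent(String hostname) { this.hostname = hostname; }
    }

    // Illustrative collector: greedily records hostnames from the events it knows about.
    static final class TopologyCollector {
        private final Set<String> hosts = new LinkedHashSet<String>();
        private final boolean handleReduceFinished; // assumption: toggles the proposed extra branch

        TopologyCollector(boolean handleReduceFinished) {
            this.handleReduceFinished = handleReduceFinished;
        }

        void process(HistoryEvent event) {
            if (event instanceof TaskAttemptFinishedEvent) {
                record(((TaskAttemptFinishedEvent) event).hostname);
            } else if (event instanceof TaskAttemptUnsuccessfulCompletionEvent) {
                record(((TaskAttemptUnsuccessfulCompletionEvent) event).hostname);
            } else if (handleReduceFinished && event instanceof ReduceAttemptFinishedEvent) {
                // The branch that appears to be missing in the behavior described above.
                record(((ReduceAttemptFinishedEvent) event).hostname);
            }
            // Any other event type is silently ignored.
        }

        private void record(String hostname) {
            if (hostname != null && !hostname.isEmpty()) {
                hosts.add(hostname);
            }
        }

        Set<String> hosts() { return hosts; }
    }

    public static void main(String[] args) {
        // A reduce-only job: the only hostname-bearing event is a reduce attempt finishing.
        HistoryEvent reduceDone = new ReduceAttemptFinishedEvent("host1.example.com");

        TopologyCollector without = new TopologyCollector(false);
        without.process(reduceDone);
        System.out.println("without reduce branch: " + without.hosts());

        TopologyCollector with = new TopologyCollector(true);
        with.process(reduceDone);
        System.out.println("with reduce branch:    " + with.hosts());
    }
}
```

In this sketch, a consumer that indexes into the collected hosts without checking for emptiness would
fail on the first collector, which is one plausible route to the exception seen in the unit test.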

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
