hadoop-mapreduce-dev mailing list archives

From "Jeffrey Naisbitt" <jnais...@yahoo-inc.com>
Subject Review Request: MAPREDUCE-2489 Jobsplits with random hostnames can make the queue unusable
Date Mon, 23 May 2011 15:57:46 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/776/
-----------------------------------------------------------

Review request for hadoop-mapreduce.


Summary
-------

We saw an issue where a custom InputSplit returned invalid, non-repeating hostnames for its
splits, which caused the JobTracker to attempt an excessive number of hostname resolutions.
This slowed the JobTracker down significantly. We should prevent invalid InputSplit
hostnames from affecting everyone else.

I propose that we verify the hostnames to help ensure that DNS lookups are only attempted
on valid hostnames (and fail otherwise). We could also fail the job after a certain number
of resolution failures.
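
As a rough illustration of the proposal above, here is a minimal sketch in Java. It is not the
code from the attached diff or from HADOOP-7314. The class name SplitHostValidator, the
method names, and the maxInvalidHosts threshold are all hypothetical, and the syntax rule is
an assumed RFC 1123 style check (letters, digits and hyphens per label). The idea is only to
show a cheap syntactic check that runs before any DNS lookup, plus a counter that fails the
job once too many split hostnames have been rejected.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class SplitHostValidator {
    // One DNS label: 1-63 chars, letters/digits/hyphens, no leading or trailing hyphen.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    private final int maxInvalidHosts; // assumed per-job threshold
    private int invalidCount = 0;

    SplitHostValidator(int maxInvalidHosts) {
        this.maxInvalidHosts = maxInvalidHosts;
    }

    // Purely syntactic check; performs no network access.
    // A trailing dot ("host.") is rejected here for simplicity.
    public static boolean isSyntacticallyValid(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // limit -1 keeps empty labels so "a..b" is rejected
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    // Returns true if the host may be handed to the resolver, false if it
    // should be skipped. Throws once more than maxInvalidHosts hostnames
    // have been rejected, so the caller can fail the job.
    public boolean accept(String host) {
        if (isSyntacticallyValid(host)) {
            return true;
        }
        invalidCount++;
        if (invalidCount > maxInvalidHosts) {
            throw new IllegalStateException(
                "Too many invalid split hostnames: " + invalidCount);
        }
        return false;
    }
}
```

A caller would create one validator per job and call accept() on each location reported by
a split before resolving it, so hostnames that cannot possibly resolve never reach DNS.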

NOTE: This requires the changes in HADOOP-7314


This addresses bug MAPREDUCE-2489.
    https://issues.apache.org/jira/browse/MAPREDUCE-2489


Diffs
-----

  trunk/ivy.xml 1125074 
  trunk/ivy/libraries.properties 1125074 
  trunk/src/contrib/mumak/src/java/org/apache/hadoop/mapred/SimulatorJobTracker.java 1125074
  trunk/src/java/org/apache/hadoop/mapred/JobInProgress.java 1125074 
  trunk/src/java/org/apache/hadoop/mapred/JobTracker.java 1125074 

Diff: https://reviews.apache.org/r/776/diff


Testing
-------


Thanks,

Jeffrey

