[ https://issues.apache.org/jira/browse/HADOOP-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601612#action_12601612 ] Jothi Padmanabhan commented on HADOOP-3478: ------------------------------------------- This is related to HADOOP-3327. The current design for Map re-execution on fetch failures (in shuffle) is to wait for 3 fetch failure notifications for a given map before re-executing it. For load balancing, the reducers randomize the order of maps to fetch. If a fetch for a particular map fails, there is no guarantee that this map would be attempted for fetch in the next iteration. What this implies is that if a particular location is faulty, it might take a long time to determine this and re-execute all the maps for this location. For example, consider the simplest case of 1 reducer, fetching maps M1,M2 and M3 from a single location L1. Assume that the location L1 developed a hardware failure after the map execution and so none of the maps, M1,M2 or M3 is available. The following could be the order of execution in the reducer 1. Try Fetching M1 from L1 --> Failure 2. Try Fetching M2 from L1 --> Failure 3. Try Fetching M3 from L1 --> Failure ==> At this point we have three fetch failures from Location L1, but they are for different Maps. So, the map re-execution criteria is not met. 4. Try Fetching M1 from L1 --> Failure 5. Try Fetching M2 from L1 --> Failure 6. Try Fetching M3 from L1 --> Failure ==> Now, there are six failures from L1, but still the maps are not re-executed. 7. Try Fetching M1 from L1 --> Failure ==> Now, M1 is re-executed 8. Try Fetching M2 from L1 --> Failure ==> Now, M2 is re-executed 9. Try Fetching M2 from L1 --> Failure ==> Now, M3 is re-executed A better alternative could be to do this. Force the order in which maps are executed in a particular location. A simple mechanism could be to sort by Map Ids (for that location) and then fetch them in that order. An additional heuristic \ could be added to hasten the re-execution of maps if the number of failures on a particular location exceeds some threshold (irrespective of the number of fetch failures for that particular map id). In the above example, let us assume that the sorted map Ids are in the order M1,M2 and M3. Now, considering the above example 1. Try Fetching M1 from L1 --> Failure 2. Try Fetching M1 from L1 --> Failure 3. Try Fetching M1 from L1 --> Failure ==> Re-execute Map M1 now. 4. Try Fetching M2 from L1 --> Failure 5. Try Fetching M2 from L1 --> Failure ==> Assuming the threshold for max failures from a location is 5, M2 will be re-executed now. 6. Try Fetching M3 from L1 --> Failure Re-execute M3 now. > The algorithm to decide map re-execution on fetch failures can be improved > -------------------------------------------------------------------------- > > Key: HADOOP-3478 > URL: https://issues.apache.org/jira/browse/HADOOP-3478 > Project: Hadoop Core > Issue Type: Improvement > Components: mapred > Reporter: Jothi Padmanabhan > > The algorithm to decide map re-execution on fetch failures can be improved. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.