hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3245) Provide ability to persist running jobs (extend HADOOP-1876)
Date Thu, 12 Jun 2008 19:30:46 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-3245:
-------------------------------

    Attachment: HADOOP-3245-v2.5.patch

Attaching a review patch implementing the design discussed above. Testing and optimizations
are in progress. There is one known issue with the design, and hence the patch is *incomplete*.

Consider the following case [Notation: JT@ti means JT (re)started at time ti, Ti@TTj means
task Ti completed on tracker TTj]:
1) TT1 asks for a task and the JT@t1 schedules map M1 on TT1
2) M1 finishes on TT1 and JT is updated
3) TT2 asks for a task and the JT@t1 schedules reduce R1 on TT2
4) R1 asks for map-completion events and gets M1@TT1
5) R1 adds M1@TT1 to the fetch list
6) JT@t1 restarts and comes up as JT@t2.
7) TT3 asks for a task and the JT@t2 schedules map M1 on TT3
8) M1 finishes on TT3 and M1@TT3 is added as map-completion-event
9) TT2 SYNCs up with JT@t2 and gets the map completion event
10) R1 gets M1@TT3 and ignores it since it already has M1@TT1 (see the sketch after this list).
11) TT1 goes down.
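
To make step 10 concrete, here is a minimal sketch of the reducer-side bookkeeping that
produces it. The names (KnownOutputs, addCompletionEvent) are hypothetical and this is not the
actual ReduceTask code; it only illustrates the dedup-by-map-id behavior assumed above:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the reducer keys known map outputs by map-task id,
// so a second completion event for the same map (M1@TT3) is silently dropped.
class KnownOutputs {
  // map-task id (e.g. "M1") -> tracker hosting its output (e.g. "TT1")
  private final Map<String, String> fetchList = new HashMap<String, String>();

  /** Returns false if the event is ignored because the map is already known. */
  boolean addCompletionEvent(String mapId, String tracker) {
    if (fetchList.containsKey(mapId)) {
      return false;                 // step 10: M1@TT3 is ignored, M1@TT1 is kept
    }
    fetchList.put(mapId, tracker);
    return true;
  }

  String locationOf(String mapId) {
    return fetchList.get(mapId);    // still TT1, even after TT1 goes down
  }
}
{code}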

The design prefers the old map-output location and silently ignores the new task-completion
event. In such a case R1 has missed the new event and will keep retrying at the old location.
Even though R1 will report fetch failures, they will be a _no-op_, since JT@t2 doesn't know
about M1@TT1.
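
The no-op on the JobTracker side can be pictured as follows. This is again a hedged sketch
with invented names (RestartedJobTracker, reportFetchFailure) rather than the real JobTracker
code, assuming the usual re-execute-after-N-fetch-failures policy:

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: after the restart, JT@t2 only tracks attempts it
// scheduled itself, so a failure report against M1@TT1 is dropped and the
// re-run-the-map path is never taken.
class RestartedJobTracker {
  private static final int MAX_FETCH_FAILURES = 3;   // illustrative threshold
  private final Set<String> knownAttempts = new HashSet<String>();       // e.g. "M1@TT3"
  private final Map<String, Integer> fetchFailures = new HashMap<String, Integer>();

  void reportFetchFailure(String attempt) {
    if (!knownAttempts.contains(attempt)) {
      return;                        // no-op: JT@t2 has no record of M1@TT1
    }
    Integer n = fetchFailures.get(attempt);
    int count = (n == null) ? 1 : n.intValue() + 1;
    fetchFailures.put(attempt, count);
    if (count >= MAX_FETCH_FAILURES) {
      reschedule(attempt);           // never reached for M1@TT1
    }
  }

  private void reschedule(String attempt) {
    // re-execute the map on another tracker (elided)
  }
}
{code}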

JT@t2 thinks that M1 is complete while R1@TT2 keeps waiting for the map-completion event of
M1, and hence the job will be stuck. Note that this is also true if TT1 rejoins after M1@TT3
completes, in which case JT@t2 will delete the output of M1@TT1. The following change might
help overcome this problem.

Let the reducers fetch data for the same map from multiple sources (i.e. R1 will keep fetching
data from M1@TT1 and also from M1@TT3). The one that finishes first invalidates the other. One
optimization is for the reducer to continue fetching from the old output (since the timestamp
is always there) and switch to the new event only once there is a failure at the old location
(i.e. keep M1@TT3 as a backup and keep fetching from M1@TT1 until that fails, after which it
switches to M1@TT3).
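
A minimal sketch of that backup-location variant, with hypothetical names
(FetchListWithBackup, onFetchFailure); the real change would live in the shuffle/fetch code:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed change: remember the newer location as
// a backup instead of dropping it, keep fetching from the old one, and fail
// over only when the old location actually errors out.
class FetchListWithBackup {
  private final Map<String, String> primary = new HashMap<String, String>();
  private final Map<String, String> backup  = new HashMap<String, String>();

  void addCompletionEvent(String mapId, String tracker) {
    if (!primary.containsKey(mapId)) {
      primary.put(mapId, tracker);   // first event wins, as before
    } else {
      backup.put(mapId, tracker);    // M1@TT3 kept as a backup, not dropped
    }
  }

  /** Called when a fetch from the primary location fails. */
  void onFetchFailure(String mapId) {
    String alt = backup.remove(mapId);
    if (alt != null) {
      primary.put(mapId, alt);       // switch to M1@TT3 after M1@TT1 fails
    }
  }

  String locationOf(String mapId) {
    return primary.get(mapId);
  }
}
{code}

The appeal of this variant is that the shuffle never stalls on the dedup decision: the old
location stays preferred while it works, and the newer output is only consulted once the old
one actually fails.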

----
Thoughts?

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3245-v2.5.patch
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be applied
> for things like jobs being able to survive jobtracker restarts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

