hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3245) Provide ability to persist running jobs (extend HADOOP-1876)
Date Tue, 05 Aug 2008 21:10:46 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Amar Kamat updated HADOOP-3245:

    Attachment: HADOOP-3245-v5.13.patch

Attaching a patch that implements JT restart using JobHistory.

_Changes :_
Currently the job history filename has the format history-timestamp_jt-hostname_jobid_username_jobname.
This format was introduced in HADOOP-239; the timestamp was added at the beginning because job
names were not unique. The history-timestamp prefix makes it difficult to guess the job history filename,
so it is removed, since the job-id is now unique across restarts.
So for now we define
master-file = jt-hostname_jobid_username_jobname
tmp-file = master-file.tmp
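
As a rough illustration of the master-file and tmp-file names defined above, here is a minimal Java sketch. The class {{HistoryFileNames}}, its method names, and the sample host/job/user values are made up for this example and are not taken from the patch.

```java
class HistoryFileNames {
    // Hypothetical helper: master-file = jt-hostname_jobid_username_jobname
    static String masterFile(String jtHostname, String jobId,
                             String username, String jobName) {
        return String.join("_", jtHostname, jobId, username, jobName);
    }

    // Hypothetical helper: tmp-file = master-file.tmp
    static String tmpFile(String masterFile) {
        return masterFile + ".tmp";
    }

    public static void main(String[] args) {
        // Sample values are illustrative only.
        String master = masterFile("jt-host", "job_200808052110_0001",
                                   "amar", "wordcount");
        System.out.println(master);
        System.out.println(tmpFile(master));
    }
}
```

Since the name no longer starts with a timestamp, anyone who knows these four fields can compute the filename directly instead of searching for it.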

_Working :_
0) Upon restart the JT enters _safe_ mode. In safe mode all the trackers are asked to resend/replay
their heartbeats.

1) For a new job, the history file is the _master-file_. For a restarted job, the history
is written to the _tmp_ file.

2) The following checks are made for a recovered job:
  2.1) If the master file exists, delete the tmp file.
  2.2) If the master file is missing, make the tmp file the master.

3) Upon restart the master-file is read and default-history-parser is used to parse and recover
history records. These records are used to create taskStatus objects, which are replayed in order.
Before replaying, the JT waits for the jobs to be inited.

4) Once the replay is over, the _master-file_ is deleted to indicate that the _tmp_ file is more
recent. Note that on the next restart the _tmp_ file will be used for recovery.

5) Once all the jobs are recovered, safe mode is turned off. The JT will now process heartbeats
(each one is called a successful re-connect). The registration window timer also starts. The JT waits
for _tracker-expiry-interval_ time after the _last-tracker-re-connect_ before closing the window.
Once the window closes, the JT is considered _recovered_. The window plays an important role in
detecting the trackers that went down while the JT was down. Upon _recovery_, the JT re-executes
all the tasks that were on the lost trackers.

6) Since the history can have some data missing, there can be a case where the _map-completion-event-list_
at the JT is smaller than the one at the tracker. Hence a rollback is required upon
restart. Once the JT is out of safe mode, it passes this information (_map-events-list-size_)
to the tracker on the successful reconnect.

7) The tasktracker rolls back a few events and asks the child tasks to reset their index to 0.
Child tasks fetch all the events again and filter out the events needed for further processing.
This is similar to the one discussed in approach #1.

8) Errors in history can cause the parser to fail. We have HADOOP-2403 to address this. For
now this patch encodes errors. This will be replaced with the fix in HADOOP-2403.

9) Currently counters are stringified and written to history. It is not possible to recover
the counters back from the string, and hence this patch encodes the counter-names so that they
can be easily recovered. Note that there is no encoding in the user space. Only the framework's
history file has codes.

10) Once the job finishes the _tmp_ file is renamed to _master-file_. Similarly the history
files in the user directory also follow the same renaming cycle.

11) Job priority is logged on every change and hence it is recovered.
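
The master/tmp file handling in steps 2, 4 and 10 above can be sketched roughly as follows. This is a minimal illustration on the local filesystem using {{java.nio.file}}, not the code in the patch; the class {{HistoryFileRecovery}} and all of its method names are hypothetical, and the sample filename is made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class HistoryFileRecovery {
    // tmp-file = master-file.tmp, kept in the same directory as the master file.
    static Path tmpFor(Path master) {
        return master.resolveSibling(master.getFileName().toString() + ".tmp");
    }

    // Steps 2.1 and 2.2: settle which file holds the history before recovery.
    static void prepareForRecovery(Path master) throws IOException {
        Path tmp = tmpFor(master);
        if (Files.exists(master)) {
            Files.deleteIfExists(tmp);   // 2.1: master exists, so drop the tmp file
        } else if (Files.exists(tmp)) {
            Files.move(tmp, master);     // 2.2: master missing, so promote the tmp file
        }
    }

    // Step 4: after replay, removing the master marks the tmp file as more recent.
    static void replayFinished(Path master) throws IOException {
        Files.deleteIfExists(master);
    }

    // Step 10: when a restarted job finishes, the tmp file becomes the master.
    static void jobFinished(Path master) throws IOException {
        Path tmp = tmpFor(master);
        if (Files.exists(tmp)) {
            Files.move(tmp, master, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("history");
        Path master = dir.resolve("jt-host_job_200808052110_0001_amar_wordcount");
        // Simulate a JT that went down after step 4: only the tmp file is left.
        Files.write(tmpFor(master), "records".getBytes());
        prepareForRecovery(master);
        System.out.println("master exists: " + Files.exists(master));
        System.out.println("tmp exists: " + Files.exists(tmpFor(master)));
    }
}
```

In this sketch, a JT that goes down in the middle of a replay leaves both files behind, and check 2.1 then discards the partially written tmp file in favour of the master on the next restart.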

_Issues :_
1) This approach/patch works fine with history on the local fs. With history on HDFS, the history
file becomes visible but not available (i.e. file-size = 0). The file becomes available only
on close(). The {{sync()}} documentation indicates that file-data availability is not guaranteed.

2) Detecting job runtime is still an issue. 

We are working on it.

_Todo :_
1) Refactor common code.
2) Remove extra logs.
3) For ease of testing, a JT killing facility is added to the web-ui. There is some extra code to
support this. Clear it out.
4) To test the usage of {{sync()}}, there are periodic syncs done to the history files. This
is just for testing.
5) Optimize encoding/decoding.
6) Group together all the recovery code under something like {{JobTrackerRecoveryManager}}.
Note that the logs/debugging-code/testing-code is still a part of this patch as I am testing

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3245-v2.5.patch, HADOOP-3245-v2.6.5.patch, HADOOP-3245-v2.6.9.patch,
> HADOOP-3245-v4.1.patch, HADOOP-3245-v5.13.patch
> This could probably extend the work done in HADOOP-1876. This feature can be applied
> for things like jobs being able to survive jobtracker restarts.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
