hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3245) Provide ability to persist running jobs (extend HADOOP-1876)
Date Wed, 28 May 2008 04:35:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600364#action_12600364 ]

Amar Kamat commented on HADOOP-3245:
------------------------------------

bq. III) The logic for detecting a lost TT should not rely on missing data structures but
should use some kind of bookkeeping. We can now use the 'missing data structures' logic for
detecting when the TT should SYNC. Note that detecting a TT as lost (missing TT details) is
different from declaring it as lost (10min gap in heartbeat).

These are two different cases:
1) _Lost TT_ will have _initial contact_ as *false* while the previous heartbeat will be
present.
2) _Restarted JT_ will have _initial contact_ as *false* while the previous heartbeat will
be missing.
Hence there is no need to fix the lost TT logic.
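
A minimal sketch of the distinction above, not the actual JobTracker code. The class, enum, and method names are hypothetical, the map stands in for whatever record the JT keeps of the previous heartbeat, and the handling of _initial contact_ = *true* is an assumption that the comment does not cover.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Hadoop.
class HeartbeatCheck {
    enum Outcome { REGISTER, KNOWN, SYNC }

    // Stand-in for the JT's record of the previous heartbeat per tracker.
    private final Map<String, Long> previousHeartbeat = new HashMap<>();

    Outcome classify(String tracker, boolean initialContact, long now) {
        boolean hasPrevious = previousHeartbeat.containsKey(tracker);
        previousHeartbeat.put(tracker, now);
        if (initialContact) {
            // Assumption: a tracker announcing itself is simply registered.
            return Outcome.REGISTER;
        }
        if (!hasPrevious) {
            // initial contact = false, previous heartbeat missing:
            // the JT has restarted, so the TT should SYNC.
            return Outcome.SYNC;
        }
        // initial contact = false, previous heartbeat present:
        // the existing lost-TT check (heartbeat gap) still applies here.
        return Outcome.KNOWN;
    }

    // Simulates a JT restart, which drops the in-memory heartbeat records.
    void restart() {
        previousHeartbeat.clear();
    }
}
```

In this sketch only the combination of _initial contact_ = *false* and a missing previous heartbeat triggers SYNC, which is why the lost-TT path is left untouched.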

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>             Fix For: 0.18.0
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be applied
> for things like jobs being able to survive jobtracker restarts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

