hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3245) Provide ability to persist running jobs (extend HADOOP-1876)
Date Mon, 21 Jul 2008 15:59:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615283#action_12615283 ]

Hemanth Yamijala commented on HADOOP-3245:
------------------------------------------

Using job history seems like a reasonable approach. Some concerns, though:

- We need to determine a good buffer size for writing to the history file. A small value
could hurt performance because of more frequent flushes. A large value could leave many
task events unflushed and hence unavailable to the JobTracker on restart.
We are exploring what an ideal value would be.
- For a large job with typical job history output, we need to make sure the time to parse
the history and reconstruct state is acceptable.
- We still need something like the SYNC operation described above, because events that are
written to job history but not yet flushed would be lost to the JT upon restart. So there
will need to be a way to tell the TTs to reset these events. However, the number of such
events will be much smaller than with the approach currently implemented.
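
To make the buffer-size trade-off concrete, here is a minimal sketch in Java. It is a
toy model, not Hadoop's JobHistory code: the class HistoryBufferSketch, its methods, the
event format, and the byte sizes are all hypothetical and chosen only for illustration.
It counts how many flushes a given buffer size causes and which events would still be
unflushed (and so would need a SYNC-style reset from the TTs) if the JT stopped right
after the last event.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a buffered history log; NOT Hadoop's JobHistory class. */
public class HistoryBufferSketch {
    private final int capacityBytes;                          // assumed buffer size
    private final List<String> buffered = new ArrayList<>();  // written, not yet flushed
    private final List<String> durable = new ArrayList<>();   // flushed to the history file
    private int bufferedBytes = 0;
    private int flushCount = 0;

    public HistoryBufferSketch(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Append one event line; flush first if it would not fit in the buffer. */
    public void log(String event) {
        int size = event.length() + 1; // ASCII assumed: one byte per char plus newline
        if (bufferedBytes + size > capacityBytes) {
            flush();
        }
        buffered.add(event);
        bufferedBytes += size;
    }

    /** Move everything buffered to the durable log. */
    public void flush() {
        if (buffered.isEmpty()) {
            return;
        }
        durable.addAll(buffered);
        buffered.clear();
        bufferedBytes = 0;
        flushCount++;
    }

    /** Events a restarted JT could reconstruct from the history file. */
    public List<String> recoverAfterCrash() {
        return new ArrayList<>(durable);
    }

    /** Events lost in the unflushed window; the TTs would have to reset these. */
    public List<String> lostOnCrash() {
        return new ArrayList<>(buffered);
    }

    public int flushCount() {
        return flushCount;
    }

    /** Log a fixed number of same-sized, made-up task events. */
    public static HistoryBufferSketch simulate(int capacityBytes, int events) {
        HistoryBufferSketch log = new HistoryBufferSketch(capacityBytes);
        for (int i = 1; i <= events; i++) {
            log.log(String.format("task_%06d SUCCEEDED", i));
        }
        return log;
    }

    public static void main(String[] args) {
        for (int capacity : new int[] {64, 65536}) {
            HistoryBufferSketch log = simulate(capacity, 1000);
            System.out.println("buffer=" + capacity
                    + " flushes=" + log.flushCount()
                    + " lostOnCrash=" + log.lostOnCrash().size());
        }
    }
}
```

In this model the small buffer flushes constantly but loses almost nothing, while the
large buffer never flushes and loses every event. The real numbers would depend on the
actual event sizes and on how the history file is written, which is what the tests
should tell us.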

We're doing some tests related to the first two points and then can discuss the results.

The completed task state in RAM is not introduced in this patch. I would recommend it be addressed
in another JIRA, if it is an issue.

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3245-v2.5.patch, HADOOP-3245-v2.6.5.patch, HADOOP-3245-v2.6.9.patch,
>                      HADOOP-3245-v4.1.patch
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be applied
> for things like jobs being able to survive jobtracker restarts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

