hadoop-common-dev mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3618) JobClient should keep on retrying if the jobtracker is still initializing
Date Tue, 24 Jun 2008 14:24:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607622#action_12607622 ]

Steve Loughran commented on HADOOP-3618:

I think it makes sense not to worry too much about making the sleep configurable. Jitter
is sometimes useful for dealing with large numbers of callers, though the only time we had
a big problem was with embedded hardware whose RNGs all initialised the same way. Randomness
is sometimes hard to find.
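
To make the jitter point concrete, here is a minimal sketch of a retry loop with a randomised sleep. It is written in Python for brevity and is not the code in HADOOP-3618.patch; the names StillInitializing, jittered_delay and submit_with_retry are hypothetical, as are the default values. The RNG is drawn from the OS entropy source by default, so callers that all boot at the same moment do not share a seed.

```python
import random
import time


class StillInitializing(Exception):
    """Hypothetical signal: the server is reachable but not ready yet."""


def jittered_delay(base_seconds, jitter_fraction, rng):
    """Scale base_seconds by a random factor in [1 - j, 1 + j]."""
    spread = rng.uniform(-jitter_fraction, jitter_fraction)
    return base_seconds * (1.0 + spread)


def submit_with_retry(submit, base_seconds=1.0, jitter_fraction=0.5,
                      max_attempts=10, rng=None, sleep=time.sleep):
    """Call submit() until it stops raising StillInitializing.

    Gives up and re-raises after max_attempts failed calls.
    """
    if rng is None:
        # OS entropy rather than a fixed or clock-based seed, so that
        # identical machines booting together do not retry in lockstep.
        rng = random.SystemRandom()
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except StillInitializing:
            if attempt == max_attempts:
                raise
            sleep(jittered_delay(base_seconds, jitter_fraction, rng))
```

With base_seconds=2.0 and jitter_fraction=0.5, each wait falls somewhere between 1 and 3 seconds, which spreads a burst of simultaneous callers over a window instead of having them all return at once.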

When a full cluster reboots, it's the data nodes that come up first; their boot time depends
on the state of their disks. The name node ought to come up faster if it's a RAID5 FS, but
as it has to do playback it will take a while to go live. What happens to the job and task
trackers in this situation? Will they just sit around? Because if we aren't saving a job list
over a cluster crash, there won't be a big set of jobs trying to get restarted, unless there
are external clients hitting the site hard.

> JobClient should keep on retrying if the jobtracker is still initializing
> -------------------------------------------------------------------------
>                 Key: HADOOP-3618
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3618
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3618.patch
> When the user submits the job while the jobtracker is still initializing, the jobclient
> comes out with an exception. Ideally the jobclient should keep on retrying until the jobtracker
> is up and ready. This will also take care of HADOOP-3289.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
