hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4664) Parallelize job initialization
Date Thu, 05 Mar 2009 04:31:56 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679067#action_12679067 ]

Amar Kamat commented on HADOOP-4664:

A few comments:
# I feel Matei's implementation is simpler and does not involve the overhead of adding an
extra thread. Tom, can you please explain why {{ThreadPoolExecutor}} should be used?
# In {{JobInProgress.initTasks()}}, the first DFS call is made via {{JobHistory.logSubmitted()}}.
This method can also block on a DFS call made in {{JobHistory.getJobHistoryFileName()}}, thus blocking
all the other threads. Hence there is a corner case where all the threads will be blocked
(on {{JobHistory}}). These are the synchronized APIs that might block on a DFS call:
 ##  {{JobHistory.getJobHistoryFileName()}} within {{JobHistory.logSubmitted()}}
 ##  {{JobTracker.finalizeJob()}} and {{JobHistory.finalizeRecovery()}} within {{JobTracker.finalizeJob()}}
# All the APIs invoked from {{JobInProgress.initTasks()}} should be made thread safe. For example,
we should document that {{JobTracker.resolveAndAddToTopology()}} should be thread safe. The following
APIs should be made thread safe:
|JobHistory|logSubmitted() / logInited() / logFinished() / logFailed() / logJobPriority()|openJobs|
|JobTracker|storeCompletedJob()|completedJobStatusStore(looks at store() etc)|
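
The blocking corner case described above can be illustrated with a small Java sketch. Everything in it is hypothetical: {{HistoryLockSketch}}, {{slowDfsLookup()}} and the two {{logSubmitted*}} variants are stand-ins written for this example, not the real {{JobHistory}} code. It contrasts a synchronized method that holds its lock across a slow call with one that takes the lock only around the shared-map update.

```java
import java.util.HashMap;
import java.util.Map;

class HistoryLockSketch {
    // Hypothetical shared state, loosely modeled on the "openJobs" entry in the table.
    private final Map<String, String> openJobs = new HashMap<>();

    /** Simulates a slow DFS round trip (assumption: about 100 ms). */
    static String slowDfsLookup(String jobId) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "history/" + jobId;
    }

    /** Coarse locking: the lock is held during the slow call, so callers run one at a time. */
    synchronized void logSubmittedCoarse(String jobId) {
        openJobs.put(jobId, slowDfsLookup(jobId));
    }

    /** Narrow locking: the slow call runs unlocked; the lock guards only the map update. */
    void logSubmittedNarrow(String jobId) {
        String file = slowDfsLookup(jobId);
        synchronized (this) {
            openJobs.put(jobId, file);
        }
    }

    synchronized int openJobCount() {
        return openJobs.size();
    }

    /** Runs one call per thread concurrently and returns the elapsed time in milliseconds. */
    static long timeCalls(HistoryLockSketch history, int threads, boolean coarse) {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            final String jobId = "job_" + i;
            workers[i] = new Thread(() -> {
                if (coarse) {
                    history.logSubmittedCoarse(jobId);
                } else {
                    history.logSubmittedNarrow(jobId);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("coarse ms: " + timeCalls(new HistoryLockSketch(), 4, true));
        System.out.println("narrow ms: " + timeCalls(new HistoryLockSketch(), 4, false));
    }
}
```

With four threads, the coarse variant takes roughly four times the simulated lookup latency, while the narrow variant stays close to a single lookup. Whether the real methods can be restructured this way depends on what else their locks protect.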

 Hey, can you please check whether there are other such APIs?
In the future we might want to associate a timer with each thread. We really don't want 3 out of
4 threads to be blocked for 1 hour on DFS operations. But for now I think it's a premature step.
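
For reference, one way such a per-thread timer could look in plain Java is a bounded wait on the task's {{Future}}. This is only a sketch under assumptions: {{InitTimeoutSketch}}, {{runWithTimeout()}} and {{sleeper()}} are hypothetical names, and the waiting happens on the calling thread rather than on a dedicated timer.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class InitTimeoutSketch {

    /**
     * Submits the task and waits at most timeoutMs for it.
     * Returns true if it finished in time; otherwise cancels it and returns false.
     */
    static boolean runWithTimeout(ExecutorService pool, Runnable task, long timeoutMs) {
        Future<?> future = pool.submit(task);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker; this only frees the thread
            // if the task actually responds to interruption.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false; // the task itself failed
        }
    }

    /** Hypothetical task that stands in for a blocking DFS operation. */
    static Runnable sleeper(long ms) {
        return () -> {
            try {
                Thread.sleep(ms);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop promptly when cancelled
            }
        };
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        System.out.println("fast finished: " + runWithTimeout(pool, sleeper(10), 1000));
        System.out.println("slow finished: " + runWithTimeout(pool, sleeper(2000), 100));
        pool.shutdownNow();
    }
}
```

A caveat with this approach: interruption is cooperative, so a task stuck in a call that ignores interrupts would keep its pool thread even after the timeout fires.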

> Parallelize job initialization
> ------------------------------
>                 Key: HADOOP-4664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4664
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Matei Zaharia
>            Assignee: Jothi Padmanabhan
>            Priority: Blocker
>             Fix For: 0.20.0
>         Attachments: hadoop-4664-v1.patch, parallel-job-init-v1.patch
> The job init thread currently initializes one job at a time. However, this is a lengthy
> and partly IO-bound process because all of the job's block locations need to be resolved through
> the namenode and a map of them needs to be built. It can take tens of seconds. As a result,
> the cluster sometimes initializes jobs too slowly for full utilization to be achieved, if
> there are many small jobs queued up. It would be better to have a pool of threads that initialize
> multiple jobs in parallel. One thing to be careful of, however, is not causing deadlocks or
> holding locks for too long in these threads.
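
The pool of initializer threads proposed in the description could be sketched in Java roughly as follows. This is an illustration under assumptions, not either attached patch: {{ParallelJobInitSketch}}, its nested {{Job}} class and {{initAll()}} are hypothetical, and the 50 ms sleep merely stands in for the namenode lookups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelJobInitSketch {

    /** Hypothetical stand-in for a job; this is not Hadoop's JobInProgress. */
    static class Job {
        final String id;
        volatile boolean initialized = false;

        Job(String id) {
            this.id = id;
        }

        /** Simulates slow, IO-bound initialization such as block-location lookups. */
        void initTasks() throws InterruptedException {
            Thread.sleep(50);
            initialized = true;
        }
    }

    /** Initializes every job on a fixed-size pool and returns how many finished. */
    static int initAll(List<Job> jobs, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Job job : jobs) {
                // The lambda returns a value, so it is submitted as a Callable
                // and is allowed to throw the checked InterruptedException.
                pending.add(pool.submit(() -> {
                    job.initTasks();
                    return null;
                }));
            }
            int done = 0;
            for (Future<?> f : pending) {
                f.get(); // wait for this job's initialization to complete
                done++;
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            jobs.add(new Job("job_" + i));
        }
        System.out.println("initialized " + initAll(jobs, 4) + " jobs");
    }
}
```

The sketch keeps each job's state private to the task that initializes it, which sidesteps the locking concerns raised in the description; real initialization code that touches shared structures would need its own synchronization.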

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
