hadoop-common-dev mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-4664) Parallelize job initialization
Date Mon, 17 Nov 2008 06:14:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648085#action_12648085 ]

matei edited comment on HADOOP-4664 at 11/16/08 10:12 PM:
------------------------------------------------------------------

In some initial testing of this patch on a jobtracker with a lot of old history files, I found
that the lock in JobHistory on getJobHistoryFileName and recoverJobHistoryFile was causing
most of the threads to block while one thread listed the directory, leading to no improvement.
However, Amar Kamat explained that HADOOP-4372 will help solve this issue. I'll wait on that
before trying to modify things myself. The patch provided here should still help when the
job init phase is limited more by CPU than by the history file scanning and creation.
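
The blocking effect described above can be reproduced in isolation. The sketch below is
not the JobHistory code: the class name, the HISTORY_LOCK object and the sleep that stands
in for a directory listing are all hypothetical. It only shows that when every worker must
hold one shared lock for its slow step, at most one worker makes progress at a time, no
matter how large the pool is.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from Hadoop.
class HistoryLockContentionSketch {
    // Assumed stand-in for a single lock guarding all history file operations.
    private static final Object HISTORY_LOCK = new Object();

    /**
     * Runs `tasks` simulated directory scans on `threads` workers and returns
     * the largest number of scans that were in progress at the same moment.
     */
    static int peakConcurrentScans(int threads, int tasks, boolean holdSharedLock)
            throws InterruptedException {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                if (holdSharedLock) {
                    // Every worker queues up behind the same lock.
                    synchronized (HISTORY_LOCK) {
                        scan(active, peak);
                    }
                } else {
                    scan(active, peak);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return peak.get();
    }

    // Simulated slow scan; records how many scans overlap.
    private static void scan(AtomicInteger active, AtomicInteger peak) {
        int now = active.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20); // stands in for listing a large directory
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            active.decrementAndGet();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak with shared lock: " + peakConcurrentScans(4, 8, true));
        System.out.println("peak without lock:     " + peakConcurrentScans(4, 8, false));
    }
}
```

With the shared lock held, the peak is always 1, which matches the "no improvement"
observation. Without it, the peak can rise up to the pool size, depending on scheduling.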

      was (Author: matei):
    In some initial testing of this patch on a job with a lot of old history files, I found
that the lock in JobHistory on getJobHistoryFileName and recoverJobHistoryFile was causing
most of the threads to block while one thread listed the directory, leading to no improvement.
However, Amar Kamat explained that HADOOP-4372 will help solve this issue. I'll wait on that
before trying to modify things myself. The patch provided here should still help when the
job init phase is limited more by CPU than by the history file scanning and creation.
  
> Parallelize job initialization
> ------------------------------
>
>                 Key: HADOOP-4664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4664
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Matei Zaharia
>         Attachments: parallel-job-init-v1.patch
>
>
> The job init thread currently initializes one job at a time. However, this is a lengthy
> and partly IO-bound process because all of the job's block locations need to be resolved through
> the namenode and a map of them needs to be built. It can take tens of seconds. As a result,
> the cluster sometimes initializes jobs too slowly for full utilization to be achieved, if
> there are many small jobs queued up. It would be better to have a pool of threads that initialize
> multiple jobs in parallel. One thing to be careful of, however, is not causing deadlocks or
> holding locks for too long in these threads.
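
As a rough illustration of the proposal above, and not the contents of
parallel-job-init-v1.patch, the following sketch initializes several jobs on a fixed-size
thread pool. The Job class, its init() method and initAll() are hypothetical stand-ins;
the sleep simulates resolving block locations. The slow step runs outside any lock, and a
lock is held only for the short moment needed to record the finished job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; none of these names come from Hadoop.
class ParallelJobInitSketch {
    /** Assumed stand-in for a queued job that still needs initialization. */
    static final class Job {
        final String id;
        private volatile boolean initialized = false;

        Job(String id) { this.id = id; }

        // Simulates the slow, partly IO-bound initialization step.
        void init() throws InterruptedException {
            Thread.sleep(50);
            initialized = true;
        }

        boolean isInitialized() { return initialized; }
    }

    /** Initializes all jobs on `numThreads` workers; returns ids in completion order. */
    static List<String> initAll(List<Job> jobs, int numThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        List<String> ready = new ArrayList<>();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Job job : jobs) {
                pending.add(pool.submit(() -> {
                    job.init();                 // slow work, no lock held
                    synchronized (ready) {      // lock held only to publish the result
                        ready.add(job.id);
                    }
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get();                        // rethrows any failure from init()
            }
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(ready);
    }

    public static void main(String[] args) throws Exception {
        List<Job> jobs = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            jobs.add(new Job("job_" + i));
        }
        List<String> ready = initAll(jobs, 3);
        System.out.println("initialized " + ready.size() + " jobs: " + ready);
    }
}
```

Keeping the synchronized block this small is one way to respect the caution about holding
locks for too long; a real implementation would also need a consistent lock ordering to
avoid deadlocks.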

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

