hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-5659) Fair share scheduler may support preemption only with a specific pool
Date Sun, 12 Apr 2009 23:43:14 GMT
Fair share scheduler may support preemption only with a specific pool

                 Key: HADOOP-5659
                 URL: https://issues.apache.org/jira/browse/HADOOP-5659
             Project: Hadoop Core
          Issue Type: Improvement
          Components: contrib/fair-share
            Reporter: dhruba borthakur

There is a set of jobs that helps keep cluster resources optimally used. For
example, there are data sets that are made up of multiple files in a directory. These part-xxx
files could be concatenated into relatively few files (to reduce memory pressure on the namenode).
Also, there are files that could be compressed more efficiently (e.g. with bzip2) to save
on disk usage. These are a kind of system-wellcare job that should run only if it does not
impact any other "real" user of the cluster. On an idle cluster, these wellcare jobs should
use all available system resources. When a real user submits a job, the wellcare job(s) should
be pre-empted. If the scheduler can support pre-emption only for jobs in a specified pool, then
I can submit these wellcare jobs to that special pool. Real users' jobs will never get pre-empted, but
the wellcare jobs can get pre-empted as soon as there is resource contention. If a task of
a wellcare job is pre-empted more than a configured maximum number of times, the entire wellcare
job will fail; this is the behaviour I want. The wellcare jobs would run in idle slots as long as all
user-submitted jobs have been satisfied, but would be pre-empted as soon as user jobs require
any of those slots.
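
The requested behaviour can be sketched as a small toy model. This is only an
illustration of the policy described above, not the fair scheduler's code or API:
the class names, the pool name "wellcare", the slot counts and the max_preemptions
setting are all assumptions made up for the example, and tasks never finish on
their own (a job leaves the cluster only through finish()).

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    pool: str
    num_tasks: int
    running: set = field(default_factory=set)            # task ids holding a slot
    preempt_counts: dict = field(default_factory=dict)   # task id -> times preempted
    failed: bool = False

    def pending(self):
        """Task ids that still need a slot (none once the job has failed)."""
        if self.failed:
            return []
        return [t for t in range(self.num_tasks) if t not in self.running]


class PoolPreemptingScheduler:
    """Toy scheduler: only jobs in `preemptable_pool` can ever lose a slot."""

    def __init__(self, total_slots, preemptable_pool, max_preemptions):
        self.total_slots = total_slots
        self.preemptable_pool = preemptable_pool   # hypothetical pool name
        self.max_preemptions = max_preemptions     # hypothetical configured max
        self.jobs = []

    def free_slots(self):
        return self.total_slots - sum(len(j.running) for j in self.jobs)

    def submit(self, job):
        self.jobs.append(job)
        self.schedule()

    def finish(self, job):
        """Remove a completed job and hand its slots to whoever is waiting."""
        self.jobs.remove(job)
        self.schedule()

    def _preempt_one(self):
        """Kill one running task from the preemptable pool; True if a slot was freed."""
        for job in self.jobs:
            if job.pool == self.preemptable_pool and job.running:
                task = max(job.running)            # deterministic choice of victim
                job.running.remove(task)
                job.preempt_counts[task] = job.preempt_counts.get(task, 0) + 1
                if job.preempt_counts[task] > self.max_preemptions:
                    job.failed = True              # too many pre-emptions: fail the whole job
                    job.running.clear()
                return True
        return False

    def schedule(self):
        # Pass 1: real user jobs get slots first, pre-empting wellcare tasks if needed.
        for job in self.jobs:
            if job.pool == self.preemptable_pool:
                continue
            for task in job.pending():
                if self.free_slots() == 0 and not self._preempt_one():
                    break                          # nothing left to pre-empt; the job waits
                job.running.add(task)
        # Pass 2: wellcare jobs only fill whatever slots are still idle.
        for job in self.jobs:
            if job.pool != self.preemptable_pool:
                continue
            for task in job.pending():
                if self.free_slots() == 0:
                    break
                job.running.add(task)


# Usage: a wellcare job fills an idle 4-slot cluster, then yields to a user job.
sched = PoolPreemptingScheduler(total_slots=4, preemptable_pool="wellcare", max_preemptions=1)
compaction = Job("concat-part-files", pool="wellcare", num_tasks=4)
sched.submit(compaction)
report = Job("user-report", pool="default", num_tasks=2)
sched.submit(report)
```

In this sketch a user job never loses a slot, because _preempt_one() only looks at
jobs in the designated pool. If two user jobs together want more slots than exist,
the later one simply waits.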

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
