hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3675) Provide more flexibility in the way tasks are run
Date Tue, 01 Jul 2008 17:40:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609647#action_12609647 ]

Doug Cutting commented on HADOOP-3675:
--------------------------------------

In general, I like this approach.

> As part of this issue I would also like to provide two other TaskWrappers

Yes, I think the initial patch should provide at least one other implementation, to prove
the utility of the API.  The thread-per-task approach has often been requested, and is thus
a great candidate.
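
To make the thread-per-task idea concrete, here is a minimal sketch of what such a wrapper
might look like.  Only the TaskWrapper name comes from the issue; the runTask method, its
signature, and the ThreadTaskWrapper class are assumptions made for illustration, not the
API in the attached patch.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ThreadWrapperSketch {

    /** Hypothetical extension point; the name follows the issue, the method is assumed. */
    static abstract class TaskWrapper {
        /** Runs the task to completion and rethrows any failure. */
        abstract void runTask(String taskId, Runnable task) throws Exception;
    }

    /** Hypothetical wrapper that runs each task in its own thread of the current JVM. */
    static class ThreadTaskWrapper extends TaskWrapper {
        @Override
        void runTask(String taskId, Runnable task) throws Exception {
            AtomicReference<Throwable> failure = new AtomicReference<>();
            Thread t = new Thread(task, "task-" + taskId);
            // Record an uncaught exception so the caller can see that the task failed.
            t.setUncaughtExceptionHandler((thread, e) -> failure.set(e));
            t.start();
            t.join(); // wait for the task, as the process-based runner would
            if (failure.get() != null) {
                throw new Exception("Task " + taskId + " failed", failure.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        StringBuilder log = new StringBuilder();
        new ThreadTaskWrapper().runTask("0001",
                () -> log.append(Thread.currentThread().getName()));
        System.out.println(log); // name of the thread that ran the task
    }
}
```

A thread shares the heap and static state of its host JVM, so this variant trades isolation
for lower startup cost.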

A "jail" implemented with 'chroot' that isolates users would also be very useful.  If a new
root directory is created per user then we should not need more than one additional uid.
The tasktracker's uid would need sudo privileges in order to run 'chroot', so we would want
to run user tasks as a different uid.  All user tasks could run as that same uid, each
with a different root filesystem.  However, such a capability might better be added in a separate
issue...
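
As a rough illustration of the jail idea, the sketch below only assembles the command line
that a chroot-based wrapper might execute; it does not run anything.  The /jails/<user>
layout, the "hadooptask" account name, and the helper itself are hypothetical.  The
--userspec option is the one provided by GNU coreutils' chroot; other chroot implementations
may need a different way to drop to the shared uid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChrootCommandSketch {

    /**
     * Builds (but does not run) a command that starts a task inside a per-user root
     * directory, as one shared unprivileged uid.  All paths and names are assumptions.
     */
    static List<String> jailCommand(String user, String sharedUid, List<String> taskCommand) {
        // Reject names that could escape the jail directory (e.g. "../etc").
        if (!user.matches("[a-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected user name: " + user);
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("sudo");                     // the tasktracker uid needs sudo rights for chroot
        cmd.add("chroot");
        cmd.add("--userspec=" + sharedUid);  // GNU coreutils option: run as this uid after chroot
        cmd.add("/jails/" + user);           // hypothetical per-user root filesystem
        cmd.addAll(taskCommand);
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ",
                jailCommand("alice", "hadooptask", Arrays.asList("/bin/java", "ExampleTask"))));
    }
}
```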

> We might consider deprecating "mapred.child.java.opts"

If the TaskWrapper implementation is passed the Configuration, can't this property continue
to be used by SeperateVMTaskWrapper?
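
For example, a wrapper that receives the configuration could keep honoring the property
along these lines.  This is a sketch under stated assumptions: a plain Map stands in for
Hadoop's Configuration so the example is self-contained, and the constructor and
buildCommand method are hypothetical rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChildOptsSketch {

    /** Hypothetical base class: receives the job configuration at construction. */
    static abstract class TaskWrapper {
        protected final Map<String, String> conf; // stand-in for Hadoop's Configuration

        TaskWrapper(Map<String, String> conf) {
            this.conf = conf;
        }

        /** Returns the command line used to launch the task (assumed method). */
        abstract List<String> buildCommand(String taskClass);
    }

    /** Class name spelled as in the issue; the body is illustrative only. */
    static class SeperateVMTaskWrapper extends TaskWrapper {
        SeperateVMTaskWrapper(Map<String, String> conf) {
            super(conf);
        }

        @Override
        List<String> buildCommand(String taskClass) {
            List<String> cmd = new ArrayList<>();
            cmd.add("java");
            // Only this wrapper interprets the property; other wrappers can ignore it.
            String opts = conf.getOrDefault("mapred.child.java.opts", "").trim();
            if (!opts.isEmpty()) {
                cmd.addAll(Arrays.asList(opts.split("\\s+")));
            }
            cmd.add(taskClass);
            return cmd;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.child.java.opts", "-Xmx512m -verbose:gc");
        System.out.println(new SeperateVMTaskWrapper(conf).buildCommand("ExampleTask"));
    }
}
```

With this shape the property stays meaningful for the separate-JVM case and is simply unused
by wrappers that do not start a new JVM.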


> Provide more flexibility in the way tasks are run
> -------------------------------------------------
>
>                 Key: HADOOP-3675
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3675
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Brice Arnould
>            Assignee: Brice Arnould
>            Priority: Minor
>         Attachments: TaskWrapper_v0.patch
>
>
> *The aim*
> With [HADOOP-3421] speaking about sharing a cluster among more than one organization
(so potentially with non-cooperative users), and posts on the ML speaking about virtualization
and the ability to re-use the TaskTracker's VM to run new tasks, it could be useful for admins
to choose the way TaskRunners run their children. 
> More specifically, it could be useful to provide a way to imprison a Task in its working
directory, or in a virtual machine.
> In some cases, reusing the VM might be useful, since it seems that this feature is really
wanted ([HADOOP-249]).
> *Concretely*
> What I propose is a new class, called SeperateVMTaskWrapper, which contains the
current logic for running tasks in another JVM. This class extends another, called TaskWrapper,
which could be subclassed to provide new ways of running tasks.
> As part of this issue I would also like to provide two other TaskWrappers: the first
would run the tasks as threads of the TaskRunner's VM (if that is possible without too many changes),
the second would use a fixed pool of local unix accounts to isolate tasks from each other
(so potentially non-cooperating users will be able to share a cluster, as described in [HADOOP-3421]).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

