hadoop-common-dev mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3581) Prevent memory intensive user tasks from taking down nodes
Date Tue, 15 Jul 2008 14:33:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613622#action_12613622 ]

Vinod Kumar Vavilapalli commented on HADOOP-3581:

bq. Could we solve this by adding an extra argument specifying the JobId and the UserId to
enable the script to do per-job/per-user accounting?
I am not sure I understand this well enough. If you meant "pass the JobId/UserId to the
script and do only per-job/per-user accounting", that won't help - we need overall
accounting across all tasks on the node.
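To make "overall accounting" concrete, here is a minimal illustrative sketch (not the
attached patch) of node-wide accounting on Linux, assuming the TaskTracker knows the PIDs
of all running task processes and reading VmRSS from /proc/<pid>/status; the class and
method names here are hypothetical:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.List;

    public class NodeMemoryAccounting {

      // Resident set size of one process, in kB, parsed from the
      // Linux-only /proc/<pid>/status line "VmRSS:  <n> kB".
      static long rssKb(int pid) throws IOException {
        BufferedReader r =
            new BufferedReader(new FileReader("/proc/" + pid + "/status"));
        try {
          String line;
          while ((line = r.readLine()) != null) {
            if (line.startsWith("VmRSS:")) {
              return Long.parseLong(line.replaceAll("[^0-9]", ""));
            }
          }
          return 0;
        } finally {
          r.close();
        }
      }

      // Overall accounting: sum memory across *all* task PIDs on the
      // node and compare against a single node-wide limit.
      static boolean overNodeLimit(List<Integer> taskPids, long limitKb)
          throws IOException {
        long totalKb = 0;
        for (int pid : taskPids) {
          totalKb += rssKb(pid);
        }
        return totalKb > limitKb;
      }
    }

Per-job/per-user sums would just be partitions of the same totals; the point is that the
check has to be against the node-wide sum, not against any single job's share.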

bq. The wrapper I proposed before could solve this problem as a side effect (with /etc/security/limits.conf).
But it might not be portable, and your solution may be better suited for this case.
The limits.conf approach has already been evaluated; it does not solve the current problem.
See this earlier comment on this very JIRA -
https://issues.apache.org/jira/browse/HADOOP-3581?focusedCommentId=12607650#action_12607650
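For context on why that falls short: limits.conf entries (e.g. the "as" item, which sets
RLIMIT_AS) apply to each individual process of a user, not to the aggregate of all task
processes on the node, so tasks that fork children escape any per-process cap. A hedged
example entry, with an illustrative value:

    # /etc/security/limits.conf - caps *each* process of user 'hadoop'
    # at ~1 GB of address space (value in kB). N task processes can
    # still consume N GB in aggregate, which is the problem here.
    hadoop    hard    as    1048576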

bq. I'm afraid that much of this functionality will not be available for threaded tasks
anyway. My next proposition will include a fallback mechanism, so you shouldn't have to
take this into account.
This looks like an interesting problem - how do we manage resource usage by each thread?
Is there any thread-level resource management support in Java? And what is the use-case for
threaded tasks in the first place? If the cost of a per-task JVM is the only reason to run
each task in a thread instead of its own JVM, we can still achieve resource management of
all tasks by forking one single JVM and running all tasks as threads of that JVM. That way
we still meet our objective here - shielding Hadoop from user code.
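On the Java question: as far as I know, the standard java.lang.management API exposes
per-thread CPU time but no per-thread memory accounting, which is part of what makes
thread-level resource management hard. A small sketch illustrating the gap:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadResourceProbe {
      public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long self = Thread.currentThread().getId();

        // Per-thread CPU time *is* available where the JVM supports it...
        if (mx.isThreadCpuTimeSupported()) {
          System.out.println("CPU time (ns): " + mx.getThreadCpuTime(self));
        }

        // ...but memory can only be observed JVM-wide, so a runaway
        // thread cannot be singled out from the rest of the JVM.
        Runtime rt = Runtime.getRuntime();
        System.out.println("JVM-wide used heap (bytes): "
            + (rt.totalMemory() - rt.freeMemory()));
      }
    }

So even in the one-JVM-with-threads model, enforcement would be at JVM granularity - which
is consistent with the proposal above: fork one JVM for all tasks and police it as a unit.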

> Prevent memory intensive user tasks from taking down nodes
> ----------------------------------------------------------
>                 Key: HADOOP-3581
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3581
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: patch_3581_0.1.txt
> Sometimes user Map/Reduce applications can get extremely memory intensive, maybe due
> to some inadvertent bugs in the user code, or the amount of data processed. When this happens,
> the user tasks start to interfere with the proper execution of other processes on the node,
> including other Hadoop daemons like the DataNode and TaskTracker. Thus, the node would become
> unusable for any Hadoop tasks. There should be a way to prevent such tasks from bringing down
> the node.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
