hadoop-common-dev mailing list archives

From "Rick Cox (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3280) virtual address space limits break streaming apps
Date Fri, 18 Apr 2008 19:08:21 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590575#action_12590575 ]

Rick Cox commented on HADOOP-3280:

I'd guess there are many users who do not want Hadoop to limit tasks (be they Java or streaming).
When a cluster exists to run specific tasks, it seems reasonable that those tasks can use all of
its resources.

On this issue, a default {{ulimit -v}} will cause some pretty strange failures while also
failing to prevent resource exhaustion in other cases. For example, some tasks may mmap multi-GB
files but touch only a few pages. Others may link libraries that require hundreds of MB of address
space for code that's never executed (and thus never read). Still others may fork off lots
of sub-processes and thus ultimately consume more RAM than any single process's virtual address
space. (Btw, these examples are all taken from our deployed Hadoop apps.)
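
As an illustration of the mmap case, here is a minimal Python sketch (hypothetical, not code from the deployed apps mentioned above). The function name {{map_and_touch}}, the 256 MB size, and the page count are all assumptions chosen for the example. It maps a sparse file read-only and reads one byte from a handful of pages, so the address space reserved is far larger than the memory actually accessed:

```python
import mmap
import tempfile


def map_and_touch(size_bytes, pages_to_touch):
    """Map a sparse file of size_bytes and read one byte from a few pages.

    Returns (mapped_bytes, touched_bytes), where touched_bytes counts only
    the pages this function reads explicitly. The kernel may fault in more
    through readahead, so this is a lower bound on resident memory.
    """
    with tempfile.TemporaryFile() as f:
        # Extending an empty file creates a sparse file on most Unix
        # filesystems: large apparent size, almost no disk blocks.
        f.truncate(size_bytes)
        with mmap.mmap(f.fileno(), size_bytes, access=mmap.ACCESS_READ) as m:
            stride = size_bytes // pages_to_touch
            for i in range(pages_to_touch):
                _ = m[i * stride]  # page fault on exactly this page
            return len(m), pages_to_touch * mmap.PAGESIZE
```

The whole mapping counts against a {{ulimit -v}} style limit, even though only the touched pages need RAM.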

Further, when these tasks hit the virtual address space limit, they are likely to fail in
confusing, difficult-to-debug ways, since few apps are written to handle that case gracefully.
When run outside of Hadoop, the same commands will work fine, and the discrepancy stays a mystery
unless the user reads the streaming code and notices that it is imposing this limit. (This is in
contrast to the -Xmx limit, which can actually influence the garbage collector to be more
aggressive, is a commonly used Java option, and produces relatively clear OutOfMemoryErrors on failure.)
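
A rough way to see this failure mode, assuming a 64-bit Linux-like system where {{ulimit -v}} corresponds to {{RLIMIT_AS}}: the Python sketch below starts a child process that caps its own soft address-space limit and then tries to reserve, without ever touching, more address space than the cap allows. The names {{CHILD}} and {{reserve_under_limit}} and the exit code 3 are invented for this example.

```python
import subprocess
import sys

# Child program: cap its own soft address-space limit, then try to reserve
# (not touch) an anonymous mapping of the requested size.
CHILD = """
import mmap, resource, sys
limit, reserve = int(sys.argv[1]), int(sys.argv[2])
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    m = mmap.mmap(-1, reserve)   # anonymous mapping; no page is ever written
except (OSError, MemoryError):
    sys.exit(3)                  # the reservation itself was refused
sys.exit(0)
"""


def reserve_under_limit(limit_bytes, reserve_bytes):
    """Run the child; return 3 if the mapping was refused, 0 if it succeeded."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(reserve_bytes)]
    )
    return result.returncode
```

Here the child catches the error and reports it cleanly. An app that never checks for this case would more likely crash or misbehave somewhere unrelated to the real cause.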

This is why I don't think {{ulimit -v}} is the right approach *in general*. That doesn't mean
it's not the right approach for specific situations, and hence the original proposal for a
wrapper script (possibly one mandated by the cluster admin) is attractive. In other specific
situations, {{ulimit -m}} might be more effective than {{ulimit -v}}, or some {{jail}}-like
mechanism might be employed, and of course Windows users will need something else. Adding
support to streaming for all the different ways resources might be limited does not seem practical.
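
For the wrapper idea, a minimal sketch of what an opt-in limit could look like on a Unix-like system, written in Python rather than shell. The function {{run_limited}} and its parameters are hypothetical and not part of Hadoop or of any attached patch. It applies an address-space limit only when the caller asks for one, then runs the real command:

```python
import resource
import subprocess


def run_limited(argv, limit_bytes=None):
    """Run argv, optionally under an address-space limit; return its exit code."""

    def apply_limit():
        # Runs in the child between fork() and exec().
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        new = limit_bytes
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process cannot raise its hard limit.
            new = min(new, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new, new))

    # No limit requested means no hook: the command runs unrestricted.
    hook = apply_limit if limit_bytes is not None else None
    return subprocess.run(argv, preexec_fn=hook).returncode
```

Note that {{RLIMIT_AS}} is expressed in bytes, while the shell's {{ulimit -v}} takes kilobytes. The same structure would accommodate a different limit, or none, without changes to streaming itself.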

(I realize this would all have been much more useful to bring up in the original issue, and
apologize for not following that one more closely. As one path forward, we could reopen HADOOP-2765
and continue this discussion there.)

> virtual address space limits break streaming apps
> -------------------------------------------------
>                 Key: HADOOP-3280
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3280
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Rick Cox
>            Priority: Blocker
>             Fix For: 0.17.0
>         Attachments: HADOOP-3280_0_20080418.patch
> HADOOP-2765 added a mandatory, hard virtual address space limit to streaming apps based
> on the Java process's -Xmx setting.
> This makes it impossible to run a 64-bit streaming app that needs large address spaces
> under a 32-bit JVM, even if one is otherwise willing to dramatically increase the -Xmx setting
> without cause. Also, unlike Java's -Xmx limit, the virtual address space limit for an arbitrary
> UNIX process does not necessarily correspond to RAM usage, so it's likely to be a relatively
> difficult limit to configure.
> 2765 was originally opened to allow an optional wrapper script around streaming tasks,
> one use case for which was setting a ulimit. That approach seems much less intrusive and more
> flexible than the final implementation. The ulimit can also be trivially set by the streaming
> task itself without any support from Hadoop.
> Marking this as an 0.17 blocker because it will break deployed apps and there is no workaround.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
