whirr-dev mailing list archives

From "Lars George (JIRA)" <j...@apache.org>
Subject [jira] Commented: (WHIRR-148) Hadoop jobs fail on large EC2 instances, possibly RHEL6 related
Date Tue, 04 Jan 2011 19:35:45 GMT

    [ https://issues.apache.org/jira/browse/WHIRR-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977424#action_12977424 ]

Lars George commented on WHIRR-148:
-----------------------------------

This happened again in a different context, and it seems it may be caused by RHEL6. It has
glibc 2.11, which includes a new allocator that uses per-thread heaps (arenas) for fast allocation.
By default it maps 64M chunks, up to 8*num_cores of them on a 64-bit system, so you can expect
about 4G of virtual memory usage in any highly threaded app on an 8-core machine (64 arenas x 64M).
You can constrain the number of allocation arenas by setting MALLOC_ARENA_MAX=4, for example,
or even lower. Perhaps we could simply add this to the "init" scripts and try it?
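
To make the arithmetic above concrete, here is a rough sketch in Python. It only restates the figures from this comment (64M per arena, up to 8 arenas per core on 64-bit), so treat the result as an estimate of mapped virtual memory, not a measurement. The function names are made up for illustration and are not part of Whirr or Hadoop.

```python
import os

# Figures taken from the comment above; they are assumptions about the
# allocator's defaults, not values read from the running system.
ARENA_CHUNK_MIB = 64        # size of each mapped arena chunk
ARENAS_PER_CORE_64BIT = 8   # default arena ceiling per core on 64-bit


def arena_reservation_mib(num_cores, arena_max=None):
    """Estimate the virtual memory (MiB) mapped for malloc arenas.

    arena_max models MALLOC_ARENA_MAX; None means the default ceiling.
    """
    arenas = ARENAS_PER_CORE_64BIT * num_cores
    if arena_max is not None:
        arenas = min(arenas, arena_max)
    return arenas * ARENA_CHUNK_MIB


def env_with_arena_cap(arena_max, base_env=None):
    """Return a copy of the environment with MALLOC_ARENA_MAX set.

    The result can be passed as env= when launching a child process.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MALLOC_ARENA_MAX"] = str(arena_max)
    return env


if __name__ == "__main__":
    print(arena_reservation_mib(8))      # 8 cores, default ceiling
    print(arena_reservation_mib(8, 4))   # 8 cores, MALLOC_ARENA_MAX=4
```

With 8 cores this gives 4096 MiB by default and 256 MiB with the cap at 4, which is why setting the variable in the "init" scripts looks worth trying.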


> Hadoop jobs fail on large EC2 instances, possibly RHEL6 related
> ---------------------------------------------------------------
>
>                 Key: WHIRR-148
>                 URL: https://issues.apache.org/jira/browse/WHIRR-148
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hadoop
>    Affects Versions: 0.3.0
>            Reporter: Tom White
>            Assignee: Tom White
>
> When using an m1.large or c1.xlarge hardware-id, jobs fail with an error like:
> {noformat}
> FAILED
> java.io.IOException: Task process exit with nonzero status of 134.
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

