hadoop-common-user mailing list archives

From Hari Sreekumar <hsreeku...@clickable.com>
Subject Re: Hadoop mapred - Out of swap space
Date Sat, 06 Nov 2010 09:16:01 GMT
It's an out-of-memory error, so I think it has to do with RAM rather than disk
space. Did you check whether the nodes are swapping (top/htop)? Is your reduce
phase very memory-intensive? It could be a memory leak somewhere. What
processes are you running on each node? And what does the hs_err_pid log file
named in the error output say?
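
If you'd rather not watch top on every node, a rough sketch like the one below
reads the swap figures straight from /proc/meminfo. It assumes a Linux node
with the usual SwapTotal/SwapFree fields reported in kB; the function names
are just mine for illustration, they are not part of Hadoop.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name: <number> [kB]" lines.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(fields):
    """Swap currently in use, in kB (SwapTotal minus SwapFree)."""
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)


# Only touches /proc when run as a script on a machine that has it.
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("MemFree kB:  ", info.get("MemFree"))
    print("Swap used kB:", swap_used_kb(info))
```

If swap used keeps growing while the reducers run, the node is probably short
on physical memory for the number of tasks it is running.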

On Sat, Nov 6, 2010 at 2:36 PM, Shavit Netzer <Shavit@conduit.com> wrote:

> 7GB
>
> Sent from my mobile
>
> On 06/11/2010, at 11:00, "Hari Sreekumar" <hsreekumar@clickable.com> wrote:
>
> What's the RAM on each node?
>
> On Sat, Nov 6, 2010 at 11:03 AM, Shavit Netzer <Shavit@conduit.com> wrote:
>
> Hello,
>
> I have a question regarding MapRed jobs.
>
> I have 24 nodes; each node has 4 disks (mnt – mnt3), 500GB each.
>
> All are balanced (I used the balancer), except mnt, which is 97% used.
>
> My question is:
> I got the following error and I think it is related to disk space (maybe I'm
> wrong).
>
> Maybe there is a configuration setting that I can add or change in order to
> get a few more retries on a separate disk:
>
>
> 10/10/27 21:59:01 INFO mapred.JobClient:  map 100% reduce 26%
>
> 10/10/27 21:59:02 INFO mapred.JobClient: Task Id :
> attempt_201010201240_4059_r_000023_0, Status : FAILED
>
> java.io.IOException: Task process exit with nonzero status of 134.
>
>              at
> org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:462)
>
>              at
> org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:403)
>
>
>
> attempt_201010201240_4059_r_000023_0: #
>
> attempt_201010201240_4059_r_000023_0: # A fatal error has been detected by
> the Java Runtime Environment:
>
> attempt_201010201240_4059_r_000023_0: #
>
> attempt_201010201240_4059_r_000023_0: # java.lang.OutOfMemoryError:
> requested 32744 bytes for ChunkPool::allocate. Out of swap space?
>
> attempt_201010201240_4059_r_000023_0: #
>
> attempt_201010201240_4059_r_000023_0: #  Internal Error
> (allocation.cpp:117), pid=15974, tid=1089702224
>
> attempt_201010201240_4059_r_000023_0: #  Error: ChunkPool::allocate
>
> attempt_201010201240_4059_r_000023_0: #
>
> attempt_201010201240_4059_r_000023_0: # JRE version: 6.0_14-b08
>
> attempt_201010201240_4059_r_000023_0: # Java VM: Java HotSpot(TM) 64-Bit
> Server VM (14.0-b16 mixed mode linux-amd64 )
>
> attempt_201010201240_4059_r_000023_0: # An error report file with more
> information is saved as:
>
> attempt_201010201240_4059_r_000023_0: #
>
> /mnt2/hadoop/mapred/local/taskTracker/jobcache/job_201010201240_4059/attempt_201010201240_4059_r_000023_0/work/hs_err_pid15974.log
>
> attempt_201010201240_4059_r_000023_0: #
>
> attempt_201010201240_4059_r_000023_0: # If you would like to submit a bug
> report, please visit:
>
> attempt_201010201240_4059_r_000023_0: #
> http://java.sun.com/webapps/bugreport/crash.jsp
>
> attempt_201010201240_4059_r_000023_0: #
>
> Regards,
> Shavit
>
>
