Subject: Re: rack awareness in hadoop
From: Aaron Eng
To: user@hadoop.apache.org
Date: Sat, 20 Apr 2013 16:19:16 -0700

The problem is probably not related to JVM memory so much as the Linux memory manager. The exception is in java.lang.UNIXProcess.<init>(UNIXProcess.java:148), which implies it happens while creating a new process: the initial malloc for the new process space is being denied by the memory manager. There could be many reasons for this, though the most likely is your overcommit settings and swap space. I'd suggest reading through these details:

https://www.kernel.org/doc/Documentation/vm/overcommit-accounting

On Sat, Apr 20, 2013 at 4:00 PM, Kishore Yellamraju <kishore@rocketfuelinc.com> wrote:
> All,
>
> I have posted this question to the CDH mailing list, but I guess I can
> post it here as well because it's a general Hadoop question.
>
> When the NN or JT gets the rack info, I guess it stores the info in
> memory. Can I ask where in the JVM memory it stores the results (perm
> gen?)? I am getting "cannot allocate memory" on the NN and JT, and they
> have more than enough memory. When I looked at the JVM usage stats I can
> see there isn't enough free perm space, so if it's storing the values in
> perm gen then there is a chance of these memory issues.
>
> Thanks in advance !!!
>
> The exception that I see in the logs:
>
> java.io.IOException: Cannot run program "/etc/hadoop/conf/topo.sh" (in
> directory "/usr/lib/hadoop-0.20-mapreduce"): java.io.IOException: error=12,
> Cannot allocate memory
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:206)
>         at org.apache.hadoop.util.Shell.run(Shell.java:188)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
>         at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:242)
>         at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:180)
>         at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
>         at org.apache.hadoop.mapred.JobTracker.resolveAndAddToTopology(JobTracker.java:2750)
>         at org.apache.hadoop.mapred.JobInProgress.createCache(JobInProgress.java:593)
>         at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:765)
>         at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3775)
>         at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:90)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: java.io.IOException: error=12, Cannot
> allocate memory
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>         at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>         ... 14 more
> 2013-04-20 02:07:28,298 ERROR org.apache.hadoop.mapred.JobTracker: Job
> initialization failed:
> java.lang.NullPointerException
>
> -Thanks
> kishore kumar yellamraju | Ground control operations |
> kishore@rocketfuel.com | 408.203.0424
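The overcommit settings Aaron points at can be inspected directly. A minimal diagnostic sketch for a stock Linux box (these are standard /proc interfaces, nothing Hadoop-specific):

```shell
# Kernel overcommit policy: 0 = heuristic, 1 = always allow, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory
# Under strict accounting (mode 2), commit limit = swap + overcommit_ratio% of RAM
cat /proc/sys/vm/overcommit_ratio
# Compare the current commit limit against address space already committed
grep -E 'CommitLimit|Committed_AS' /proc/meminfo
# Configured swap devices and sizes
swapon -s
```

If Committed_AS sits near CommitLimit, the fork of topo.sh can fail with error=12 (ENOMEM) even though free memory looks ample, because fork must momentarily commit a copy of the JobTracker's entire address space before the exec replaces it.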
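For context on what is being forked here: ScriptBasedMapping (visible in the stack trace) runs the configured topology script with hostnames or IPs as arguments and expects one rack path per input, one per line. A hypothetical minimal topo.sh, with made-up subnets and rack names, might look like:

```shell
# Hypothetical topology mapping, as a function for illustration; in a real
# topo.sh the loop body would run over "$@" at the top level of the script.
resolve_rack() {
  for host in "$@"; do
    case "$host" in
      10.1.*) echo /rack1 ;;         # made-up subnet-to-rack mapping
      10.2.*) echo /rack2 ;;
      *)      echo /default-rack ;;  # fallback Hadoop expects for unknown hosts
    esac
  done
}

resolve_rack 10.1.0.7 10.2.0.9 192.168.1.5
# prints /rack1, /rack2, /default-rack, one per line
```

The resolved mappings are cached (CachedDNSToSwitchMapping in the trace), so the script is only forked for hosts not yet seen; the fork itself, not the cached results, is what hit the memory limit in this thread.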