Message-ID: <28993355.post@talk.nabble.com>
Date: Fri, 25 Jun 2010 08:09:00 -0700 (PDT)
From: nileshnjoshi
To: core-user@hadoop.apache.org
Subject: memory management of capacity scheduling

Hi,

Setup Info:
I have a 2-node Hadoop (0.20.2) cluster on Linux boxes.

HW info:
16 CPU (hyperthreaded)
RAM: 32 GB

I am trying to configure capacity scheduling and want to use the memory management provided by the capacity scheduler, but I am facing a few issues.

I have added hadoop-0.20.2-capacity-scheduler.jar to lib, and also added ‘mapred.jobtracker.taskScheduler’ to hadoop-site.xml.

I have added the properties below to the capacity-scheduler.xml file, but I get an error:

mapred.tasktracker.vmem.reserved
    value: 26624m
    description: A number, in bytes, that represents an offset. The total VMEM on the machine, minus this offset, is the VMEM node-limit for all tasks, and their descendants, spawned by the TT.

mapred.task.default.maxvmem
    value: 512k
    description: A number, in bytes, that represents the default VMEM task-limit associated with a task. Unless overridden by a job's setting, this number defines the VMEM task-limit.

mapred.task.limit.maxvmem
    value: 4096m
    description: A number, in bytes, that represents the upper VMEM task-limit associated with a task. Users, when specifying a VMEM task-limit for their tasks, should not specify a limit which exceeds this amount.
mapred.tasktracker.pmem.reserved
    value: 26624m
    description: Physical Memory

Error:

2010-06-25 08:02:06,026 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to node1.hadoopcluster.com/192.168.1.241:9001 failed on local exception: java.io.IOException: Connection reset by peer
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
        at org.apache.hadoop.ipc.Client.call(Client.java:743)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
        at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:514)
        at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:934)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
Caused by: java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:33)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:234)
        at sun.nio.ch.IOUtil.read(IOUtil.java:207)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
        at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.FilterInputStream.read(FilterInputStream.java:128)
        at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:276)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:230)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:249)
        at java.io.DataInputStream.readInt(DataInputStream.java:382)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)

Question:
How can I fix this issue? Is there a step-by-step guide for configuring capacity scheduling?

Let me know if you need more information about the configuration.

Thanks and Regards,
-Shashank

-- 
View this message in context: http://old.nabble.com/memory-management-of-capacity-scheduling-tp28993355p28993355.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
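
For reference, the arithmetic implied by the property descriptions in this message (total VMEM minus the reserved offset gives the node-limit) can be sketched as below. This is only a rough Python sketch, not Hadoop code: `to_bytes` and `node_limit` are hypothetical helpers, the `k`/`m`/`g` suffixes are assumed to mean KiB/MiB/GiB, and the total VMEM is assumed to equal the 32 GB of RAM with no swap. Whether Hadoop 0.20.2 itself accepts suffixed values for these properties is not established here; the descriptions only say "a number, in bytes".

```python
# Hypothetical helpers for illustration only; none of this is a Hadoop API.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}  # assumed binary units


def to_bytes(value: str) -> int:
    """Convert a value such as '26624m' or '512k' to a plain byte count."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def node_limit(total_vmem: int, reserved: int) -> int:
    """Node-limit as described above: total VMEM minus the reserved offset."""
    return total_vmem - reserved


total_vmem = to_bytes("32g")          # assumption: 32 GB RAM, no swap
reserved = to_bytes("26624m")         # mapred.tasktracker.vmem.reserved
default_task = to_bytes("512k")       # mapred.task.default.maxvmem
upper_task = to_bytes("4096m")        # mapred.task.limit.maxvmem

limit = node_limit(total_vmem, reserved)
print("reserved offset in bytes:", reserved)
print("node-limit in MiB:", limit // UNITS["m"])
print("default task-limit in bytes:", default_task)
print("tasks at the upper task-limit that fit:", limit // upper_task)
```

Under these assumptions the node-limit comes to 6144 MiB, which leaves room for a single task at the 4096m upper limit, and the 512k default task-limit is only 524288 bytes.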