From: Amandeep Khurana
Date: Fri, 5 Nov 2010 11:24:49 -0700
Subject: Re: Memory config for Hadoop cluster
To: common-user@hadoop.apache.org

On Fri, Nov 5, 2010 at 2:00 AM, Hemanth Yamijala wrote:
> Hi,
>
> On Fri, Nov 5, 2010 at 2:23 PM, Amandeep Khurana wrote:
>
> > Right. I meant I'm not using the fair or capacity scheduler. I'm
> > getting out-of-memory errors in some jobs and was trying to optimize
> > the memory settings, number of tasks, etc. I'm running 0.20.2.
>
> The first thing most people do for this is to tweak the child.opts
> setting to give higher heap space to their map or reduce tasks. I
> presume you've already done this? If not, it may be worth a try. It's
> by far the easiest way to fix the out-of-memory errors.

Yup, I've done that and also played around with the number of tasks.
I've been able to get jobs to go through without errors that way, but I
wanted to use these configs to make sure a job gets killed if it takes
more memory than the cluster can afford to give.
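Concretely, the child opts tweak I have is along these lines in
mapred-site.xml (the exact heap size here is illustrative, not a
recommendation):

  <property>
    <name>mapred.child.java.opts</name>
    <!-- illustrative heap size; size it to what your task slots can afford -->
    <value>-Xmx1024m</value>
  </property>

That bounds each task's JVM heap, but it doesn't stop a job from asking
for more than the cluster should give it, which is why I'm after the
cluster/job memory params.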
> > Why can't mapred.job.map.memory.mb and mapred.job.reduce.memory.mb
> > be left out of mapred-site.xml and just default to the equivalent
> > cluster values if they are not set in the job either?
>
> If these parameters are set in mapred-site.xml in all places - the
> client, the job tracker and the task trackers - and they are not
> being set in the job, this should suffice. However, if they are not
> set in any one of these places, they'd get submitted with the default
> value of -1, and since these are job-specific parameters, they would
> override the preconfigured settings on the cluster. If you want to be
> sure, you could mark the settings as 'final' on the job tracker and
> the task trackers. Then any submission by the job would not override
> the settings.

I see the following in the TT logs:

2010-11-05 09:28:54,307 WARN org.apache.hadoop.mapred.TaskTracker (main): TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.

But the configs are present in the mapred-site.xml files all across the
cluster. The jobs are being submitted from the master node, so that
takes care of the client part. I'm not sure why the configs aren't
getting populated.
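For reference, this is the shape of what I have in mapred-site.xml on
the job tracker and task tracker nodes. The cluster.* numbers are the
ones from my earlier mail; the job-level entries and the 'final' flags
are my reading of Hemanth's suggestion, so treat it as an untested
sketch rather than a verified fix:

  <!-- per-slot task memory for the cluster (numbers from my earlier mail) -->
  <property>
    <name>mapred.cluster.map.memory.mb</name>
    <value>896</value>
  </property>
  <property>
    <name>mapred.cluster.reduce.memory.mb</name>
    <value>1024</value>
  </property>

  <!-- job-level defaults; marking them final should stop individual
       jobs from overriding them, per Hemanth's note above -->
  <property>
    <name>mapred.job.map.memory.mb</name>
    <value>896</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.job.reduce.memory.mb</name>
    <value>1024</value>
    <final>true</final>
  </property>

I'd expect this to give the TaskTracker something other than -1 for
totalMemoryAllottedForTasks, but evidently that's not happening yet.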
Thanks

> Hemanth
>
> > -Amandeep
> >
> > On Nov 5, 2010, at 1:43 AM, Hemanth Yamijala wrote:
> >
> > Hi,
> >
> > > I'm not using any scheduler. I don't have multiple jobs running at
> > > the same time on the cluster.
> >
> > That probably means you are using the default scheduler. Please note
> > that the default scheduler does not have the ability to schedule
> > tasks intelligently using the memory configuration parameters you
> > specify. Could you tell us what you'd like to achieve?
> >
> > The documentation here: http://bit.ly/cCbAab (and the link it has to
> > similar documentation in the Cluster Setup guide) will probably shed
> > more light on how the parameters should be used. Note that this is
> > for Hadoop 0.21, and the names of the parameters are different,
> > though you can see the correspondence with similar variables in
> > Hadoop 0.20.
> >
> > Thanks
> > Hemanth
> >
> > > -Amandeep
> > >
> > > On Fri, Nov 5, 2010 at 12:21 AM, Hemanth Yamijala wrote:
> > >
> > > Amandeep,
> > >
> > > Which scheduler are you using?
> > >
> > > Thanks
> > > Hemanth
> > >
> > > On Tue, Nov 2, 2010 at 2:44 AM, Amandeep Khurana wrote:
> > >
> > > > How are the following configs supposed to be used?
> > > >
> > > > mapred.cluster.map.memory.mb
> > > > mapred.cluster.reduce.memory.mb
> > > > mapred.cluster.max.map.memory.mb
> > > > mapred.cluster.max.reduce.memory.mb
> > > > mapred.job.map.memory.mb
> > > > mapred.job.reduce.memory.mb
> > > >
> > > > These were included in 0.20 in HADOOP-5881.
> > > >
> > > > Out of the above, I'm setting only the following in my
> > > > mapred-site.xml:
> > > >
> > > > mapred.cluster.map.memory.mb=896
> > > > mapred.cluster.reduce.memory.mb=1024
> > > >
> > > > When I run a job, I get the following error:
> > > >
> > > > TaskTree [pid=1958,tipID=attempt_201011012101_0001_m_000000_0] is
> > > > running beyond memory-limits. Current usage : 1358553088bytes.
> > > > Limit : -1048576bytes. Killing task.
> > > >
> > > > I'm not sure how it got the limit of -1048576 bytes... Also, what
> > > > are the cluster.max params supposed to be set as? Are they the
> > > > max on the entire cluster or on a particular node?
> > > >
> > > > -Amandeep