From: Arun C Murthy <acm@hortonworks.com>
Subject: Re: Memory based scheduling
Date: Tue, 30 Oct 2012 10:24:49 -0700
To: user@hadoop.apache.org

Not true, take a look at my prev. response.
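
(The earlier response referred to above is not quoted in this message. For context, the CapacityScheduler in Hadoop 1.x supports memory-based scheduling: a job can request more memory per task than one slot provides, and each such task is then charged multiple slots. A sketch with illustrative values, assuming the cluster has memory-based scheduling enabled; myjob.jar and MyJob are placeholder names:)

    <!-- mapred-site.xml, cluster side (illustrative values):
         one map slot accounts for 2 GB of virtual memory -->
    <property>
      <name>mapred.cluster.map.memory.mb</name>
      <value>2048</value>
    </property>

    # job side: request 4 GB per map task at submit time; each map
    # then occupies two slots, halving concurrent maps per node
    hadoop jar myjob.jar MyJob -Dmapred.job.map.memory.mb=4096 ...

With settings along these lines, a 4 GB-per-mapper job limits itself to half the usual number of concurrent mappers per machine without any per-pool configuration.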

On Oct 30, 2012, at 9:08 AM, lohit wrote:

> As far as I recall this is not possible. Per-job or per-user configurations like these are difficult in the existing version.
> What you could try is to set the max maps per job to, say, half of the cluster capacity. (This is possible with the FairScheduler; I do not know about the CapacityScheduler.)
> For example, if you have 10 nodes with 4 slots each, you would create a pool and set its max maps to 20 (see the sketch below).
> The JobTracker will try its best to spread tasks across nodes provided there are empty slots. But again, this is not guaranteed.
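
A minimal sketch of such a fair-scheduler allocation file (the pool name "highmem" and the values are illustrative, following the 10-node / 4-slot example above):

    <!-- fair-scheduler allocations file (illustrative) -->
    <allocations>
      <pool name="highmem">
        <!-- half of 10 nodes x 4 slots -->
        <maxMaps>20</maxMaps>
      </pool>
    </allocations>

Depending on the Hadoop version, a job can then be directed into that pool by setting mapred.fairscheduler.pool=highmem at submit time (or whatever property mapred.fairscheduler.poolnameproperty is configured to read).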


> 2012/10/30 Marco Zühlke <mzuehlke@gmail.com>
> Hi,
>
> On our cluster our jobs are usually satisfied with less than 2 GB of heap space,
> so we allow at most 3 maps on our 8 GB machines and at most 4 maps on our 16 GB
> machines (we only have quad-core CPUs and want to keep memory free for reducers).
> This works very well.

> But now we have a new kind of job: each mapper requires at least 4 GB
> of heap space.

> Is it possible to limit the number of tasks (mappers) per machine to 1 or 2
> for these kinds of jobs?

> Regards,
> Marco

> --
> Have a Nice Day!
> Lohit

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/

