Subject: Re: config for high memory jobs does not work, please help.
From: Arun C Murthy <acm@hortonworks.com>
Date: Fri, 18 Jan 2013 13:18:19 -0800
To: user@hadoop.apache.org

Take a look at the CapacityScheduler and 'High RAM' jobs, whereby you can
run M map slots per node and request, per job, that each task take N of
them (where 1 <= N <= M).

Some more info:
http://hadoop.apache.org/docs/stable/capacity_scheduler.html#Resource+based+scheduling
http://hortonworks.com/blog/understanding-apache-hadoops-capacity-scheduler/
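
A rough sketch of the slot arithmetic behind 'High RAM' jobs, for
illustration only (this is not the scheduler's actual code). It assumes
the cluster-side slot size is set by mapred.cluster.map.memory.mb in the
JobTracker's configuration and that the job asks for memory per task with
mapred.job.map.memory.mb; the per-job property name and the 4096 MB slot
size are assumptions that do not appear in this thread, and the function
names are hypothetical.

```python
import math


def slots_per_task(job_task_memory_mb: int, slot_memory_mb: int) -> int:
    """N: slots one task occupies, i.e. the job's request rounded up to whole slots."""
    if job_task_memory_mb <= 0 or slot_memory_mb <= 0:
        raise ValueError("memory sizes must be positive")
    return max(1, math.ceil(job_task_memory_mb / slot_memory_mb))


def concurrent_tasks_per_node(slots_per_node: int, n: int) -> int:
    """How many N-slot tasks fit on a node that has slots_per_node (M) slots."""
    return slots_per_node // n


M = 6            # map slots per node, as observed in the question
SLOT_MB = 4096   # ASSUMPTION: 24 GB per node divided over 6 slots

# A job asking for the whole node's memory per map task takes all 6 slots.
n = slots_per_task(24576, SLOT_MB)
print(n, concurrent_tasks_per_node(M, n))
```

Under these assumptions a job that requests 24576 MB per map task takes
all six slots, so only one mapper runs per machine, while an 8000 MB
request takes two slots and leaves room for three mappers.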

hth,
Arun

On Jan 18, 2013, at 12:05 PM, Shaojun Zhao wrote:

> Dear all,
>
> I know it is best to use a small amount of memory in the mapper and
> reducer. However, sometimes it is hard to do so. For example, in machine
> learning algorithms, it is common to load the model into memory in the
> mapper step. When the model is big, I have to allocate a lot of memory
> for the mapper.
>
> Here is my question: how can I configure Hadoop so that it does not fork
> too many mappers and run out of physical memory?
>
> My machines have 24 GB each, and I have 100 of them. Each time, Hadoop
> forks 6 mappers on each machine, no matter what config I use. I really
> want to reduce it to whatever number I want, for example, just 1
> mapper per machine.
>
> Here is the config I tried. (I use streaming, and I pass the config
> on the command line.)
>
> -Dmapred.child.java.opts=-Xmx8000m  <-- did not bring down the number of mappers
>
> -Dmapred.cluster.map.memory.mb=32000 <-- did not bring down the number of mappers
>
> Am I missing something here?
> I use Hadoop 0.20.205
>
> Thanks a lot in advance!
> -Shaojun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/