Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of balijamahesh.mca@gmail.com
 designates 209.85.217.175 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CACb0Fn7gTUm=N2kFSEAuq5nD3yEVvBNe8f-CfKjj9RNUo4egCw@mail.gmail.com>
References: 
 <CACb0Fn7gTUm=N2kFSEAuq5nD3yEVvBNe8f-CfKjj9RNUo4egCw@mail.gmail.com>
Date: Fri, 25 Jan 2013 14:48:00 +0530
Message-ID: 
 <CANiuQZdivp9ORP19J7hpOcgEdGWVbdNPTuyNQWzxwJeUtnPhiQ@mail.gmail.com>
Subject: Re: mappers-node relationship
From: Mahesh Balija <balijamahesh.mca@gmail.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=bcaec555540261fbe204d41967fb

--bcaec555540261fbe204d41967fb
Content-Type: text/plain; charset=ISO-8859-1

Mappers and reducers run as task instances, and each task instance occupies
what is called a map or reduce slot.
Each node can have multiple slots (that is, multiple mapper instances, each
running in its own child JVM). The number of slots per node is configurable
with properties such as mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum.
Tasks in different slots run in parallel, so a single node can run several
mappers at the same time.
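
To make the arithmetic concrete, here is a rough sketch in Python. It is not
Hadoop code: the function names and the numbers are made up for illustration,
and it assumes every slot is free and every task takes about the same time,
which a real cluster rarely guarantees. Note that the number of map tasks
itself comes from the number of input splits, not from the number of nodes;
the slots only limit how many of those tasks run at once.

```python
import math

def max_concurrent_maps(num_nodes, map_slots_per_node):
    """Upper bound on map tasks running at the same time across the cluster."""
    return num_nodes * map_slots_per_node

def map_waves(num_map_tasks, num_nodes, map_slots_per_node):
    """Rough number of rounds needed to run all map tasks (idealized)."""
    slots = max_concurrent_maps(num_nodes, map_slots_per_node)
    return math.ceil(num_map_tasks / slots)

# Pseudo-distributed setup: one node with two map slots (illustrative value
# for mapred.tasktracker.map.tasks.maximum).
print(max_concurrent_maps(1, 2))  # 2 mappers can run in parallel
print(map_waves(10, 1, 2))        # 10 map tasks finish in about 5 rounds
```

So on a single node the mappers run in parallel up to the slot count, and any
remaining tasks wait for a slot to free up.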

Best,
Mahesh Balija,
CalsoftLabs.


On Fri, Jan 25, 2013 at 1:16 PM, jamal sasha <jamalshasha@gmail.com> wrote:

> Hi.
>   A very very lame question.
> Does the number of mappers depend on the number of nodes I have?
> This is how I imagine map-reduce.
> For example, in the word count example,
> I have a bunch of slave nodes.
> The documents are distributed across these slave nodes.
> Depending on how big the data is, it will spread across the slave
> nodes, and that is how the number of mappers is decided.
> I am sure this is a wrong understanding, since on a pseudo-distributed node I
> can see multiple mappers.
> So the question is: how does a single-node machine run multiple mappers? Are
> they run in parallel or sequentially?
> Are there any resources to learn about this?
> Thanks
>

--bcaec555540261fbe204d41967fb--