Subject: Re: Distributing the code to multiple nodes
From: Ashish Jain <ashjain2@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 9 Jan 2014 14:11:14 +0530

Hello Chris,

I now have a cluster with 3 nodes and a replication factor of 2. When I distribute a file I could see that replicas of the data are available on the other nodes. However, when I run a map reduce job, again only one node is serving all the requests :(. Can you or anyone please provide some more input?

Thanks
Ashish
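
For anyone trying to narrow this down, a couple of checks show whether the second node is even being offered work. This is only a sketch, assuming a stock 2.2.0 install with default ports; host names in the URLs are placeholders:

  # List the NodeManagers registered with the ResourceManager. If only one
  # node shows up here, all containers will inevitably land on that node.
  yarn node -list

  # Confirm that both DataNodes are live and holding blocks.
  hdfs dfsadmin -report

  # The ResourceManager web UI (default port 8088) shows, per application,
  # which node each container was allocated on.
  #   http://<resourcemanager-host>:8088/cluster

If the second NodeManager is missing from yarn node -list, the problem is registration/configuration rather than scheduling.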


On Wed, Jan 8, 2014 at 7:16 PM, Chris Mawata <chris.mawata@gmail.com> wrote:

2 nodes and replication factor of 2 results in a replica of each block present on each node. This would allow the possibility that a single node would do the work and yet be data local. It will probably happen if that single node has the needed capacity. More nodes than the replication factor are needed to force distribution of the processing.
Chris
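
To see this concretely, the block placement can be inspected with fsck. A minimal sketch; the HDFS path below is a placeholder for the actual input file:

  # Show every block of the input file and which DataNodes hold a replica.
  # With 2 nodes and replication factor 2, each block lists both nodes, so a
  # single node can run every map task and still be data-local.
  hdfs fsck /user/ashish/input.txt -files -blocks -locations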

On Jan 8, 2014 7:35 AM, "Ashish Jain" <ashjain2@gmail.com> wrote:
Guys,

I am sure that only one node is being used. I just now ran the job again and could see that CPU usage goes high on only one server while the other server's CPU usage remains constant, which means the other node is not being used. Can someone help me debug this issue?

++Ashish
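
One way to confirm this from the cluster side rather than from CPU graphs is via the web UIs. A sketch assuming default ports; the application id and host names are placeholders:

  # Find the application id of the job.
  yarn application -list

  # The application page on the ResourceManager UI and the JobHistory server
  # show each map/reduce task attempt and the host it ran on.
  #   http://<resourcemanager-host>:8088/cluster/app/<application-id>
  #   http://<jobhistory-host>:19888/jobhistory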


On Wed, Jan 8, 2014 at 5:04 PM, Ashish Jain <ashjain2@gmail.com> wrote:
Hello All,
I have a 2 node Hadoop cluster running with a replication factor of 2. I have a file of around 1 GB which, when copied to HDFS, is replicated to both nodes. Looking at the block info I can see the file has been split into 8 blocks of 128 MB each. I use this file as input to run the word count program. Somehow I feel only one node is doing all the work and the code is not distributed to the other node. How can I make sure the code is distributed to both nodes? Also, is there a log or GUI which can be used for this?
Please note I am using the latest stable release, which is 2.2.0.

++Ashish
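
For reference, a minimal way to reproduce the experiment and the GUIs that answer the "log or GUI" question. A sketch only; the file paths, $HADOOP_HOME and host names are assumptions based on a default 2.2.0 tarball install:

  # Copy the ~1 GB input into HDFS; with the 2.x default dfs.blocksize of
  # 128 MB this gives the 8 blocks mentioned above.
  hadoop fs -put big-input.txt /user/ashish/big-input.txt

  # Run the bundled word count example against it.
  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
      wordcount /user/ashish/big-input.txt /user/ashish/wc-out

  # Where to look afterwards:
  #   NameNode UI (block/replica placement):    http://<namenode-host>:50070
  #   ResourceManager UI (container placement): http://<resourcemanager-host>:8088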

