From: Hemanth Yamijala <hemanty@thoughtworks.com>
Date: Wed, 23 Jan 2013 10:09:13 +0530
To: user@hadoop.apache.org
Subject: Re: Where do/should .jar files live?

On top of what Bejoy said, I just wanted to add that when you submit a job
with the hadoop jar command, the jar you reference in the command on the
edge/client node is picked up by Hadoop and made available to the cluster
nodes where the mappers and reducers run.

Thanks
Hemanth

On Wed, Jan 23, 2013 at 8:24 AM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
> Hi Chris
>
> In larger clusters it is better to have an edge/client node where all the
> user jars reside, and to trigger your MR jobs from there.
>
> A client/edge node is a server with the Hadoop jars and configuration but
> hosting no daemons.
>
> In smaller clusters one DataNode might act as the client node, and you
> can execute your jars from there. The risk is that this DN fills up if
> files are copied into HDFS from it, since per the block placement policy
> one replica always lands on the local node.
>
> With Oozie you put your executables into HDFS, but Oozie comes in at an
> integration level. In the initial development phase, developers put the
> jar on the local file system of the client node, then execute and test
> their code.
>
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ------------------------------
> From: Chris Embree <cembree@gmail.com>
> Date: Tue, 22 Jan 2013 14:24:40 -0500
> To: user@hadoop.apache.org
> Subject: Where do/should .jar files live?
>
> Hi List,
>
> This should be a simple question, I think. Disclosure: I am not a Java
> developer. ;)
>
> We're getting ready to build our Dev and Prod clusters. I'm pretty
> comfortable with HDFS and how it sits atop several local file systems on
> multiple servers, and fairly comfortable with the concept of MapReduce
> and why it's cool and we want it.
>
> Now for the question: where should my developers put and store their jar
> files? Or, asked another way, what's the best entry point for submitting
> jobs?
>
> We have separate physical systems for the NN, Checkpoint Node (formerly
> 2NN), JobTracker, and Standby NN. Should I run from the JT node? Do I
> keep all of my finished jars on the JT local file system?
> Or should I expect that jobs will be run via Oozie? Do I put jars on the
> local Oozie FS?
>
> Thanks in advance.
> Chris
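To make the edge-node workflow above concrete, here is a minimal sketch of
a job submission from a client node. All paths and the class name are
hypothetical examples, not anything from the thread; the point is only that
the jar lives on the client node's local file system while the input and
output paths are in HDFS.

```shell
#!/bin/sh
# Sketch only: assumes a configured client node. Names are illustrative.
JAR=/home/dev/jobs/wordcount.jar     # jar on the client node's local FS
MAIN=org.example.WordCount           # hypothetical driver class in the jar
IN=/user/dev/input                   # HDFS input path
OUT=/user/dev/output                 # HDFS output path (must not yet exist)

# Printed rather than executed here, since this sketch assumes no running
# cluster; on a real client node you would run the command directly.
echo hadoop jar "$JAR" "$MAIN" "$IN" "$OUT"
```

Hadoop takes care of shipping the jar to the cluster for the map and reduce
tasks, so nothing needs to be pre-installed on the worker nodes for this to
run.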