hadoop-hdfs-user mailing list archives

From Hemanth Yamijala <yhema...@thoughtworks.com>
Subject Re: Where do/should .jar files live?
Date Wed, 23 Jan 2013 04:39:13 GMT
On top of what Bejoy said, just wanted to add that when you submit a job to
Hadoop using the hadoop jar command, the jars which you reference in the
command on the edge/client node will be picked up by Hadoop and made
available to the cluster nodes where the mappers and reducers run.
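
To make the submission step concrete, here is a rough sketch in Python that only assembles a `hadoop jar` command line on the edge node; it does not run anything. The jar path, driver class, and HDFS paths are made up for illustration. It also assumes the driver parses its arguments through ToolRunner/GenericOptionsParser, since that is what makes the `-libjars` generic option take effect.

```python
import shlex


def build_hadoop_jar_command(job_jar, main_class, job_args, lib_jars=()):
    """Assemble the argv for `hadoop jar`; nothing is executed here."""
    cmd = ["hadoop", "jar", job_jar, main_class]
    if lib_jars:
        # -libjars is a generic option: it is only honoured when the driver
        # goes through ToolRunner/GenericOptionsParser (assumed here).
        cmd += ["-libjars", ",".join(lib_jars)]
    # Application arguments come after the generic options.
    cmd += list(job_args)
    return cmd


cmd = build_hadoop_jar_command(
    "/home/dev/jobs/wordcount.jar",            # hypothetical local path on the edge node
    "com.example.WordCount",                   # hypothetical driver class
    ["/user/dev/input", "/user/dev/output"],   # hypothetical HDFS paths
    lib_jars=["/home/dev/lib/helper.jar"],     # hypothetical extra dependency
)
print(shlex.join(cmd))
```

All the jar paths here refer to the local file system of the edge/client node; the developer does not need to copy them to the cluster nodes by hand.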


On Wed, Jan 23, 2013 at 8:24 AM, <bejoy.hadoop@gmail.com> wrote:

> Hi Chris
> In larger clusters it is better to have an edge/client node where all the
> user jars reside and you trigger your MR jobs from here.
> A client/edge node is a server with hadoop jars and conf but hosting no
> daemons.
> In smaller clusters, one DN might act as the client node, and you can
> execute your jars from there. The risk is that this DN fills up if
> files are copied to HDFS from it (under the block placement policy, one
> replica would always be on this node).
> With Oozie you put your executables into HDFS, but Oozie comes in at the
> integration stage. During initial development, developers put their jars
> on the local file system (LFS) of the client node, then execute and test
> their code from there.
> Regards
> Bejoy KS
> Sent from remote device, Please excuse typos
> ------------------------------
> *From: * Chris Embree <cembree@gmail.com>
> *Date: *Tue, 22 Jan 2013 14:24:40 -0500
> *To: *<user@hadoop.apache.org>
> *ReplyTo: * user@hadoop.apache.org
> *Subject: *Where do/should .jar files live?
> Hi List,
> This should be a simple question, I think.  Disclosure, I am not a java
> developer. ;)
> We're getting ready to build our Dev and Prod clusters. I'm pretty
> comfortable with HDFS and how it sits atop several local file systems on
> multiple servers.  I'm fairly comfortable with the concept of Map/Reduce
> and why it's cool and we want it.
> Now for the question.  Where should my developers put and store their jar
> files?  Or asked another way, what's the best entry point for submitting
> jobs?
> We have separate physical systems for NN, Checkpoint Node (formerly 2nn),
> Job Tracker and Standby NN.  Should I run from the JT node? Do I keep all
> of my finished .jar files on the JT local file system?
> Or should I expect that jobs will be run via Oozie?  Do I put jars on the
> local Oozie FS?
> Thanks in advance.
> Chris
