hadoop-user mailing list archives

From bejoy.had...@gmail.com
Subject Re: Where do/should .jar files live?
Date Wed, 23 Jan 2013 02:54:11 GMT
Hi Chris

In larger clusters it is better to have an edge/client node where all the user jars reside,
and to trigger your MR jobs from there.

A client/edge node is a server that has the Hadoop jars and configuration installed but hosts no daemons.
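As a sketch, submitting a job from such an edge node usually looks like the following (the
paths, jar name, and main class here are hypothetical, not from the original thread):

```shell
# Stage input data into HDFS from the edge node's local file system
hadoop fs -mkdir -p /user/chris/wordcount/input
hadoop fs -put /home/chris/data/*.txt /user/chris/wordcount/input

# Submit the MR job; the jar stays on the edge node's local FS,
# and the framework ships it to the cluster for you
hadoop jar /home/chris/jars/wordcount.jar com.example.WordCount \
    /user/chris/wordcount/input /user/chris/wordcount/output
```

The key point is that the jar itself never needs to be copied into HDFS for a plain
`hadoop jar` submission; the client handles distributing it to the tasks.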

In smaller clusters one DN might act as the client node, and you can execute your jars from
there. The risk is that this DN's disks fill up if files are copied to HDFS from it, since
the block placement policy always puts one replica on the local node.

With Oozie you put your executables into HDFS, but Oozie comes in at the integration level. In
the initial development phase, developers put the jar on the LFS of the client node, then execute
and test their code from there.
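For the Oozie case, the convention is that the application jar goes into a lib/ directory
under the workflow's HDFS path. A minimal sketch, with hypothetical paths and host names
(the Oozie server URL and application name are assumptions for illustration):

```shell
# Hypothetical HDFS layout for an Oozie workflow application:
#   /user/chris/apps/wordcount/workflow.xml
#   /user/chris/apps/wordcount/lib/wordcount.jar
hadoop fs -mkdir -p /user/chris/apps/wordcount/lib
hadoop fs -put workflow.xml  /user/chris/apps/wordcount/
hadoop fs -put wordcount.jar /user/chris/apps/wordcount/lib/

# Submit the workflow; job.properties points oozie.wf.application.path
# at the HDFS directory above
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
```

Oozie picks up any jars in lib/ automatically and adds them to the action's classpath,
which is why the executables live in HDFS rather than on a client node in that setup.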

Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Chris Embree <cembree@gmail.com>
Date: Tue, 22 Jan 2013 14:24:40 
To: <user@hadoop.apache.org>
Reply-To: user@hadoop.apache.org
Subject: Where do/should .jar files live?

Hi List,

This should be a simple question, I think.  Disclosure, I am not a java
developer. ;)

We're getting ready to build our Dev and Prod clusters. I'm pretty
comfortable with HDFS and how it sits atop several local file systems on
multiple servers.  I'm fairly comfortable with the concept of Map/Reduce
and why it's cool and we want it.

Now for the question.  Where should my developers put and store their jar
files?  Or, asked another way, what's the best entry point for submitting jobs?
We have separate physical systems for NN, Checkpoint Node (formerly 2nn),
Job Tracker and Standby NN.  Should I run from the JT node? Do I keep all
of my finished .jars on the JT local file system?
Or should I expect that jobs will be run via Oozie?  Do I put jars on the
local Oozie FS?

Thanks in advance.
