From: John Lilley <john.lilley@redpoint.net>
To: user@hadoop.apache.org
Date: Fri, 31 May 2013 22:18:57 +0000
Subject: RE: built hadoop! please help with next steps?

Sandy,

Thanks for all of the tips; I will try this over the weekend. Regarding the last question, I am still trying to get the source loaded into Eclipse in a manner that facilitates easier browsing, symbol search, editing, etc. Perhaps I am just missing some obvious FAQ? This is leading up to modifying and debugging the "shell" ApplicationMaster sample. This page:

http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse

looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old and I'm not sure whether it applies to Hadoop 2.0 and YARN.

John

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Friday, May 31, 2013 12:13 PM
To: user@hadoop.apache.org
Subject: Re: built hadoop! please help with next steps?

Hi John,

Here's how I deploy/debug Hadoop locally.

To build and tar Hadoop:

  mvn clean package -Pdist -Dtar -DskipTests=true

The tar will be located in the project directory under hadoop-dist/target/. I untar it into my deploy directory.
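[Editor's sketch, not Sandy's exact workflow: the untar step can be scripted as below. A throwaway archive stands in for the real hadoop-dist/target tarball so the script runs anywhere, and all paths here are assumptions.]

```shell
#!/bin/sh
# Self-contained sketch of the untar-into-a-deploy-directory step.
# A throwaway archive stands in for hadoop-dist/target/hadoop-<version>.tar.gz.
set -e
work=$(mktemp -d)

# Stand-in for the mvn build output: a tarball with a versioned top-level directory.
mkdir -p "$work/src/hadoop-2.0.4/bin"
printf '#!/bin/sh\n' > "$work/src/hadoop-2.0.4/bin/hadoop"
tar -czf "$work/hadoop-2.0.4.tar.gz" -C "$work/src" hadoop-2.0.4

# The deploy step itself: --strip-components=1 drops the versioned top
# directory, so untarring a later build lands in the same deploy directory.
mkdir -p "$work/deploy"
tar -xzf "$work"/hadoop-*.tar.gz -C "$work/deploy" --strip-components=1

ls "$work/deploy/bin"
rm -rf "$work"
```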
I then copy these scripts into the same directory:

hadoop-dev-env.sh:
---
#!/bin/bash
export HADOOP_DEV_HOME=`pwd`
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop

hadoop-dev-setup.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
bin/hadoop namenode -format

hadoop-dev.sh:
---
#!/bin/bash
source ./hadoop-dev-env.sh
sbin/hadoop-daemon.sh $1 namenode
sbin/hadoop-daemon.sh $1 datanode
sbin/yarn-daemon.sh $1 resourcemanager
sbin/yarn-daemon.sh $1 nodemanager
sbin/mr-jobhistory-daemon.sh $1 historyserver
sbin/httpfs.sh $1

I copy all the files in <deploy directory>/conf into my conf directory, <deploy directory>/etc/hadoop. The advantage of using a directory that's not the /conf directory is that it won't be overwritten the next time you untar a new build. Lastly, I copy the minimal site configuration into the conf files. For the sake of brevity, I won't include the properties in full XML format, but here are the ones I set:

yarn-site.xml:
  yarn.nodemanager.aux-services = mapreduce.shuffle
  yarn.nodemanager.aux-services.mapreduce.shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
  yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler

mapred-site.xml:
  mapreduce.framework.name = yarn

core-site.xml:
  fs.default.name = hdfs://localhost:9000

hdfs-site.xml:
  dfs.replication = 1
  dfs.permissions = false

Then, to format HDFS and start the cluster, we can simply do:

  ./hadoop-dev-setup.sh
  ./hadoop-dev.sh start

To stop it:

  ./hadoop-dev.sh stop

Once I have this set up, for quicker iteration, I have some scripts that build submodules (sometimes all of mapreduce, sometimes just the resourcemanager) and copy the updated jars into my setup.
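[Editor's note: Sandy skips the full XML for brevity. For completeness, each key = value pair he lists goes into its *-site.xml file in the standard Hadoop configuration format. A sketch for core-site.xml, using the value from his message:]

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```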
Regarding your last question, are you saying that you were able to load it into Eclipse already and want tips on the best way to browse within it? Or that you're trying to get the source loaded into Eclipse?

Hope that helps!
Sandy

On Thu, May 30, 2013 at 9:32 AM, John Lilley <john.lilley@redpoint.net> wrote:

Thanks for helping me to build Hadoop! I'm through compile and install of maven plugins into Eclipse. I could use some pointers for the next steps I want to take, which are:

* Deploy the simplest "development only" cluster (single node?) and learn how to debug within it. I read about the "local runner" configuration here (http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms); does that still apply to MR2/YARN? It seems like an old page; perhaps there is a newer FAQ?

* Build and run the ApplicationMaster "shell" sample, and use that as a starting point for a custom AM. I would much appreciate any advice on getting the edit/build/debug cycle ironed out for an AM.

* Set up the Hadoop source for easier browsing and learning (Eclipse load?). What is typically done to make for easy browsing of referenced classes/methods by name?

Thanks
John