hadoop-user mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: built hadoop! please help with next steps?
Date Fri, 31 May 2013 22:22:44 GMT
I've had success importing all the leaf-level maven projects as
"Existing Maven Projects" using the eclipse maven plugin.  I've also gotten
things to work without the eclipse maven plugin with some combination of
mvn eclipse:eclipse, pointing to the m2 repo, and using the directory with
the top-level pom.xml as my eclipse workspace directory.
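Roughly, that eclipse:eclipse route looks something like this (a sketch from
memory; the download flags are optional and the exact invocation may need
adjusting for your checkout):

  mvn install -DskipTests
  mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true

After that, defining an M2_REPO classpath variable in Eclipse that points at
your local ~/.m2/repository lets the generated .classpath entries resolve.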


On Fri, May 31, 2013 at 3:18 PM, John Lilley <john.lilley@redpoint.net> wrote:

>  Sandy,
>
> Thanks for all of the tips, I will try this over the weekend.  Regarding
> the last question, I am still trying to get the source loaded into Eclipse
> in a manner that facilitates easier browsing, symbol search, editing, etc.
> Perhaps I am just missing some obvious FAQ?  This is leading up to
> modifying and debugging the “shell” ApplicationMaster sample.  This page:
>
> http://stackoverflow.com/questions/11007423/developing-testing-and-debugging-hadoop-map-reduce-jobs-with-eclipse
>
> looks promising as a Hadoop-in-Eclipse strategy, but it is over a year old
> and I’m not sure if it applies to Hadoop 2.0 and YARN.
>
> John
>
> *From:* Sandy Ryza [mailto:sandy.ryza@cloudera.com]
> *Sent:* Friday, May 31, 2013 12:13 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: built hadoop! please help with next steps?
>
> Hi John,
>
> Here's how I deploy/debug Hadoop locally:
>
> To build and tar Hadoop:
>
>   mvn clean package -Pdist -Dtar -DskipTests=true
>
> The tar will be located in the project directory under
> hadoop-dist/target/.  I untar it into my deploy directory.
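>
> (A sketch of that untar step; the tarball name and ~/hadoop-deploy path
> below are just examples, so substitute whatever your build actually
> produced:)
>
>   mkdir -p ~/hadoop-deploy
>   tar xzf hadoop-dist/target/hadoop-2.0.5-alpha.tar.gz -C ~/hadoop-deploy
>   cd ~/hadoop-deploy/hadoop-2.0.5-alpha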
>
> I then copy these scripts into the same directory:
>
> hadoop-dev-env.sh:
> ---
> #!/bin/bash
> export HADOOP_DEV_HOME=`pwd`
> export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
> export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
> export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
> export YARN_HOME=${HADOOP_DEV_HOME}
> export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
>
> hadoop-dev-setup.sh:
> ---
> #!/bin/bash
> source ./hadoop-dev-env.sh
> bin/hadoop namenode -format
>
> hadoop-dev.sh:
> ---
> #!/bin/bash
> # $1 is the action passed to each daemon script: start or stop
> source ./hadoop-dev-env.sh
> sbin/hadoop-daemon.sh $1 namenode
> sbin/hadoop-daemon.sh $1 datanode
> sbin/yarn-daemon.sh $1 resourcemanager
> sbin/yarn-daemon.sh $1 nodemanager
> sbin/mr-jobhistory-daemon.sh $1 historyserver
> sbin/httpfs.sh $1
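>
> (Small practical note, assuming you paste these in by hand: they are meant
> to be run from the deploy directory itself, and the two runnable ones need
> the execute bit:)
>
>   chmod +x hadoop-dev-setup.sh hadoop-dev.sh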
>
> I copy all the files in <deploy directory>/conf into my conf directory,
> <deploy directory>/etc/hadoop.  The advantage of using a directory that's
> not the /conf directory is that it won't be overwritten the next time you
> untar a new build.  Lastly, I copy the minimal site configuration into the
> conf files.  For the sake of brevity, I won't include the properties in
> full xml format, but here are the ones I set:
>
> yarn-site.xml:
>   yarn.nodemanager.aux-services = mapreduce.shuffle
>   yarn.nodemanager.aux-services.mapreduce.shuffle.class
>     = org.apache.hadoop.mapred.ShuffleHandler
>   yarn.resourcemanager.scheduler.class
>     = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
>
> mapred-site.xml:
>   mapreduce.framework.name = yarn
>
> core-site.xml:
>   fs.default.name = hdfs://localhost:9000
>
> hdfs-site.xml:
>   dfs.replication = 1
>   dfs.permissions = false
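>
> (If it helps to see one written out, here is roughly what the core-site.xml
> entry above looks like in full xml form, as a shell heredoc run from the
> deploy directory; this is just a sketch, and the other files follow the
> same property/value pattern:)
>
> cat > etc/hadoop/core-site.xml <<'EOF'
> <?xml version="1.0"?>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://localhost:9000</value>
>   </property>
> </configuration>
> EOF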
>
> Then, to format HDFS and start our cluster, we can simply do:
>
>   ./hadoop-dev-setup.sh
>   ./hadoop-dev.sh start
>
> To stop it:
>
>   ./hadoop-dev.sh stop
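>
> (A quick sanity check after starting; jps and the web UI ports below are
> the stock Hadoop 2 defaults, so adjust if yours differ:)
>
>   jps    # expect NameNode, DataNode, ResourceManager, NodeManager, JobHistoryServer
>   curl -s -o /dev/null -w "NN UI: %{http_code}\n" http://localhost:50070/
>   curl -s -o /dev/null -w "RM UI: %{http_code}\n" http://localhost:8088/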
>
> Once I have this set up, for quicker iteration, I have some scripts that
> build submodules (sometimes all of mapreduce, sometimes just the
> resourcemanager) and copy the updated jars into my setup, roughly along
> the lines of the sketch below.
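>
> (A rough sketch of that kind of rebuild script; the module path, jar glob,
> and share/hadoop/yarn destination are from memory and may differ in your
> version, and it assumes HADOOP_DEV_HOME is exported to point at the deploy
> directory as in hadoop-dev-env.sh:)
>
> #!/bin/bash
> # Rebuild just the resourcemanager and drop the fresh jar into the deploy dir.
> cd hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
> mvn package -DskipTests
> cp target/hadoop-yarn-server-resourcemanager-*.jar \
>    ${HADOOP_DEV_HOME}/share/hadoop/yarn/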
>
> Regarding your last question, are you saying that you were able to load it
> into Eclipse already, and want tips on the best way to browse within it?
> Or that you're trying to get the source loaded into Eclipse?
>
> Hope that helps!
>
> Sandy
>
> On Thu, May 30, 2013 at 9:32 AM, John Lilley <john.lilley@redpoint.net>
> wrote:
>
> Thanks for helping me to build Hadoop!  I’m through compile and install of
> the maven plugins into Eclipse.  I could use some pointers for the next
> steps I want to take, which are:
>
> ·         Deploy the simplest “development only” cluster (single node?)
> and learn how to debug within it.  I read about the “local runner”
> configuration here (
> http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms), does that
> still apply to MR2/YARN?  It seems like an old page; perhaps there is a
> newer FAQ?
>
> ·         Build and run the ApplicationMaster “shell” sample, and use
> that as a starting point for a custom AM.  I would much appreciate any
> advice on getting the edit/build/debug cycle ironed out for an AM.
>
> ·         Set up the Hadoop source for easier browsing and learning
> (Eclipse load?).  What is typically done to make for easy browsing of
> referenced classes/methods by name?
>
> Thanks
>
> John
