hadoop-yarn-dev mailing list archives

From Chin-Jung Hsu <oxhead.l...@gmail.com>
Subject Re: Maven build YARN ResourceManager only
Date Sat, 13 Apr 2013 18:52:43 GMT
My ideal development workflow is:

1) Do a full build of the whole project
    mvn package -Pdist -DskipTests -Dtar
2) Modify my scheduler implementation
3) Build sub-project 'hadoop-yarn-server-resourcemanager'
    mvn ???
4) Create a distribution tar.gz file
    mvn package -Pdist -DskipTests -Dtar
5) Deploy the above distribution to cluster
6) If the implementation has problems, go back to step 2

My questions are:

1) I don't know the right command for step 3 in my workflow.
2) It seems that step 4 builds the whole Hadoop project again.  Should I use
another command?

I tried to figure this out from BUILDING.txt and the Maven documentation,
but neither helped much.  Am I missing something important there?

Anyway, thank you, Chris.

+oxhead




On Sat, Apr 13, 2013 at 12:40 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:

> No problem!  I think yarn-dev is appropriate, so I'm removing user (bcc'd
> one last time).  The user list is focused on how to use Hadoop, and the
> *-dev lists are focused on how to develop Hadoop.
>
> What specific problem are you seeing when you try to compile
> hadoop-yarn-server-resourcemanager independently?  I'm going to take a
> guess that it can't find classes from its other dependencies in the Hadoop
> source tree.  To handle this, you can run the following from the top of the
> source tree:
>
> mvn clean install -DskipTests
>
> This will build the whole source tree and install the resulting jars into
> your local Maven repository.  Then, subsequent builds of individual
> submodules like hadoop-yarn-server-resourcemanager will link to the jars in
> your local Maven repository during their builds.
>
> When you pull in new changes from upstream, you may need to repeat the
> install.  For example, this would be required if someone added a new method
> in hadoop-yarn-common and changed hadoop-yarn-server-resourcemanager to
> call it.  (General rule of thumb: if your build breaks after pulling in new
> changes, try a fresh mvn clean install to see if that fixes it.)
>
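The install-once, then build-one-module approach described above could be sketched roughly as follows. This is only a sketch under assumptions: the checkout location `~/git/hadoop-common`, the helper names `full_install` and `build_rm`, and the module path (inferred from the usual 2.0.x source layout) are illustrative and should be checked against your own tree.

```shell
#!/bin/sh
# Assumed checkout location; override HADOOP_REPO to match your own tree.
HADOOP_REPO="${HADOOP_REPO:-$HOME/git/hadoop-common}"

# Assumed path of the ResourceManager module relative to the source root.
RM_MODULE="hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager"

# Hypothetical helper: build everything once and install the jars into the
# local Maven repository. Repeat after pulling in upstream changes.
full_install() {
    (cd "$HADOOP_REPO" && mvn clean install -DskipTests)
}

# Hypothetical helper: rebuild only the ResourceManager module. Its Hadoop
# dependencies are resolved from the jars installed by full_install.
build_rm() {
    (cd "$HADOOP_REPO/$RM_MODULE" && mvn package -DskipTests)
}
```

Sourcing this file defines the two helpers without running Maven; `full_install` would be run once, and `build_rm` after each scheduler change.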
> You may want to read the file BUILDING.txt in the root of the source tree,
> especially the section titled "Building components separately".  That file
> contains the same information and a lot of other helpful build tips.
>
> --Chris
>
>
> On Sat, Apr 13, 2013 at 7:10 AM, Chin-Jung Hsu <oxhead.list@gmail.com> wrote:
>
> > Hi Chris,
> >
> > Appreciate your help, and sorry for the crossposting on both 'user' and
> > 'yarn-dev'.  I first posted on yarn-dev and didn't see anything.  I then
> > thought I might not be able to post on that list.  That's why I posted it
> > again on 'user'.  Should I post this kind of question on 'yarn-dev' or
> > 'user'?
> >
> > Right now, I cannot compile hadoop-yarn-server-resourcemanager
> > independently.  I have to use
> >
> > mvn package -Pdist -DskipTests -Dtar
> >
> > to compile the whole project.  What command is appropriate for this
> > scenario?
> >
> >
> > Thanks,
> > oxhead
> >
> >
> >
> >
> > On Sat, Apr 13, 2013 at 12:41 AM, Chris Nauroth <cnauroth@hortonworks.com> wrote:
> >
> >> I don't have an answer to your exact question, but I do have a different
> >> suggestion that prevents the need to do frequent rebuilds of the whole
> >> Hadoop source tree.  First, do a full build of the distribution tar.gz.
> >> Extract it and set up a custom hadoop-env.sh for yourself.  Inside the
> >> hadoop-env.sh file, export the environment variables
> >> HADOOP_USER_CLASSPATH_FIRST=true and HADOOP_CLASSPATH= any classpath that
> >> you want to prepend before the classes loaded from the distribution.  For
> >> example, this is what I have in mine right now, because I'm mostly working
> >> on HDFS and NodeManager:
> >>
> >> export HADOOP_USER_CLASSPATH_FIRST=true
> >> HADOOP_REPO=~/git/hadoop-common
> >> export HADOOP_CLASSPATH=$HADOOP_REPO/hadoop-common-project/hadoop-common/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs/target/classes:$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/classes:$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/classes
> >>
> >> For your ResourceManager work, you could set up your HADOOP_CLASSPATH to
> >> point at your hadoop-yarn-server-resourcemanager/target/classes directory.
> >> Then, source (.) this hadoop-env.sh in any shell that you're using to run
> >> hadoop commands.  The daemons will print their full classpath before
> >> launching, so you can check that to see if it worked.
> >>
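For the ResourceManager case, the hadoop-env.sh fragment suggested above might look something like the sketch below. The variable `RM_CLASSES` is a hypothetical name, and the module path is an assumption made by analogy with the NodeManager entry in the example above, so verify it against your checkout.

```shell
# Sketch of a custom hadoop-env.sh fragment for ResourceManager work.

# Ask the Hadoop scripts to put HADOOP_CLASSPATH ahead of the distribution jars.
export HADOOP_USER_CLASSPATH_FIRST=true

# Assumed checkout location, as in the example above; override if yours differs.
HADOOP_REPO="${HADOOP_REPO:-$HOME/git/hadoop-common}"

# Assumed location of the freshly compiled ResourceManager classes
# (sibling of the hadoop-yarn-server-nodemanager module).
RM_CLASSES="$HADOOP_REPO/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/classes"

export HADOOP_CLASSPATH="$RM_CLASSES"
```

After sourcing this file, the classpath printed by the daemon at startup should list the `target/classes` directory before the distribution jars if the override took effect.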
> >> With all of this in place, you can keep recompiling just
> >> hadoop-yarn-server-resourcemanager whenever you make changes instead of the
> >> whole hadoop-common tree.  Does this help?
> >>
> >> Thanks,
> >> --Chris
> >>
> >>
> >> On Fri, Apr 12, 2013 at 8:46 PM, Chin-Jung Hsu <oxhead.list@gmail.com> wrote:
> >>
> >>> I am implementing my own YARN scheduler under 2.0.3-alpha.  Is it
> >>> possible to build only the ResourceManager project, and then create a
> >>> distribution tar.gz for the entire Hadoop project?  Right now, a full
> >>> build takes me about 9 minutes.
> >>>
> >>> Thanks,
> >>> oxhead
> >>>
> >>
> >>
> >
>
