hadoop-general mailing list archives

From Konstantin Boudnik <...@apache.org>
Subject Re: Build/test infrastructure
Date Sun, 27 Feb 2011 03:10:38 GMT
On Sat, Feb 26, 2011 at 05:38PM, Eric Yang wrote:
> On 2/26/11 4:34 PM, "Konstantin Boudnik" <cos@apache.org> wrote:
> 
> > Apparently you are talking about something else, but I will bite...
> > 
> > On Sat, Feb 26, 2011 at 04:03PM, Eric Yang wrote:
> >> The proposed test automation process hasn't been thought through.  Apache
> >> Hudson has been set up to trigger patch builds and to set up a pre-commit
> >> test environment.  Unfortunately, the current setup needs refinement, with
> >> a proper source code setup, to make the builds work again.  Ideally, the
> >> test cycle should have a commit build which runs simple unit tests, and a
> >> secondary build (every 24 hours) to run more thorough tests on a
> >> multi-machine setup.  The test cluster should be cleansed after every
> >> secondary build, and ideally
> > 
> > We don't have a test cluster for Apache Hadoop validation. All I am focusing
> > on is build and patch validation infrastructure.
> 
> If the plan is to use a puppet agent without a puppet master to configure
> the system locally for testing patch builds, it is probably the wrong tool
> for the job.  The value of puppet is being able to configure heterogeneous
> services across machines in a consistent manner.  Is there a plan to deploy

This is simply not the only value of the tool. It allows OS configurations
and system package installations to be maintained as easily for 1 host as
for 1000 of them. Here's one of many examples:
    http://hstack.org/hstack-automated-deployment-using-puppet/
BTW, Puppet and Chef recipes are very widely used by all sorts of Ops and
cluster-management companies. Perhaps Maven and shell are too - I'm not in
a position to make a judgement call. I'll let Y! Grid Ops comment on it -
they know everything about configuration management of sizable clusters and
the tools for the job.
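
To make it concrete, here is a minimal sketch of the kind of recipe I have
in mind (the package and file names below are purely illustrative, not a
proposal for the actual manifests):

    # build-host.pp -- the same manifest configures 1 host or 1000 of them.
    # Applied locally with "puppet apply build-host.pp" - no master needed.
    package { ['ant', 'subversion']:
      ensure => installed,
    }

    # A config file every build host gets, byte-for-byte identical.
    file { '/etc/build-env.conf':
      ensure  => file,
      mode    => '644',
      content => "JAVA_HOME=/usr/lib/jvm/java-6-sun\n",
    }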

> multiple services across machines?  If the purpose is to use puppet for
> config templates, ant or maven can do the job equally well.

Are you suggesting that it is easier to install patch and gcc packages of
version X.X.Z from a Maven build than from Puppet or Chef? If so - please
cut such a patch for the community to review. That'd be a great
contribution!
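
For the record, pinning those versions in Puppet is a one-liner per package
(X.X.Z stays a placeholder here, just as above):

    # Pin exact package versions; Puppet delegates to the native package
    # manager (yum, apt, ...) to install or downgrade as needed.
    package { 'gcc':
      ensure => 'X.X.Z',
    }
    package { 'patch':
      ensure => 'X.X.Z',
    }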

Furthermore, my Puppet knowledge is very limited and I am for sure no expert
in Maven. I have some concerns, however:
  - how to provide privileged access
  - how and where to store host configurations (i.e. package names and
    versions, which are going to be different for different OSes - see the
    sketch below)
  - how to do native packages (see the example above) and native dependency
    management from Maven - with shell scripting?
  - how to maintain such a construct?
I can continue for a long time, but I'd rather solve the issue of managing
build host configurations/package sets in the most efficient and
sustainable manner.
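
E.g. for the second point, here is a sketch of how Puppet's facts let one
recipe cover different OSes (the distro and package names are illustrative
only):

    # Package names differ across OSes; select on the $operatingsystem fact.
    $jdk = $operatingsystem ? {
      'CentOS' => 'java-1.6.0-openjdk-devel',
      'Ubuntu' => 'openjdk-6-jdk',
      default  => 'openjdk-6-jdk',
    }

    package { $jdk:
      ensure => installed,
    }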

In a properly designed CI system the build shouldn't be responsible for
configuring its operating environment. It might, and should, check that
everything is in place (and crash/report accordingly). But if my Ant script
goes around downloading, installing and, God forbid, compiling some chunks
of my OS, I will soon end up with the elegance of Python or some such.

> > Doing deployment from a build system is certainly possible, but it is
> > suboptimal because it pollutes the build with HW/OS details, deployment
> > scripts and such. Besides, last time I checked, Hadoop was built with Ant.
> 
> Deploying to a remote machine can be as simple as scp'ing a tarball,
> extracting it, applying a template, and running it.  None of this requires
> puppet.  Instead of the ant + puppet combination, the patch test build
> structure could be simplified by using maven + shell scripts.

Sorry, but Maven + shell scripts can be called a simplification only in a
pipe dream ;) Maven is a build tool. A relatively good one perhaps, but just
a build tool. Certainly everything can be done with a combination of shell
scripting + tarballs and a little SSH sugar on top. But I'd rather use a
carefully designed and supported tool (Puppet, Chef, etc.).

And BTW - Hadoop builds aren't mavenized yet, which renders most of this
argument a waste of time until that problem is solved.

At any rate, HADOOP-7157 is the JIRA for this. Please comment on it.

Cos

> Regards,
> Eric
> 
> >> You don't need to set up a puppet master in order to bounce a node. Puppet
> >> works in client-only mode just as well.
> > 
> > Cos
> > 
> 
> >> packaging only, but express my opinions on improving the build system and
> >> making the system easier to reproduce.
> >> 
> >> Regards,
> >> Eric
> >> 
> >> On 2/26/11 2:18 PM, "Konstantin Boudnik" <cos@apache.org> wrote:
> >> 
> >> This discussion isn't about building the product, nor about packaging
> >> it. We are discussing patch validation and snapshot build
> >> infrastructure.
> >> 
> >> On Sat, Feb 26, 2011 at 12:43, Eric Yang <eyang@yahoo-inc.com> wrote:
> >>> We should be very careful about the approach we choose for
> >>> build/packaging.  The current state of Hadoop is tightly coupled due to
> >>> the lack of a standardized RPC format.  Once this issue is cleared, the
> >>> community will want to split hdfs and m/r into separate projects at some
> >>> point.  It may be better to ensure the project is modularized and to
> >>> work from the same svn repository.  Maven is great for doing this, and
> >>> most of the build and scripts can be defined in pom.xml.  Deployment/test
> >>> server configuration can be passed in from Hudson.  We should ensure that
> >>> build and deployment scripts do not further couple the project.
> >>> 
> >>> Regards,
> >>> Eric
> >>> 
> >>> On 2/26/11 11:14 AM, "Konstantin Boudnik" <cos@apache.org> wrote:
> >>> 
> >>> On Fri, Feb 25, 2011 at 23:47, Nigel Daley <ndaley@mac.com> wrote:
> >>>> +1.
> >>>> 
> >>>> Once HADOOP-7106 is committed, I'd like to propose we create a directory
> >>>> at the same level as common/hdfs/mapreduce to hold build (and deploy)
> >>>> type scripts and files.  These would then get branched/tagged with the
> >>>> rest of the release.
> >>> 
> >>> That makes sense, although I don't expect host configuration changes
> >>> to happen very often.
> >>> 
> >>> Cos
> >>> 
> >>>> Nige
> >>>> 
> >>>> On Feb 25, 2011, at 7:55 PM, Konstantin Boudnik wrote:
> >>>> 
> >>>>> Looking at re-occurring build/test-patch problems on Hadoop build
> >>>>> machines, I thought of a way to make them:
> >>>>>  a) all the same (configuration- and installed-software-wise)
> >>>>>  b) have an effortless system to run upgrades/updates on all of them
> >>>>>     in a controlled fashion.
> >>>>> 
> >>>>> I would suggest creating Puppet configs (the exact content to be
> >>>>> defined) which will be checked into SCM (e.g. SVN). Whenever a build
> >>>>> host's software needs to be restored/updated, a simple run of Puppet
> >>>>> across the machines, or a change in config and a run of Puppet, will
> >>>>> do the magic for us.
> >>>>> 
> >>>>> If there are no objections from the community I can put together some
> >>>>> Puppet recipes, which might be evolved as we go.
> >>>>> 
> >>>>> --
> >>>>> Take care,
> >>>>>       Cos
> >>>>> 2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
> >>>>> 
> >>>>> After all, it is only the mediocre who are always at their best.
> >>>>>                Jean Giraudoux
> >>>> 
> >>>> 
> >>> 
> >>> 
> >> 
> 
