hadoop-general mailing list archives

From Konstantin Boudnik <...@apache.org>
Subject Re: bringing the codebases back in line
Date Fri, 22 Oct 2010 00:10:57 GMT
On Thu, Oct 21, 2010 at 05:53PM, Ian Holsman wrote:
> In discussing it with people, I've heard that a major issue (not the only
> one i'm sure) is lack of resources to actually test the apache releases on
> large clusters, and that it is very hard getting this done in short cycles
> (hence the large gap between 20.x and 21).

I do agree that the lack of resources for testing Hadoop is a problem. However,
we might mean slightly different things by the word 'resources' ;)

The only way, IMO, to get reasonable testing done on a system as complex as
Hadoop is to invest in automatic validation of builds at the system level. This
requires a few things (resources, if you will):
  - extra hardware (the easiest and cheapest problem)
  - automatic deployment, testing, and analysis
  - development of system tests that can control and observe a cluster's
    behavior (in other words, something more sophisticated than just shell
    scripts)

And for semi-adequate system testing you don't need a large cluster: 10-20
nodes will be sufficient in most cases. But automating all the processes,
starting from deployment, is the key. Test automation is in slightly better
shape, since Hadoop has a system test framework called Herriot (part of the
Hadoop code base for about 7 months now), but it still needs further
extension.

Hopefully this gives you a brief picture of the cluster testing side of the issue.
  Cos

> So I thought I would start the thread to see if we could at least identify
> what people think the problems are.
> 
> 
> On Thu, Oct 21, 2010 at 3:30 PM, Allen Wittenauer
> <awittenauer@linkedin.com> wrote:
> 
> >
> > On Oct 21, 2010, at 12:13 PM, Ian Holsman wrote:
> >
> > > Hi guys.
> > >
> > > I wanted to start a conversation about how we could merge the
> > > Cloudera + Yahoo distributions of Hadoop into our codebase,
> > > and what would be required.
> >
> >
> > *grabs popcorn*
> >
> >
