hadoop-general mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: Build/test infrastructure
Date Sun, 27 Feb 2011 09:32:27 GMT
On 2/26/11 7:10 PM, "Konstantin Boudnik" <cos@apache.org> wrote:

> On Sat, Feb 26, 2011 at 05:38PM, Eric Yang wrote:
> Furthermore, my Puppet knowledge is very limited and I am for sure no expert
> in maven. I have some concerns, however:
>   - how to provide privileged access
>   - how and where to store host configurations (i.e. package names, versions,
>     which are gonna be different for different OSes)
>   - how to do native packages (see above example) and native dependency
>     management from maven? With shell scripting?
>   - how to maintain such a construct?
> I can continue for a long time, but I'd rather solve the issue of managing
> build host configurations/package sets in the most efficient and
> sustainable manner.

Hudson already supports chroot jail environments.  It is easy to set up
privileged access in the jailed environment by giving the user that runs
Hudson sudo access into the jail.  The host configuration can be mirrored
into the chroot environment with a minimal set of shell commands.
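As a rough sketch of that idea (the jail path, package tool, and sudoers line are hypothetical, assuming a Debian-style host with debootstrap available):

```shell
# Build a minimal chroot for builds (hypothetical path /srv/build-jail).
sudo debootstrap stable /srv/build-jail

# Mirror the bits of host configuration the builds need.
sudo cp /etc/resolv.conf /srv/build-jail/etc/resolv.conf
sudo cp /etc/hosts      /srv/build-jail/etc/hosts

# Give the hudson user passwordless sudo for entering this jail and
# nothing else -- one line added to /etc/sudoers via visudo:
#   hudson ALL=(root) NOPASSWD: /usr/sbin/chroot /srv/build-jail
# The hudson user can then run build steps inside the jail, e.g.:
sudo /usr/sbin/chroot /srv/build-jail /bin/sh -c 'uname -a'
```

The point being that the jail gives the build root-like freedom inside a throwaway environment without granting the hudson user general root on the host.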

> Sorry, but Maven + shell script can be called simplification only in a pipe
> dream ;) Maven is a build tool. A relatively good one perhaps, but just a
> build tool. Certainly everything can be done with a combination of a shell
> scripting + tar balls and a little SSH sugar topping. But I'd rather use an
> accurately designed and supported tool (Puppet, Chef, etc.).

Maven supports various kinds of remote deployment plugins.  The Exec plugin
with a shell script is the easiest one to implement.  There are also plugins
like Cargo for more complex container deployment.  There is a plan to write a
deployment framework for hadoop for large scale deployment.  This project is
in the planning stage.  The scope is deploying the entire hadoop stack (hdfs,
mr, zookeeper, hbase, pig, hive, and chukwa) to multiple large clusters.
It is similar to what you are planning, except at a scale where it would make
sense to use puppet+mcollective.  We have done the evaluation and found that a
single puppet master does not scale well beyond 1800 nodes, and a multi-layer
puppet master spanning tree to cover all our nodes is not ideal.  We chose to
use chef-solo for edge deployment.  The rest of the details are still to be
worked out.  This is the reason that I am interested in the test environment
that is being planned here.  It may be possible to use the "to be invented"
framework in hudson.  That system is not going to grow inside the ant/maven
build scripts, hence it is better to keep the build system simple for now.
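For reference, wiring a deployment shell script into a Maven build via the
Exec plugin can look roughly like this (the script name, property, and phase
binding are placeholders for illustration, not taken from an actual Hadoop
pom):

```xml
<!-- In pom.xml: run a deploy script from the build, with the target
     host list passed in from Hudson as -Ddeploy.hosts=... -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>remote-deploy</id>
      <phase>deploy</phase>
      <goals><goal>exec</goal></goals>
      <configuration>
        <executable>${basedir}/bin/deploy.sh</executable>
        <arguments>
          <argument>${deploy.hosts}</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```

All the real deployment logic then lives in the shell script, with Maven only
supplying the lifecycle hook and the Hudson-provided configuration.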


> And BTW - Hadoop builds aren't maven'ized yet. Which renders most of the
> argument a time waste until that problem is solved.
> At any rate, HADOOP-7157 is the JIRA for this. Please comment on it.
> Cos
>> Regards,
>> Eric
>>> You don't need to set up a Puppet master in order to bounce a node. Puppet
>>> works in a client-only mode just as well.
>>> Cos
>>>> packaging only, but express my opinions on improving the build system and
>>>> making the system easier to reproduce.
>>>> Regards,
>>>> Eric
>>>> On 2/26/11 2:18 PM, "Konstantin Boudnik" <cos@apache.org> wrote:
>>>> This discussion isn't about build of the product nor about packaging
>>>> of it. We are discussing patch validation and snapshot build
>>>> infrastructure.
>>>> On Sat, Feb 26, 2011 at 12:43, Eric Yang <eyang@yahoo-inc.com> wrote:
>>>>> We should be very careful about the approach that we choose for
>>>>> build/packaging.  The current state of hadoop is coupled together due to
>>>>> the lack of a standardized RPC format.  Once this issue is cleared, the
>>>>> community will want to split hdfs and m/r into separate projects at some
>>>>> point.  It may be better to ensure the project is modularized, and works
>>>>> from the same svn repository.  Maven is great for doing this, and most of
>>>>> the build and scripts can be defined in pom.xml.  Deployment/test server
>>>>> configuration can be passed in from hudson.  We should ensure that the
>>>>> build deployment scripts do not further couple the project.
>>>>> Regards,
>>>>> Eric
>>>>> On 2/26/11 11:14 AM, "Konstantin Boudnik" <cos@apache.org> wrote:
>>>>> On Fri, Feb 25, 2011 at 23:47, Nigel Daley <ndaley@mac.com> wrote:
>>>>>> +1.
>>>>>> Once HADOOP-7106 is committed, I'd like to propose we create a directory
>>>>>> at the same level as common/hdfs/mapreduce to hold build (and deploy)
>>>>>> scripts and files.  These would then get branched/tagged with the rest
>>>>>> of the release.
>>>>> That makes sense, although I don't expect the host configurations to
>>>>> change very often.
>>>>> Cos
>>>>>> Nige
>>>>>> On Feb 25, 2011, at 7:55 PM, Konstantin Boudnik wrote:
>>>>>>> Looking at the re-occurring build/test-patch problems on the hadoop?
>>>>>>> machines I thought of a way to make them:
>>>>>>>  a) all the same (configuration- and installed-software-wise)
>>>>>>>  b) have an effortless system to run upgrades/updates on all of them
>>>>>>>     in a controlled fashion.
>>>>>>> I would suggest creating Puppet configs (the exact content to be
>>>>>>> defined) which will be checked into SCM (e.g. SVN).  Whenever a build
>>>>>>> host's software needs to be restored/updated, a simple run of Puppet
>>>>>>> across the machines, or a change in the config and a run of Puppet,
>>>>>>> will do the magic for us.
>>>>>>> If there are no objections from the community I can put together
>>>>>>> Puppet recipes which might be evolved as we go.
>>>>>>> --
>>>>>>> Take care,
>>>>>>>       Cos
>>>>>>> 2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
>>>>>>> After all, it is only the mediocre who are always at their best.
>>>>>>>                Jean Giraudoux
