whirr-dev mailing list archives

From "Frank Scholten (JIRA)" <j...@apache.org>
Subject [jira] [Created] (WHIRR-384) Add Mahout as a service
Date Mon, 12 Sep 2011 21:17:09 GMT
Add Mahout as a service

                 Key: WHIRR-384
                 URL: https://issues.apache.org/jira/browse/WHIRR-384
             Project: Whirr
          Issue Type: New Feature
          Components: new service
    Affects Versions: 0.7.0
            Reporter: Frank Scholten
             Fix For: 0.7.0

Here is an initial patch to support Mahout as a Whirr service.

I created the role 'mahout-home', which installs the binary Mahout distribution on a Hadoop
namenode.
By combining this role with the configuration for a Hadoop cluster, you can SSH into the namenode,
su to root and start running Mahout jobs via the mahout script immediately.

The 'mahout-home' role has two properties:

whirr.mahout.version       Mahout version
whirr.mahout.tarball.url   URL of the Mahout binary distribution tarball

Note that I used a snapshot version of Mahout for testing, revision 1169784, because there
were some problems with the mahout script in 0.5 that have been fixed on trunk; see MAHOUT-680.
To test, set whirr.mahout.tarball.url to
http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
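
For illustration, a recipe that combines the 'mahout-home' role with a small Hadoop cluster
might look roughly like the sketch below. Only the role name and the two whirr.mahout.*
properties come from the patch; the cluster name, provider, credentials, node counts and the
exact version string are placeholders assumed for the example.

```properties
# Hypothetical recipe: cluster name, provider and node counts are placeholders.
whirr.cluster-name=mahout-test
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

# 'mahout-home' is added to the namenode's roles so Mahout is installed there.
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker+mahout-home,2 hadoop-datanode+hadoop-tasktracker

# Properties introduced by this patch (the version value is an assumption).
whirr.mahout.version=0.6-SNAPSHOT
whirr.mahout.tarball.url=http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
```

With a recipe like this, launching the cluster and logging in to the namenode should leave
the mahout script ready to use, as described above.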

I used configure actions and the onBeforeConfigure() method. If there is a better way to express
this with the Whirr API, let me know.

Currently I am investigating a 'mahout-jar' role, which installs the Mahout examples job jar
under $HADOOP_HOME/lib on a tasktracker node. I already have some code for putting the jar
in place, but when running a job from my local machine I still get ClassNotFoundExceptions.
I believe this is because Hadoop has already started before the jar is put in the lib dir,
so the jar isn't picked up, but I need to investigate further. From WHIRR-221 I understood
that there is no support (yet?) for ordering of services, but if you have an idea on how to
fix this, let me know.

Comments and suggestions welcome!

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

