I was going to wait to announce this until I had more time to optimize and clean it up, but I created an AccumuloDataModel for Mahout (specifically for recommendations). To be honest, it has not been thoroughly tested, and recommendations using it are pretty slow. I have some ideas for speeding it up but haven't had time to implement them.


These should be the basic steps to get it working:

git clone git@github.com:jt6211/mahout.git
cd mahout
git checkout -b accumulo origin/accumulo # check out my branch with the AccumuloDataModel
mvn compile package -DskipTests # tests seem to take forever, feel free not to skip them

Once done, you will want to add integration/target/mahout-integration-0.7-SNAPSHOT.jar to your classpath.
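As a rough sketch of that last step, assuming you are still in the root of the mahout checkout and use the CLASSPATH environment variable (the variable name MAHOUT_JAR and the class com.example.MyRecommender are placeholders of mine, not part of the branch), something like this should work. Your application will most likely also need the Mahout core and Accumulo client jars on the classpath.

```shell
# Run from the root of the mahout checkout after the build finishes.
# MAHOUT_JAR is just a convenience variable for this sketch.
MAHOUT_JAR="$PWD/integration/target/mahout-integration-0.7-SNAPSHOT.jar"

# Append the jar to CLASSPATH; the ${VAR:+...} form avoids a leading ':'
# when CLASSPATH is empty or unset.
export CLASSPATH="${CLASSPATH:+$CLASSPATH:}$MAHOUT_JAR"

# Hypothetical invocation; replace com.example.MyRecommender with your own class.
# java -cp "$CLASSPATH" com.example.MyRecommender
```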

Feedback and pull requests are welcome.


On Tue, Mar 13, 2012 at 12:04 PM, Cardon, Tejay E <tejay.e.cardon@lmco.com> wrote:


I'm looking to use Accumulo as a data source for Mahout. It doesn't appear to be built in, nor does Accumulo appear to include the code, but I'm hoping someone can point me at a blog post or something else that could help. I appreciate whatever help I can get.


