mahout-user mailing list archives

From Jinyuan Zhou <>
Subject Re: Is is easy to switch from standalone to hadoop cluster for mahout recommender?
Date Mon, 26 Dec 2011 05:49:31 GMT
Merry Christmas, Sean,
Thanks for such a fast response on this day.
I think I misunderstood the description on Mahout's home page.
As for the "level of integration" I mentioned, here is what I am looking
for: I don't need to know Hadoop when I build a recommender via the Mahout
API, but when I run the recommender, I can ask Mahout to run it on a Hadoop
cluster. But as you said, the Hadoop-based and non-Hadoop-based recommenders
are different, so my question no longer makes sense.

I am trying to bring Mahout to my workplace, and I am actually reading
your book, Mahout in Action. I think the support for different kinds of user
similarity, as well as the support for evaluating a recommender, will really
save us a lot of time. I understand that for an item-based recommender,
one can pre-compute, say, the co-occurrence matrix through Hadoop.
I am still looking for a way to build a Hadoop-based recommender without
writing mappers or reducers. I mean, I want to be able to write code
almost as simple as the following:

    DataModel model = new FileDataModel(new File("myrating.csv"));

    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    UserNeighborhood neighborhood =
      new NearestNUserNeighborhood(2, similarity, model);

    Recommender recommender = new GenericUserBasedRecommender(
        model, neighborhood, similarity);

I do believe that writing customized Hadoop-based components will most
likely be necessary even if what I expect above does exist.
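
To make the pre-computation idea concrete, here is a minimal sketch of the
item co-occurrence count mentioned above, in plain Java. It is not Mahout
code and not a Hadoop job: the class name CooccurrenceSketch, the method
cooccurrence, and the sample user and item IDs are all made up for
illustration, and everything is counted in memory on one machine. A
Hadoop-based version would presumably distribute the same counting.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the Mahout API.
public class CooccurrenceSketch {

    // For every ordered pair of distinct items, count how many users rated both.
    static Map<Long, Map<Long, Integer>> cooccurrence(Map<Long, Set<Long>> itemsByUser) {
        Map<Long, Map<Long, Integer>> counts = new HashMap<>();
        for (Set<Long> items : itemsByUser.values()) {
            for (Long a : items) {
                for (Long b : items) {
                    if (!a.equals(b)) {
                        counts.computeIfAbsent(a, k -> new HashMap<>())
                              .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data: user ID -> set of item IDs the user rated.
        Map<Long, Set<Long>> itemsByUser = new HashMap<>();
        itemsByUser.put(1L, new HashSet<>(Arrays.asList(101L, 102L, 103L)));
        itemsByUser.put(2L, new HashSet<>(Arrays.asList(101L, 102L)));
        itemsByUser.put(3L, new HashSet<>(Arrays.asList(102L, 103L)));

        Map<Long, Map<Long, Integer>> counts = cooccurrence(itemsByUser);
        System.out.println("101 with 102: " + counts.get(101L).get(102L));
        System.out.println("101 with 103: " + counts.get(101L).get(103L));
    }
}
```

The nested loop is quadratic in the number of items per user, which is
fine for a sketch but is exactly the part one would want to hand off to a
cluster for large data.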

Your time is greatly appreciated.



On Sun, Dec 25, 2011 at 8:12 PM, Sean Owen <> wrote:

> I'm not sure quite what you are asking. No, it is not all built on top
> of Hadoop. If you run a Hadoop-based job on 1 node, it is easy to run
> it on 100 nodes. The non-Hadoop-based recommender is completely
> different from the Hadoop-based recommender and they are not
> interchangeable. I am not sure what you mean by "level of
> integration".
> On Sun, Dec 25, 2011 at 9:00 PM, Jinyuan Zhou <> wrote:
> > Hi,
> > I had the impression that Mahout is built on top of Hadoop. Because of
> > this, I expect that, for a recommender I build, after I run it
> > successfully with modest data on one machine, I should be able to run the
> > same recommender on a Hadoop cluster for the purpose of handling huge
> > data. What I expect is that Mahout will allow me to do some configuration
> > of my recommender and Hadoop cluster, and then it is good to run with the
> > power of Hadoop. Is this true? I know HBase and Pig are built on top of
> > Hadoop; when they run a command, the usage of Hadoop is transparent to
> > the user. That is, the construction of the Hadoop job, the construction
> > of the job jar, as well as the Hadoop command for running the job in
> > Hadoop are all transparent to the user. Does Mahout support this level of
> > integration with Hadoop?
> >
> > Thanks,
> >
> > Jack
