mahout-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: TreeBasedRecommenders(Deprecated?)
Date Tue, 10 Jun 2014 20:57:54 GMT
Sahil,

You say:

> Also the use of item-based collaborative filtering recommender turns out to
> be time consuming.


In my experience, item-based systems tend to be the fastest ones.

Perhaps we mean different things.

What I mean is similar to the approach where indicator behaviors are
computed and searched using something like a traditional search engine.
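
A minimal sketch of the indicator idea described above, written in Python rather than Mahout's Java. It rests on assumptions: indicators are picked by raw co-occurrence counts (a simplification chosen for illustration), and a plain dictionary scan stands in for the search-engine query. The names `build_indicators` and `recommend` are hypothetical and are not Mahout APIs.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_indicators(histories, top_k=2):
    """Offline step: map each item to the top_k items that co-occur with it most.

    histories: dict of user id -> list of item ids (hypothetical input format).
    """
    cooc = defaultdict(Counter)
    for items in histories.values():
        # Count every ordered pair of distinct items seen in one user's history.
        for a, b in permutations(sorted(set(items)), 2):
            cooc[a][b] += 1
    indicators = {}
    for item, counts in cooc.items():
        # Highest count first; ties broken by item id so the result is stable.
        ranked = sorted(counts, key=lambda other: (-counts[other], other))
        indicators[item] = set(ranked[:top_k])
    return indicators


def recommend(indicators, user_items, n=3):
    """Query step: rank unseen items by how many of their indicators the user has."""
    seen = set(user_items)
    scores = {}
    for item, inds in indicators.items():
        if item in seen:
            continue
        overlap = len(inds & seen)
        if overlap:
            scores[item] = overlap
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]
```

The expensive part (building indicators) runs offline; answering a query only needs a lookup against the user's short history, which is why a user with 2 or 3 purchases can still be served quickly.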





On Tue, Jun 10, 2014 at 4:50 AM, Sahil Sharma <ssahil08@gmail.com> wrote:

> Hi,
>
> One place where tree-based recommenders (that is, those using hierarchical
> clustering) might be useful is the cold-start problem. Suppose a
> user has bought only a few items (say 2 or 3). It is hard to
> capture that user's interests using a user-based collaborative filtering
> recommender.
> Also the use of item-based collaborative filtering recommender turns out to
> be time consuming.
> In such a setting it makes sense to cluster the items together (using some
> clustering algorithm) and then use the user's purchased items to
> recommend (based on which clusters those purchased items belong to).
> On Jun 10, 2014 4:41 PM, "Sebastian Schelter" <ssc@apache.org> wrote:
>
> > Hi Sahil,
> >
> > don't worry, you're not breaking any rules. We removed the tree-based
> > recommenders because we have never heard of anyone using them over the
> > years.
> >
> > --sebastian
> >
> > On 06/10/2014 09:01 AM, Sahil Sharma wrote:
> >
> >> Hi,
> >>
> >> Firstly, I apologize if I'm breaking certain rules by mailing this way;
> >> I'm new to this and would appreciate any help I could get.
> >>
> >> I was just playing around with the tree-based Recommender (which seems
> >> to be deprecated in the current version "for the lack of use").
> >>
> >> Why was it deprecated?
> >>
> >> Also, I just looked at the code, and it seems to be doing a lot of
> >> redundant computation. For example, we could store a matrix of
> >> cluster-cluster distances (and hence avoid recomputing the closest
> >> clusters every time by updating the matrix whenever we merge two
> >> clusters). Also, when determining the farthest-distance-based
> >> similarity between two clusters, the pair which realizes this could be
> >> stored and updated upon merging, so that this computation need not be
> >> repeated again and again.
> >>
> >> Just wondering if this repeated computation was not a reason for
> >> deprecating the class (since people might have found a slow recommender
> >> "lacking use").
> >>
> >> Would be glad to hear the thoughts of others on this, and also implement
> >> an efficient version if the community agrees.
> >>
> >>
> >
>
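
The caching proposal in the original message can be sketched roughly as follows. This is an illustration in Python, not the removed Mahout class: it assumes "farthest distance" means complete linkage, where the distance from a merged cluster A+B to any other cluster C is max(d(A, C), d(B, C)), so the cached matrix can be updated on each merge without revisiting individual points. The function name `complete_linkage` and its parameters are hypothetical.

```python
def complete_linkage(points, dist, num_clusters):
    """Agglomerative clustering with a cached cluster-cluster distance table.

    points: list of items; dist: function(p, q) -> float;
    num_clusters: stop once this many clusters remain.
    """
    clusters = {i: [p] for i, p in enumerate(points)}

    def key(a, b):
        # Canonical (small id, large id) key into the cache.
        return (a, b) if a < b else (b, a)

    # Point-to-point distances are computed exactly once, up front.
    cache = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            cache[(i, j)] = dist(points[i], points[j])

    while len(clusters) > max(num_clusters, 1):
        # Closest pair of clusters according to the cached table.
        a, b = min(cache, key=cache.get)
        # Merge b into a: update a's row with the complete-linkage rule,
        # then drop every entry that mentions b.
        for c in clusters:
            if c != a and c != b:
                cache[key(a, c)] = max(cache[key(a, c)], cache[key(b, c)])
                del cache[key(b, c)]
        del cache[(a, b)]
        clusters[a].extend(clusters.pop(b))

    return list(clusters.values())
```

Each merge touches only one row of the table. Finding the closest pair still scans the whole cache here; a priority queue would be one way to reduce that further.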
