mahout-user mailing list archives

From Johannes Schulte <johannes.schu...@gmail.com>
Subject Re: Clustering product views and sales
Date Tue, 07 May 2013 04:40:53 GMT
Hi!
As a starting point I remember this conversation containing both elements
(although the reconstruction part is rather small, hint!)

http://markmail.org/message/5cfewal3oyt6vw2k


On Tue, May 7, 2013 at 1:00 AM, Dominik Hübner <contact@dhuebner.com> wrote:

> One more thing for now @Ted:
> What do you refer to with sparsification and reconstruction?
>
> On May 7, 2013, at 12:19 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
> > Truly cold start is best handled by recommending the most popular items.
> >
> > If you know *anything* at all such as geo or browser or OS, then you can
> > use that to recommend using conventional techniques (that is, you can
> > recommend for the characteristics rather than for the person).
> >
> > Within a very few interactions, however, real recommendations will kick in.
> >
> > My lately preferred approach is to derive indicators using sparsification
> > or ALS+reconstruction.  These indicators can be historical items or static
> > items such as geo information.  These indicators can be combined in a
> > single step using a search engine.
> >
> >
> >
> >
> >
> >
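A minimal sketch of one way to read "sparsification" in the message quoted above, added for illustration only. It assumes the term means keeping just those item cooccurrences that pass a log-likelihood ratio (LLR) test; the thread itself does not spell this out. The function names, the threshold value and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def x_log_x(x):
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio score for a 2x2 contingency table:
    # k11 = users with both items, k12 = only the first item,
    # k21 = only the second item, k22 = neither.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def indicators(histories, threshold=3.0):
    # histories: one list of item ids per user.
    # threshold is an arbitrary illustrative cut-off, not a recommended value.
    n = len(histories)
    item_count = Counter()
    pair_count = Counter()
    for items in histories:
        unique = sorted(set(items))
        item_count.update(unique)
        pair_count.update(combinations(unique, 2))
    result = {}
    for (a, b), k11 in pair_count.items():
        k12 = item_count[a] - k11
        k21 = item_count[b] - k11
        k22 = n - k11 - k12 - k21
        if llr(k11, k12, k21, k22) >= threshold:
            result.setdefault(a, set()).add(b)
            result.setdefault(b, set()).add(a)
    return result


# Toy data: "a" and "b" always occur together, "c" occurs for every user.
histories = [
    ["a", "b", "c"], ["a", "b", "c"], ["a", "b", "c"], ["a", "b", "c"],
    ["c", "d"], ["c"], ["c", "e"], ["c"],
]
print(indicators(histories))
```

In this toy data the pair ("a", "b") survives, while every pair involving the universally popular "c" scores zero and is dropped. That is the point of the test: raw cooccurrence counts would rank "c" highest for every item.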
> > On Mon, May 6, 2013 at 2:58 PM, Dominik Hübner <contact@dhuebner.com> wrote:
> >
> >> The cluster was mostly intended for tackling the cold start problem for
> >> new users.
> >> I want to build a recommender based on existing components or to be
> >> precise a combination of them.
> >>
> >> Unfortunately, the only product meta-data I currently have is the product
> >> price. Furthermore, this is a project I am working on alone. As a
> >> consequence, the approaches I can examine in the given time are limited.
> >>
> >> Would using ALS and ranking its outcome by e.g. frequent item set
> >> algorithms be something worth looking into?
> >> Or did you mean something different?
> >>
> >> My personal goal is to build a recommender providing acceptable results
> >> using the data I currently have available.
> >> Of course, this will only serve as a basis for further improvements where
> >> necessary or if further information can be obtained.
> >>
> >>
> >> On May 6, 2013, at 11:21 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >>
> >>> Are you looking to build a product recommender based on your own design?
> >>> Or do you want to build one based on existing methods?
> >>>
> >>> If you want to use existing methods, clustering has essentially no role.
> >>>
> >>> I think that composite approaches that use item meta-data and different
> >>> kinds of behavioral cues are important to best performance.
> >>>
> >>>
> >>> On Mon, May 6, 2013 at 12:35 PM, Dominik Hübner <contact@dhuebner.com> wrote:
> >>>
> >>>> Well, as you already might have guessed, I am building a product
> >>>> recommender system for my thesis.
> >>>>
> >>>> I am planning to evaluate ALS (both implicit and explicit) as well as
> >>>> item-similarity recommendation for users with at least a few known
> >>>> products. Nevertheless, the majority of users have seen only a single (or
> >>>> 2-3) product(s). I want to recommend them the most popular items from
> >>>> the cluster their only product comes from (as a workaround for the
> >>>> cold-start problem). Furthermore, I expect to be able to see which "kind"
> >>>> of products users like. This might provide me some information about how
> >>>> well ALS and similarity recommenders fit the user's area of interest (an
> >>>> early evaluation) or at least to estimate if the chosen approach will work
> >>>> in some way.
> >>>>
> >>>> On May 6, 2013, at 9:09 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >>>>
> >>>>> I don't even think that clustering is all that necessary.
> >>>>>
> >>>>> The reduced cooccurrence matrix will give you items related to each item.
> >>>>>
> >>>>> You can use something like PCA, but SVD is just as good here due to near
> >>>>> zero mean.  You could use SSVD or ALS from Mahout to do this analysis and
> >>>>> then use k-means on the right singular vectors (aka item representation).
> >>>>>
> >>>>> What is the high level goal that you are trying to solve with this
> >>>>> clustering?
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Mon, May 6, 2013 at 12:01 PM, Dominik Hübner <contact@dhuebner.com> wrote:
> >>>>>
> >>>>>> And running the clustering on the cooccurrence matrix or doing PCA by
> >>>>>> removing eigenvalues/vectors?
> >>>>>>
> >>>>>> On May 6, 2013, at 8:52 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >>>>>>
> >>>>>>> On Mon, May 6, 2013 at 11:29 AM, Dominik Hübner <contact@dhuebner.com> wrote:
> >>>>>>>
> >>>>>>>> Oh, and I forgot how the views and sales are used to build product
> >>>>>>>> vectors. As of now, I implemented binary vectors, vectors counting the
> >>>>>>>> number of views and sales (e.g. 1 view = 1 count, 1 sale = 10 counts)
> >>>>>>>> and ordinary vectors (view => 1, sale => 5).
> >>>>>>>>
> >>>>>>>
> >>>>>>> I would recommend just putting the view and sale in different columns
> >>>>>>> and doing cooccurrence analysis on this.
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
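For illustration, the earliest suggestion quoted above (views and sales in different columns, then cooccurrence analysis) could be sketched roughly as follows. This is one possible reading, not code from the thread: each (action, item) pair is treated as its own column of the user matrix, and the event tuples, function name and sample data are assumptions made up for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(events):
    # events: iterable of (user, action, item) tuples (hypothetical format).
    # Each distinct (action, item) pair is its own column, so a view of p1
    # and a sale of p1 are counted separately instead of being weighted
    # into a single value.
    columns_by_user = defaultdict(set)
    for user, action, item in events:
        columns_by_user[user].add((action, item))

    counts = Counter()
    for columns in columns_by_user.values():
        for a, b in combinations(sorted(columns), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the matrix symmetric
    return counts


events = [
    ("u1", "view", "p1"), ("u1", "sale", "p2"),
    ("u2", "view", "p1"), ("u2", "sale", "p2"),
    ("u3", "view", "p1"), ("u3", "view", "p3"),
]
counts = cooccurrence(events)

# Number of users who both viewed p1 and bought p2.
print(counts[(("view", "p1"), ("sale", "p2"))])
```

Keeping the two actions apart avoids having to pick weights such as 1 for a view and 10 for a sale; which view columns are informative about which sale columns is then read off the counts.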
