mahout-user mailing list archives

From Yash Patel <yashpatel1...@gmail.com>
Subject Re: User based recommender
Date Fri, 28 Nov 2014 08:14:26 GMT
The Mahout + search engine recommender seems like the best fit for the
data I have.

Kindly get back to me at your earliest convenience.



Best Regards,
Yash Patel

On Thu, Nov 27, 2014 at 9:58 PM, Pat Ferrel <pat@occamsmachete.com> wrote:

> Mahout has several recommenders so no need to create one from components.
> They all make use of the similarity of preferences between users—that’s why
> they are in the category of collaborative filtering.
>
> Primary Mahout Recommenders:
> 1) Hadoop mapreduce item-based cooccurrence recommender. Creates all recs
> for all users. Uses “Mahout IDs”
> 2) ALS-WR hadoop mapreduce, uses matrix factorization to reduce noise in
> the data. Sometimes better for small data sets than #1. Uses “Mahout IDs”
> 3) Mahout + search engine: cooccurrence type. Extremely flexible, works
> with multiple actions (multi-modal), works for new users that have some
> history, has a scalable server (from the search engine) but is more
> difficult to integrate than #1 or #2. Uses your own ids and reads csv files.
>
> The rest of the data seems to apply either to the user or the item and so
> would be used in different ways. #1 and #2 can only use user id and item id
> but some post-recommendation weighting or filtering can be applied. #3 can
> use multiple attributes in different ways. For instance if category is an
> item attribute you can create two actions, user-pref-for-an-item, and
> user-pref-for-a-category. Assuming you want to recommend an item (not
> category) you can create a cross-cooccurrence indicator for the second
> action and use the data to make your item recs better. #3 is the only
> method that supports this.
>
> Pick a recommender and we can help more with data prep.
>
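The two actions described above (user-pref-for-an-item and user-pref-for-a-category) can be sketched as a small data-prep step. This is only an illustration in Python, not part of Mahout: the column names, the sample rows, and the `split_actions` helper are all hypothetical, and the exact input layout the search-engine recommender expects should be checked against its documentation.

```python
import csv
import io

# Hypothetical sample rows using the fields listed in this thread.
SAMPLE = """customer_id,item_id,postal_code,item_category,potential_growth,territory
101,widget-a,10115,tools,high,north
101,widget-b,10115,garden,high,north
102,widget-a,20095,tools,low,south
"""


def split_actions(csv_text):
    """Return (user-item pairs, user-category pairs).

    Each list is de-duplicated and kept in first-seen order.
    """
    item_prefs, category_prefs = [], []
    seen_items, seen_categories = set(), set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        user = row["customer_id"]
        item_pair = (user, row["item_id"])
        category_pair = (user, row["item_category"])
        if item_pair not in seen_items:
            seen_items.add(item_pair)
            item_prefs.append(item_pair)
        if category_pair not in seen_categories:
            seen_categories.add(category_pair)
            category_prefs.append(category_pair)
    return item_prefs, category_prefs


item_prefs, category_prefs = split_actions(SAMPLE)
```

Here `item_prefs` would feed the primary (item) action and `category_prefs` the secondary action used for the cross-cooccurrence indicator.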
>
> On Nov 26, 2014, at 1:34 PM, Yash Patel <yashpatel1230@gmail.com> wrote:
>
> Hello everyone,
>
> Wow, I am quite happy to see so much input from people.
>
> I apologize for not providing more details.
>
> Although this is not my complete dataset, the fields I have chosen to use
> are:
>
> customer id - numeric
> item id - text
> postal code - text
> item category - text
> potential growth - text
> territory - text
>
>
> Basically I was thinking of finding similar users and recommending them
> items that users like them have bought but they haven't.
>
> I would very much like to hear your opinions, as I am not so
> familiar with clustering, classifiers, etc.
>
> I found that Mahout takes sequence files converted into vectors, but I
> couldn't understand how I would do that with my data specifically and, more
> importantly, make a recommender system out of it.
>
> Also, I am wondering how to factor in the importance of a specific customer
> through the potential growth attribute.
>
>
>
>
>
>
> Best Regards,
> Yash Patel
>
> On Wed, Nov 26, 2014 at 9:03 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
>
> > All very good points but note that spark-itemsimilarity may take the input
> > directly since you specify column numbers for <UID><ITEMID><PREF_VALUE>.
> >
> > On Nov 26, 2014, at 11:43 AM, parnab kumar <parnab.2007@gmail.com> wrote:
> >
> > Kindly elaborate on your requirements, your dataset fields, and what
> > you want to recommend to a user. Usually a set of items is recommended
> > to a user. In your case what are your items?
> >
> > The standard input is <UID><ITEMID><PREF_VALUE>. Clearly your data is
> > not in this format, which would let you use the algorithms in Mahout
> > directly.
> >
> > A little more info from your side will help us to give you the right
> > pointers.
> >
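As a rough illustration of getting text IDs into the <UID><ITEMID><PREF_VALUE> shape, the sketch below assigns each distinct item ID a dense integer index. It is a minimal Python sketch under stated assumptions: `to_mahout_triples` is a hypothetical helper, and using 1.0 as the preference value for every purchase is an assumption, since the data described in this thread has no explicit rating.

```python
def to_mahout_triples(purchases):
    """Map (customer_id, item_id) pairs to (user_id, item_index, pref) triples.

    Text item IDs are replaced by integer indexes assigned in first-seen
    order. The preference value 1.0 is an assumed implicit "bought it" signal.
    """
    item_index = {}  # text item id -> integer index
    triples = []
    for customer_id, item_id in purchases:
        # setdefault evaluates len(item_index) before inserting, so the
        # first new item gets 0, the next 1, and so on.
        idx = item_index.setdefault(item_id, len(item_index))
        triples.append((int(customer_id), idx, 1.0))
    return triples, item_index


# Hypothetical purchases: (numeric customer id as text, text item id).
purchases = [("101", "widget-a"), ("101", "widget-b"), ("102", "widget-a")]
triples, item_index = to_mahout_triples(purchases)

# One "uid,itemid,pref" line per triple.
lines = [f"{uid},{item},{pref}" for uid, item, pref in triples]
```

Keeping `item_index` around matters: the recommender would return integer item IDs, and the dictionary is what lets them be translated back to the original text IDs.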
> > On Wed, Nov 26, 2014 at 7:16 PM, Yash Patel <yashpatel1230@gmail.com>
> > wrote:
> >
> >> Dear Mahout Team,
> >>
> >> I am a student new to machine learning and I am trying to build a user
> >> based recommender using Mahout.
> >>
> >> My dataset is a csv file as an input but it has many fields as text and
> >> I understand Mahout needs numeric values.
> >>
> >> Can you give me a head start as to where I should start and what kind of
> >> tools I need to parse the text columns?
> >>
> >> Also an idea on which classifiers or clustering methods I should use
> >> would be highly appreciated.
> >>
> >>
> >> Best Regards;
> >> Yash Patel
> >>
> >
> >
>
>
