mahout-dev mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: What is content based recommendation, to you
Date Wed, 27 Jan 2010 02:15:31 GMT
On Tue, Jan 26, 2010 at 3:36 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> I define it a bit differently by redefining recommendations as machine
> learning.
>
> On Tue, Jan 26, 2010 at 1:44 PM, Sean Owen <srowen@gmail.com> wrote:
>
> > I would narrow and specify this, in the context of Mahout, to have a
> > collaborative filtering angle:
>

Since Ted (Mr. Machine Learning) wants to describe content-based
recommendations as machine learning, and Sean (Mr. Taste/CF) goes and
describes it in terms of collaborative filtering, I suppose I'll put on my
"search guy" hat and describe it the way I see it:

Items have attributes (e.g. text features), and users express preferences
for some attributes (e.g. by explicitly entering text keywords). The
recommender (a.k.a. search engine) takes those preferences and returns a
ranked list of the items which best match at least some of them.
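
A minimal sketch of the "recommender as search engine" idea above, in
Python. Everything here is an assumption made for illustration: the
function name rank_items, the whitespace tokenizer, and the TF-IDF-style
weighting are stand-ins, not Mahout or Lucene code. The user's preferences
are just a weighted bag of keywords, and the items are scored against it.

```python
import math
from collections import Counter


def rank_items(items, query_terms, top_n=3):
    """Rank items against a weighted keyword query (hypothetical helper).

    items:       dict of item id -> free-text description
    query_terms: dict of keyword -> preference weight
    """
    n_items = len(items)
    term_counts = {}
    doc_freq = Counter()
    for item_id, text in items.items():
        counts = Counter(text.lower().split())
        term_counts[item_id] = counts
        doc_freq.update(counts.keys())  # each term counted once per item

    scores = {}
    for item_id, counts in term_counts.items():
        score = 0.0
        for term, weight in query_terms.items():
            if term in counts:
                # Rare terms count for more than common ones.
                idf = math.log(1 + n_items / doc_freq[term])
                score += weight * counts[term] * idf
        if score > 0:
            scores[item_id] = score

    # Highest score first; ties broken by item id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


catalog = {
    "a": "jazz guitar album",
    "b": "rock guitar album",
    "c": "cooking book",
}
preferences = {"jazz": 1.0, "guitar": 0.5}
ranking = rank_items(catalog, preferences)
```

Items that share none of the preferred attributes never appear in the
ranking, which is the same behavior you would expect from a keyword search.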

Generalizing a bit beyond that example, users may not make explicit mention
of certain attributes, but we may infer them from some other source (a user
on a social network may have a profile, a member of a dating website may
have answered a questionnaire expressing some preferences, etc.) and use
these to generate a "query" against the recommender.

There is no need (although there may be much *utility*) to ever think about
interactions between items (item-item similarity) or between users.
Content-based recommendations can act purely as a generalized search engine,
where the trick is just coming up with the search terms / query features to
use for each user.

An advantage of thinking of it this way is that you don't need to think
about "users" at all: you can have recommendations of items of type A
against items of type B:

  * a webpage (type W) has a certain set of features, and users come to
    that webpage, sometimes with no prior history, so if you want to
    recommend (serve) ads (type A) to the user, recommending based purely
    on some kind of content-based correlation between items of type W and
    A can work.

  * on a job board, recruiters can post job listings (type J), and you want
    to recommend possible resumes (type R) to the job (*not* to the
    recruiter, because the recruiter has distinctly different "preferences"
    for each job - the *job* is the thing which wants recommendations).

In both of these cases, you can do a full-fledged recommendation engine
with no users whatsoever, with content and item information across multiple
domains.
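
The job board case can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the names bag_of_words, cosine
and recommend_for_item are made up for this example, and plain
bag-of-words cosine similarity stands in for whatever content-based
correlation you would really use. The point is that the "query" is an item
of one type (a job) and the results are items of another type (resumes),
with no user anywhere.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn free text into term counts (naive whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_for_item(query_text, candidates, top_n=2):
    """Rank candidate items of one type against a query item of another."""
    query_vec = bag_of_words(query_text)
    scored = []
    for cand_id, cand_text in candidates.items():
        sim = cosine(query_vec, bag_of_words(cand_text))
        if sim > 0:  # drop candidates with no shared content at all
            scored.append((cand_id, sim))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


job = "senior java search engineer lucene"      # item of type J
resumes = {                                      # items of type R
    "r1": "java engineer with lucene and search experience",
    "r2": "pastry chef with bakery experience",
    "r3": "python engineer data pipelines",
}
matches = recommend_for_item(job, resumes)
```

The same function would serve the webpage/ads case unchanged: pass the page
text as the query and the ad texts as candidates.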

The other advantage of thinking of content-based recommender systems this
way is that now you have an entirely new axis to think about: CF goes one
way, and content-based "searching" goes another, and there is an entire
spectrum of "fusion" models which mix the two.
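
One very simple point on that spectrum is a linear blend of the two scores.
The Python sketch below is an assumption for illustration, not a claim
about how any particular fusion model works: the name fuse_scores and the
alpha parameter are hypothetical, and it assumes both score sets have
already been normalized to a comparable scale.

```python
def fuse_scores(cf_scores, content_scores, alpha=0.5):
    """Blend CF and content-based scores per item (hypothetical helper).

    alpha = 1.0 gives pure CF, alpha = 0.0 gives pure content-based
    "search"; values in between mix the two. Items missing from one
    source are treated as scoring 0.0 there.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    item_ids = set(cf_scores) | set(content_scores)
    fused = {
        item_id: alpha * cf_scores.get(item_id, 0.0)
        + (1.0 - alpha) * content_scores.get(item_id, 0.0)
        for item_id in item_ids
    }
    # Highest blended score first; ties broken by item id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


cf = {"x": 0.9, "y": 0.1}          # e.g. from co-occurrence / usage
content = {"y": 0.8, "z": 0.6}     # e.g. from attribute matching
blended = fuse_scores(cf, content, alpha=0.3)
```

Note that item "z" has no usage data at all and still gets ranked, which is
the cold-start benefit the content-based axis brings to the mix.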

(Of course, this leaves out one further piece of information which is
similar to CF but deserves its own treatment: explicit link information,
available in the form of web-graph links or social network links.
Recommenders based on this information can look a lot like CF, but they use
*explicit* user-user or item-item correlations instead of correlations
inferred implicitly from co-occurrence / usage.)

  -jake
