mahout-user mailing list archives

From Dmitriy Lyubimov <>
Subject Re: Recommending on Dynamic Content
Date Wed, 02 Feb 2011 22:55:54 GMT
Actually, our case is even a little more complex: our hierarchy may be
A/[B/[C|D]], i.e. for some inputs the full hierarchy is A/B/C and for some
inputs it is A/B/D, mutually exclusive. Technically, both hierarchies could
be learned independently; but it stands to reason that the shared A and B
learners should not have to be re-learned separately, just to save on the
computation.

Ted has mentioned there's a hierarchy in Mahout; I wonder if it can handle
the case presented, and what class I might look at to see how to set this
up.
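To make the trunk-branch-leaf shape concrete, here is a minimal sketch (plain Python, not Mahout code; the "learners" are trivial mean predictors standing in for real SGD learners) of a cascade where stages A and B are shared and the C/D leaves are mutually exclusive, so a leaf can be re-learned without touching A or B:

```python
# Illustrative sketch, not Mahout: stages A and B are shared, C and D are
# mutually exclusive leaves. Each stage is a trivial mean predictor trained
# on the residual left by its frozen parents.

def mean(xs):
    return sum(xs) / len(xs)

# each sample carries its target and which leaf branch it belongs to
samples = [
    {"y": 1.0,  "leaf": "C"},
    {"y": 3.0,  "leaf": "C"},
    {"y": 10.0, "leaf": "D"},
    {"y": 14.0, "leaf": "D"},
]

# stage A, then stage B, trained on everything; each is frozen before
# the next stage is fit
a = mean([s["y"] for s in samples])
b = mean([s["y"] - a for s in samples])   # zero for a mean learner, kept for shape

# leaves C and D are trained only on their own branch's residuals, so
# re-learning one leaf never touches A, B, or the other leaf
leaf = {
    k: mean([s["y"] - a - b for s in samples if s["leaf"] == k])
    for k in ("C", "D")
}

def predict(branch):
    return a + b + leaf[branch]
```

Re-training only the D leaf here means refitting one entry of `leaf`, which is the computational saving the shared-hierarchy argument is after.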

On Wed, Feb 2, 2011 at 2:43 PM, Dmitriy Lyubimov <> wrote:

> Both Elkan's work and Yahoo's paper are based on the notion (confirmed by
> SGD experience) that if we try to substitute missing data with neutral
> values, the whole learning more or less falls apart.
> I.e. if we always know some context A (in this case, static labels and
> dyadic ids) and only sometimes some context B, then assuming neutral values
> for context B when that data is missing is invalid, because we are actually
> substituting unknown data with made-up data. Which is why SGD produces
> higher errors than necessary on sparsified label data. This is also the
> reason why SVD recommenders produce higher errors over sparse sample data
> (I think that's the consensus).
> However, thinking in offline-ish mode: if we learn based on samples with A
> data, then freeze that learner, and train learner B on the error of the
> frozen A learner using only the input that has context B, then we are not
> making the mistake above. At no point does our learner take any 'made-up'
> data.
> This whole notion is based on the Bayesian inference process: what can you
> say if you only know A, and what correction would you make if you also knew
> B?
> Both papers treat a corner case of this: we have two types of data, A and
> B; we learn A, then freeze learner A, then learn B where available.
> But the general case doesn't have to be just A and B. That's actually our
> case (our CEO calls it the 'trunk-branch-leaf' case): we always know some
> context A, sometimes B, and also sometimes we know all of A, B and some
> additional context C.
> So there's a case to be made for generalizing the inference architecture:
> specify a hierarchy and then learn A/B/C, with SGD+loglinear or whatever else.
> -d
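The freeze-then-correct idea described in the message above can be sketched minimally (illustrative Python, not Mahout code; the mean "learner" stands in for a real SGD learner): learner A is fit on all samples and frozen, then learner B is fit only to the residual, and only on samples where context B was actually observed, so no made-up values for B ever enter training:

```python
# Hypothetical sketch: learn on context A for all samples, freeze, then
# learn a correction from context B only where B is present.

def fit_mean(values):
    """Trivial 'learner': predict the mean of its training targets."""
    return sum(values) / len(values)

# every sample has the always-known context A; only some also have B
samples = [
    {"y": 2.0, "has_b": False},
    {"y": 4.0, "has_b": False},
    {"y": 7.0, "has_b": True},
    {"y": 9.0, "has_b": True},
]

# stage 1: learner A sees all samples, then is frozen
pred_a = fit_mean([s["y"] for s in samples])

# stage 2: learner B is fit only on the residual y - pred_a, and only
# over samples where context B is actually observed
residuals = [s["y"] - pred_a for s in samples if s["has_b"]]
corr_b = fit_mean(residuals)

def predict(has_b):
    # the correction applies only when B is known; otherwise we fall
    # back to the A-only prediction, never to a neutral stand-in for B
    return pred_a + (corr_b if has_b else 0.0)
```

Note that when B is absent the model simply omits the correction, which is exactly the "no substitution of unknown data with made-up data" property the papers rely on.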
> On Wed, Feb 2, 2011 at 12:14 AM, Sebastian Schelter <> wrote:
>> Hi Ted,
>> I looked through the paper a while ago. The approach seems to have great
>> potential, especially because of the ability to include side information
>> and to work with nominal and ordinal data. Unfortunately, I have to admit
>> that a lot of the mathematical details are beyond my understanding. I'd
>> be ready to assist anyone willing to build a recommender from that
>> approach, but it's not a thing I could tackle on my own.
>> --sebastian
>> PS: The algorithm took 7 minutes to learn from the MovieLens 1M dataset,
>> not Netflix.
>> On 01.02.2011 18:02, Ted Dunning wrote:
>>> Sebastian,
>>> Have you read the Elkan paper?  Are you interested in (partially) content
>>> based recommendation?
>>> On Tue, Feb 1, 2011 at 2:02 AM, Sebastian Schelter <> wrote:
>>>    Hi Gökhan,
>>>    I want to point you to some papers I came across that deal with
>>>    similar problems:
>>>    "Google News Personalization: Scalable Online Collaborative
>>>    Filtering" ( ), this paper
>>>    describes how Google uses three algorithms (two of which cluster
>>>    the users) to achieve online recommendation of news articles.
>>>    "Feature-based recommendation system" ( ), this approach didn't
>>>    really convince me, and I think the paper is lacking a lot of
>>>    details, but it might still be an interesting read.
>>>    --sebastian
>>>    On 01.02.2011 00:26, Gökhan Çapan wrote:
>>>        Hi,
>>>        I've searched; sorry in case this is a double post. Also, this
>>>        question may not be directly related to Mahout.
>>>        Within a domain which is entirely user-generated and has very
>>>        big item churn (lots of new items coming, while some others
>>>        leave the system), what do you recommend for producing accurate
>>>        recommendations using Mahout (not just Taste)?
>>>        I mean, as a concrete example, in the eBay domain, not Amazon's.
>>>        Currently I am creating item clusters using LSH with MinHash
>>>        (I am not sure if it is in Mahout; I can contribute it if not),
>>>        and producing recommendations using these item clusters
>>>        (profiles). When a new item arrives, I find its nearest profile
>>>        and recommend the item wherever that profile is recommended. Do
>>>        you find this approach good enough?
>>>        If you have a theoretical idea, could you please point me to
>>>        some related
>>>        papers?
>>>        (As an MSc student, I can implement this as a Google Summer of
>>>        Code project,
>>>        with your mentoring.)
>>>        Thanks in advance
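For reference, the MinHash idea Gökhan describes can be sketched roughly like this (illustrative Python, not his implementation; salted md5 stands in for a family of random hash functions): the fraction of matching signature positions estimates the Jaccard similarity of the items' user sets, so signature bands can serve as LSH buckets, i.e. the item "profiles":

```python
# Rough MinHash sketch: items whose user sets overlap heavily get
# signatures that agree in many positions.
import hashlib

def h(seed, x):
    # salted md5 as a stand-in for one random hash function per seed
    return int(hashlib.md5(f"{seed}:{x}".encode()).hexdigest(), 16)

def minhash(users, num_hashes=16):
    # one minimum per hash function over the item's user set
    return tuple(min(h(seed, u) for u in users) for seed in range(num_hashes))

item_a = {"u1", "u2", "u3", "u4"}
item_b = {"u1", "u2", "u3", "u5"}   # heavy overlap with item_a
item_c = {"u9", "u10"}              # disjoint from item_a

sig_a, sig_b, sig_c = minhash(item_a), minhash(item_b), minhash(item_c)

def est(s, t):
    # fraction of matching positions estimates Jaccard similarity
    return sum(x == y for x, y in zip(s, t)) / len(s)
```

A new item's signature is computed the same way and routed to the profile with the highest estimated similarity, which matches the nearest-profile step in the message above.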
