mahout-user mailing list archives

From Federico Castanedo <>
Subject Re: Recommeding on Dynamic Content
Date Thu, 03 Feb 2011 09:55:39 GMT
Hi Dmitriy,

I'm not sure if this algorithm:

could help in the case of missing information in SGD, but it seems they
have a very efficient approach
to unknown ratings in CF tasks using SVD.

2011/2/3 Dmitriy Lyubimov <>

> And I was referring to the SVD recommender, not SGD, here. SGD does take
> care of that kind of problem, since it doesn't examine "empty cells"
> when computing latent factors while solving factorization
> problems.
> But I think there's a similar problem with missing side-information
> labels in the case of SGD: say we have a bunch of probes and we are
> reading signals off of them at certain intervals, but now and then we
> fail to read some of them. Actually, we fail pretty often. But regular
> SGD doesn't 'freeze' learning for inputs we failed to read. We are
> forced to put some values there, and the least harmful, it seems, is the
> average, since it doesn't cause any learning to happen on that
> particular input. But I think it does cause regularization to count a
> generation, thus cancelling some of the learning. Whereas if we grouped
> missing inputs into separate learners and did hierarchical learning,
> that would not happen. That's what I meant by SGD producing
> slightly more errors in this case compared to what seems to be
> possible with hierarchies.
> The similarity between those cases (sparse SVD and SGD inputs) is that in
> every case we are forced to feed 'made-up' data to the learners, because
> we failed to observe it in a sample.
> On Wed, Feb 2, 2011 at 11:05 PM, Ted Dunning <>
> wrote:
> > That is a property of sparsity and connectedness, not SGD.
> >
> > On Wed, Feb 2, 2011 at 8:54 PM, Dmitriy Lyubimov <>
> wrote:
> >>
> >> As one guy from Stanford demonstrated on
> >> the Netflix data, the whole system collapses very quickly after a certain
> >> threshold of sample sparsity is reached.
> >
> >
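
The point quoted above about mean imputation and regularization can be illustrated with a small sketch. It is not Mahout's SGD implementation: the `sgd_step` function, the `observed` mask, and all the numbers are hypothetical, and the model is assumed to be plain squared-error linear regression with L2 decay on mean-centered inputs (so a mean-imputed input equals 0). The mask is only a simple stand-in for "freezing" a weight, not the hierarchical-learner scheme described in the thread.

```python
import numpy as np

def sgd_step(w, x, y, lr=0.1, l2=0.01, observed=None):
    """One SGD step for squared-error linear regression with L2 decay.

    Assumption: x is mean-centered, so a mean-imputed missing input is 0.
    If `observed` (a boolean mask, hypothetical) is given, weights of
    unobserved inputs are left untouched; otherwise every weight decays.
    """
    err = float(np.dot(w, x) - y)
    grad = err * x + l2 * w          # data term + regularization term
    if observed is not None:
        grad = np.where(observed, grad, 0.0)  # freeze unobserved weights
    return w - lr * grad

w0 = np.array([1.0, 2.0])
x = np.array([0.5, 0.0])             # second input missing, imputed with its mean
y = 1.0
observed = np.array([True, False])

plain = sgd_step(w0, x, y)                      # second weight still decays
masked = sgd_step(w0, x, y, observed=observed)  # second weight stays at 2.0
```

In the plain step the data term for the imputed input is zero, so nothing is learned from it, yet the L2 term still shrinks its weight a little. The masked step leaves that weight unchanged, which is the "freeze" behaviour the message says regular SGD lacks.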
