We got good clustering results from implicit factorization with alpha =
1.0. My reasoning was to give observed entries a confidence of 1 + rating
and unobserved entries a confidence of 1. I used positivity / sparse coding
basically to force sparsity on the document / topic matrix... But then I got
confused, because I am modifying the real counts from the dataset (this does
not matter much in a practical sense, since we don't really have true
documents).

I mean, the gram matrix is the key here, but how much weight to give the
real counts also matters... I have not yet started looking into perplexity,
but that should give me further insights...
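
To make the two objectives concrete, here is a minimal sketch (not the
MLlib implementation) that evaluates both on a small dense matrix. The
object and method names are made up for illustration, regularization is
left out, and the dense loop over every cell is only workable at toy sizes:

```scala
object ConfidenceWeightedLoss {
  // Dense toy matrices: row i of w is w_i, row j of h is h_j.
  type Matrix = Array[Array[Double]]

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Implicit-style objective over ALL cells:
  //   sum_ij c_ij * (p_ij - w_i'h_j)^2
  // with p_ij = 1 if r_ij > 0 else 0, and c_ij = 1 + alpha * |r_ij|.
  def implicitLoss(r: Matrix, w: Matrix, h: Matrix, alpha: Double): Double = {
    val terms = for (i <- r.indices; j <- r(i).indices) yield {
      val p = if (r(i)(j) > 0) 1.0 else 0.0
      val c = 1.0 + alpha * math.abs(r(i)(j))
      val e = p - dot(w(i), h(j))
      c * e * e
    }
    terms.sum
  }

  // Unweighted squared error on the raw counts over ALL cells:
  //   sum_ij (r_ij - w_i'h_j)^2, where unobserved counts are 0.
  def allCellsCountLoss(r: Matrix, w: Matrix, h: Matrix): Double = {
    val terms = for (i <- r.indices; j <- r(i).indices) yield {
      val e = r(i)(j) - dot(w(i), h(j))
      e * e
    }
    terms.sum
  }
}
```

With r = [[2, 0], [0, 3]] and rank-1 factors w = [[1], [0]], h = [[1], [0]],
implicitLoss(r, w, h, 1.0) is 4.0 and allCellsCountLoss(r, w, h) is 10.0.
The first only cares whether a cell is non-zero and uses the count as a
weight; the second tries to reproduce the count itself.
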
On Sun, Jul 26, 2015 at 1:23 AM, Sean Owen <sowen@cloudera.com> wrote:
> It sounds like you're describing the explicit case, or any matrix
> decomposition. Are you sure that's best for count-like data? "It
> depends," but my experience is that the implicit formulation is
> better. In a way, the difference between 10,000 and 1,000 count is
> less significant than the difference between 1 and 10. However if your
> loss function penalizes the square of the error, then the former case
> not only matters more for the same relative error, it matters 10x more
> than the latter. It's very heavily skewed to pay attention to the
> high-count instances.
>
>
> On Sun, Jul 26, 2015 at 9:19 AM, Debasish Das <debasish.das83@gmail.com>
> wrote:
> > Yeah, I think the idea of confidence is a bit different than what I am
> > looking for using implicit factorization to do document clustering.
> >
> > I basically need (r_ij - w_i'h_j)^2 for all observed ratings and
> > (0 - w_i'h_j)^2 for all the unobserved ratings... Think about the
> > document x word matrix, where r_ij is the count that's observed and 0 is
> > the count for words that are not in a particular document.
> >
> > The broadcasted value of the gram matrix w_i'w_i or h_j'h_j will also
> > count the r_ij that are observed... So I might be fine using the
> > broadcasted gram matrix and using the linear term \sum (r_ij w_i) or
> > \sum (r_ij h_j)...
> >
> > I will think further, but in the current implicit formulation with
> > confidence, it looks like I am really factorizing a 0/1 matrix with
> > weights 1 + alpha*rating. It's a bit different from the LSA model.
> >
> >
> >
> >
> >
> > On Sun, Jul 26, 2015 at 12:34 AM, Sean Owen <sowen@cloudera.com> wrote:
> >>
> >> confidence = 1 + alpha * rating here (so, c1 means confidence - 1),
> >> so alpha = 1 doesn't specially mean high confidence. The loss function
> >> is computed over the whole input matrix, including all missing "0"
> >> entries. These have a minimal confidence of 1 according to this
> >> formula. alpha controls how much more confident you are in what the
> >> entries that do exist in the input mean. So alpha = 1 is lowish and
> >> means you don't think the existence of ratings means a lot more than
> >> their absence.
> >>
> >> I think the explicit case is similar, but not identical, here. The
> >> cost function for the explicit case is not the same, which is the more
> >> substantial difference between the two. There, ratings aren't inputs
> >> to a confidence value that becomes a weight in the loss function,
> >> during this factorization of a 0/1 matrix. Instead the rating matrix
> >> is the thing being factorized directly.
> >>
> >> On Sun, Jul 26, 2015 at 6:45 AM, Debasish Das <debasish.das83@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > Implicit factorization is important for us since it drives
> >> > recommendation when modeling user click/no-click, and also topic
> >> > modeling, to handle 0 counts in document x word matrices through NMF
> >> > and Sparse Coding.
> >> >
> >> > I am a bit confused on this code:
> >> >
> >> > val c1 = alpha * math.abs(rating)
> >> > if (rating > 0) ls.add(srcFactor, (c1 + 1.0)/c1, c1)
> >> >
> >> > When alpha = 1.0 (high confidence) and rating is > 0 (true for word
> >> > counts), why does this formula not become the same as the explicit
> >> > formula:
> >> >
> >> > ls.add(srcFactor, rating, 1.0)
> >> >
> >> > For modeling documents, I believe the implicit Y'Y needs to stay, but
> >> > we need the explicit ls.add(srcFactor, rating, 1.0)
> >> >
> >> > I am still working through the confidence code. Please let me know if
> >> > the idea of mapping implicit to handle 0 counts in the document x word
> >> > matrix makes sense.
> >> >
> >> > Thanks.
> >> > Deb
> >> >
> >
> >
>
