mahout-dev mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: co-occurrence paper and code
Date Mon, 11 Aug 2014 23:53:18 GMT
No, the question was why the total number of trials passed to LLR is taken
to be m, where M is the m x n user/item matrix.

OK, I got it. I made an incorrect assumption about the earlier steps in the
code, hence my inference derailed.
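
For reference, here is a rough sketch of how the counts line up when the
occurrence matrix is binarized, as Ted describes below. Each user is one
trial, so the four cells of the 2x2 contingency table for an item pair (i, j)
sum to m. This is not the Mahout code; the function names (x_log_x, entropy,
llr, cooccurrence_llr) and the list-of-lists input are assumptions made for
illustration, and the formula is the usual entropy form of the G^2 statistic.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is taken to be 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy: N*log(N) - sum(k*log(k)), with N = sum(counts).
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio (G^2) for a 2x2 contingency table.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def cooccurrence_llr(rows, i, j):
    # rows: hypothetical binarized m x n user/item matrix, one 0/1 list per user.
    m = len(rows)                          # number of users = number of trials
    n_i = sum(r[i] for r in rows)          # users who interacted with item i
    n_j = sum(r[j] for r in rows)          # users who interacted with item j
    k11 = sum(r[i] * r[j] for r in rows)   # users with both: cooccurrence entry a_ij
    k12 = n_i - k11                        # item i only
    k21 = n_j - k11                        # item j only
    k22 = m - n_i - n_j + k11              # neither item
    return llr(k11, k12, k21, k22)
```

Note that k11 + k12 + k21 = n_i + n_j - a_ij is the number of users who
touched either item, and adding k22 brings the total to m.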




On Mon, Aug 11, 2014 at 4:07 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> If you binarize the original occurrence matrix then it seems to me that the
> values in the cooccurrence matrix *are* user counts.
>
> Perhaps I misunderstand your original question.
>
>
>
> On Mon, Aug 11, 2014 at 3:44 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>
> > Sorry, rather: the total occurrences of a pair should be sum(a_i) +
> > sum(a_j) - a_ij (not the 1-norm, of course).
> >
> >
> > On Mon, Aug 11, 2014 at 3:11 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> > wrote:
> >
> > > Why does the cooccurrence code take the number of users as total
> > > interactions? Shouldn't that be the 1-norm of the co-occurrence matrix?
> > >
> > >
> > > On Wed, Aug 6, 2014 at 4:21 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> > > wrote:
> > >
> > > >> So, compared to original paper [1], similarity is now hardcoded and
> > > >> always LLR? Do we have any plans to parameterize that further? Is
> > > >> there any reason to parameterize it?
> > >>
> > >>
> > > >> Also, reading the paper, I am wondering a bit -- similarity and
> > > >> distance are functions that usually move in opposite directions
> > > >> (e.g. cosine similarity and angular distance), but in the paper
> > > >> distance scores are also considered similarities? How's that?
> > >>
> > >> I suppose in that context LLR is considered a distance (higher scores
> > >> mean more `distant` items, co-occurring by chance only)?
> > >>
> > >> [1] http://ssc.io/wp-content/uploads/2012/06/rec11-schelter.pdf
> > >>
> > >> -d
> > >>
> > >
> > >
> >
>
