mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Adding weighting to boolean data
Date Thu, 22 Jul 2010 00:02:57 GMT
This is, roughly, a reasonable thing to do.

If you want to maintain the fiction of counts a little bit more closely, you
might consider just having counts decay over time and having short visits
only give partial credit.
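
As a rough illustration of decayed counts with partial credit, here is a minimal Python sketch. The 30-day half-life, the 60-second full-credit threshold, and the function names are all assumptions chosen for the example; none of them come from Mahout or from this thread.

```python
# Assumed tuning constants, purely illustrative.
HALF_LIFE_DAYS = 30.0        # a visit loses half its weight every 30 days
FULL_CREDIT_SECONDS = 60.0   # views at least this long count as one full visit


def visit_weight(age_days, duration_seconds):
    """Fractional 'count' contributed by a single page visit."""
    # Exponential decay: weight halves for every HALF_LIFE_DAYS of age.
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    # Partial credit: short views count proportionally, capped at 1.0.
    credit = min(duration_seconds / FULL_CREDIT_SECONDS, 1.0)
    return decay * credit


def weighted_count(visits):
    """Sum the fractional counts over (age_days, duration_seconds) pairs."""
    return sum(visit_weight(age, duration) for age, duration in visits)
```

A fresh one-minute view contributes 1.0, the same view 30 days later contributes 0.5, and a fresh 30-second view also contributes 0.5, so the totals still behave like counts.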

On Wed, Jul 21, 2010 at 3:54 PM, Dave Williford <dave.williford@gmail.com> wrote:

> We are currently using LogLikelihoodSimilarity to create item
> recommendations based on page visits on our web site.  We would like to
> influence the generated recommendations for such factors as age of visit
> (weigh more recent visits more heavily), duration of page view (longer is
> better), same visit is better than cross-visit (things looked at on the
> same day are more related than items looked at by a given user across
> visits).
>
> I am considering introducing scores for each user/page data point.  This
> would essentially replace the integer calculations (which are based on
> summing total data points for each item, total items, and the intersection
> of item A with item B) with real numbers.  We could always round the sums
> to integers before sending through the loglikelihood calculation although
> I am not sure this is necessary.
>
> Note these scores are not the same conceptually as preferences so I don't
> think switching to a preference based algorithm would give satisfactory
> results.
>
> I am very new to all of this and am wondering if I am completely off base
> or if this seems like a valid approach.  Any input is much appreciated.
>
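
On the question of rounding: the standard log-likelihood ratio (G^2) for a 2x2 contingency table is well defined for any non-negative real counts, so the formula itself does not require integers. The sketch below is an illustration in Python of that statistic in its entropy form. It is not Mahout's implementation, and the function and argument names are assumptions made for the example.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy: N ln N minus the sum of k ln k, where N = sum(counts)."""
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 for a 2x2 table of (possibly fractional) co-occurrence counts.

    k11: weight where A and B occur together
    k12: weight where A occurs without B
    k21: weight where B occurs without A
    k22: weight where neither occurs
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))
```

Note that the result scales linearly with the counts: multiplying all four cells by the same factor multiplies the score by that factor, so heavy down-weighting lowers the scores even when the pattern of co-occurrence is unchanged.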
