hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "NewsPersonalizationSystem" by udanax
Date Fri, 11 Jan 2008 15:52:34 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/NewsPersonalizationSystem

------------------------------------------------------------------------------
  == Initial Contributors ==
  
   * [:udanax:Edward Yoon] (R&D center, NHN corp.)
- == Algorithm Overview ==
+ = Algorithm Overview =
  
   * Obtain a list of candidate stories
   * For each story:
@@ -23, +23 @@

    * May not be clustered
    * Rely on co-visitation to generate recommendations
  ----
- == NPS Architecture ==
+ = NPS Architecture =
  
  {{{
                                               +----------------------------+
@@ -49, +49 @@

  +-------------------------------------------------------------------------+
  }}}
  ----
- == Clustering Algorithms ==
+ = Clustering Algorithms =
- === User clustering - MinHash ===
+ == User clustering - MinHash ==
  
   * Input: User and his clicked stories
    * ~+S+~,,u,, = {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m,,}
   * User similarity = | S,,u1,, ∩ S,,u2,, | / | S,,u1,, ∪ S,,u2,, |
   * Output: User clusters. 
   * Similar users belong to same cluster
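
The user similarity above is the Jaccard coefficient of two click sets. A minimal sketch, assuming click histories are plain collections of story ids (the function name `user_similarity` is illustrative, not taken from the wiki page):

```python
def user_similarity(clicks_u1, clicks_u2):
    """Jaccard similarity |S_u1 ∩ S_u2| / |S_u1 ∪ S_u2| of two click histories."""
    s1, s2 = set(clicks_u1), set(clicks_u2)
    union = s1 | s2
    if not union:
        # Assumption: two users with no clicks at all are treated as dissimilar.
        return 0.0
    return len(s1 & s2) / len(union)


if __name__ == "__main__":
    print(user_similarity(["a", "b", "c"], ["b", "c", "d"]))
```

Computing this for every pair of users is quadratic in the number of users, which is presumably why the page turns to MinHash next.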
- ==== MinHash ====
+ === MinHash ===
  
   * Randomly permute the universe of clicked stories
    * {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m,,} = {s^'^,,1,, , s^'^,,2,, , ... , s^'^,,m,,}
@@ -68, +68 @@

   * Treat !MinHash value as !ClusterId
   * Probabilistic clustering
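
The MinHash steps visible in this excerpt (permute the story universe, then treat the MinHash value as the cluster id) can be sketched as follows. This is a single-permutation illustration under stated assumptions: the helper names, the seeded in-memory shuffle, and the choice to skip users with no clicks are all hypothetical, and the steps elided from the diff are not reproduced.

```python
import random


def make_permutation(universe, seed):
    """Map each story to its rank under a seeded random shuffle of the universe."""
    order = sorted(universe)
    random.Random(seed).shuffle(order)
    return {story: rank for rank, story in enumerate(order)}


def minhash(clicked, ranks):
    """MinHash of a click set: the clicked story with the smallest permuted rank."""
    return min(clicked, key=ranks.__getitem__)


def cluster_users(clicks_by_user, seed=0):
    """Group users by MinHash value, using that value as the cluster id."""
    universe = set().union(*clicks_by_user.values())
    ranks = make_permutation(universe, seed)
    clusters = {}
    for user, clicked in clicks_by_user.items():
        if clicked:  # assumption: users without clicks are left unclustered
            clusters.setdefault(minhash(clicked, ranks), []).append(user)
    return clusters


if __name__ == "__main__":
    clicks = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"c"}}
    print(cluster_users(clicks, seed=1))
```

Under a uniformly random permutation, two users receive the same MinHash value with probability equal to their Jaccard similarity, which is what makes the clustering probabilistic.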
  
- === Clustering - PLSI Algorithm ===
+ == Clustering - PLSI Algorithm ==
  
   * Learning (done offline)
    * ML estimation 
@@ -77, +77 @@

    * P[zj|u]’s lead to a soft clustering of users
   * Runtime: only the P[zj|u]’s are used; the P[s|zj]’s are ignored
  
- === Covisitation count ===
+ == Covisitation count ==
  
   * For each story si, store its covisitation counts c(si, sj) with every other story sj
   * Candidate story: sk
@@ -85, +85 @@

   * score(si, sj) = c(si, sj) / ∑m c(si, sm)
   * total_score(sk) = ∑n score(sk, sn)
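
The two scoring formulas above can be sketched as below. The excerpt does not show what the index n ranges over, so this sketch assumes it is the stories in the user's click history; it also assumes that a covisitation means two stories appearing in the same session. Both assumptions, and all function names, are illustrative.

```python
from collections import defaultdict


def build_covisitation(sessions):
    """Count c(si, sj): the number of sessions in which both si and sj appear."""
    counts = defaultdict(lambda: defaultdict(int))
    for stories in sessions:
        unique = set(stories)
        for si in unique:
            for sj in unique:
                if si != sj:
                    counts[si][sj] += 1
    return counts


def score(counts, si, sj):
    """score(si, sj) = c(si, sj) / sum over m of c(si, sm)."""
    row = counts.get(si, {})
    total = sum(row.values())
    if total == 0:
        return 0.0  # si was never covisited with anything
    return row.get(sj, 0) / total


def total_score(counts, sk, history):
    """total_score(sk) = sum of score(sk, sn) over sn in the assumed click history."""
    return sum(score(counts, sk, sn) for sn in history)


if __name__ == "__main__":
    counts = build_covisitation([["a", "b"], ["a", "c"], ["a", "b", "c"]])
    print(score(counts, "a", "b"), total_score(counts, "a", ["b", "c"]))
```

Normalising by the row total keeps very popular stories from dominating purely because they are covisited with everything.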
  ----
- == References ==
+ = References =
   * Google News Personalization: Scalable Online Collaborative Filtering
   * Bigtable paper: OSDI 2006
  
