mahout-user mailing list archives

From Serega Sheypak <serega.shey...@gmail.com>
Subject itemsimilarity returns few similar items
Date Thu, 24 Jul 2014 06:57:46 GMT
Hi, I'm trying to calculate item similarity using ItemSimilarityJob.
Here are my counts:
input dataset: user_id, item_id, pref: 16M
distinct items: 700K
distinct users: 4M

Bucketed preferences per user:
count_of_preferences, count_of_users
1                                   2M
2                                   600K
3                                   300K
4                                   300R
......
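
As a rough sanity check on these counts, here is a minimal Python sketch. It assumes only that a user with n preferences can contribute at most n*(n-1)/2 item co-occurrence pairs, so users with a single preference contribute none. The helper names are hypothetical, and only the first three buckets are used because the rest of the table is elided above.

```python
# Hypothetical helpers for illustration; not part of Mahout.

def max_pairs_per_user(n_prefs: int) -> int:
    """Upper bound on item pairs that one user with n_prefs preferences can co-rate."""
    return n_prefs * (n_prefs - 1) // 2


def pair_upper_bound(buckets: dict) -> int:
    """Sum the per-user upper bound over all buckets (n_prefs -> user count)."""
    return sum(users * max_pairs_per_user(n) for n, users in buckets.items())


# Bucket counts copied from the table above (first three rows only).
buckets = {1: 2_000_000, 2: 600_000, 3: 300_000}

if __name__ == "__main__":
    for n_prefs, users in sorted(buckets.items()):
        print(n_prefs, users, users * max_pairs_per_user(n_prefs))
    print("total", pair_upper_bound(buckets))
```

This is an upper bound on co-occurrences only (duplicates across users are not merged); it says nothing about how many pairs survive the similarity threshold.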

threshold: 0.91
similarityClassname=PEARSON

It returns ~2000 rows for ~1000 distinct items.
What am I doing wrong?
