mahout-dev mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAHOUT-407) Limit the number of similar items per item in the ItemSimilarityJob
Date Wed, 16 Jun 2010 17:46:24 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879415#action_12879415 ]

Sean Owen commented on MAHOUT-407:
----------------------------------

Looks fine in principle, but could I ask you to bring it up to date with head? It's not your
fault; I should have reviewed and submitted it earlier, but it now conflicts with other recent
changes. I think you're in the best position to bring it up to date.

> Limit the number of similar items per item in the ItemSimilarityJob
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-407
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-407
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>         Attachments: MAHOUT-407.patch
>
>
> To keep the item-similarity matrix sparse, it would be a useful improvement to add an
> option like "maxSimilaritiesPerItem" to o.a.m.cf.taste.hadoop.similarity.item.ItemSimilarityJob,
> which would make it try to cap the number of similar items per item.
> However, because we store each similarity pair only once, a single item can still end up
> with more than "maxSimilaritiesPerItem" similar items: we can't drop some of the pairs,
> because the other item in the pair might otherwise have too few similarities.
> A default value of 100 co-occurrences (similarities) will be used because this is already
> the default in the distributed recommender.
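
The capping rule described above can be illustrated with a minimal in-memory sketch. This is not the code from MAHOUT-407.patch and not part of the Mahout API; the class SimilarityCapSketch, the Pair type, and the cap and countFor methods are hypothetical names chosen for illustration. It assumes the rule is "keep a pair if it ranks among the top maxSimilaritiesPerItem pairs of either of its two items", which is one way to read the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: hypothetical names, not taken from the Mahout code base. */
public class SimilarityCapSketch {

  /** One undirected similarity pair, stored only once. */
  public static final class Pair {
    final long itemA;
    final long itemB;
    final double value;

    public Pair(long itemA, long itemB, double value) {
      this.itemA = itemA;
      this.itemB = itemB;
      this.value = value;
    }
  }

  /** Keeps a pair if it is among the top maxPerItem pairs of either of its items. */
  public static List<Pair> cap(List<Pair> pairs, int maxPerItem) {
    // Group the pairs by every item they touch.
    Map<Long, List<Pair>> byItem = new HashMap<>();
    for (Pair p : pairs) {
      byItem.computeIfAbsent(p.itemA, k -> new ArrayList<>()).add(p);
      if (p.itemB != p.itemA) {
        byItem.computeIfAbsent(p.itemB, k -> new ArrayList<>()).add(p);
      }
    }
    // For each item, mark its maxPerItem most similar pairs as needed.
    Set<Pair> needed = new HashSet<>();
    for (List<Pair> itemPairs : byItem.values()) {
      itemPairs.sort((x, y) -> Double.compare(y.value, x.value));
      needed.addAll(itemPairs.subList(0, Math.min(maxPerItem, itemPairs.size())));
    }
    // A pair survives if at least one of its items needs it; input order is preserved.
    List<Pair> kept = new ArrayList<>();
    for (Pair p : pairs) {
      if (needed.contains(p)) {
        kept.add(p);
      }
    }
    return kept;
  }

  /** Counts how many pairs in the list involve the given item. */
  public static int countFor(List<Pair> pairs, long item) {
    int count = 0;
    for (Pair p : pairs) {
      if (p.itemA == item || p.itemB == item) {
        count++;
      }
    }
    return count;
  }
}
```

With the pairs (1,2)=0.9, (1,3)=0.8, (1,4)=0.7, (2,3)=0.1 and a cap of 1, this sketch drops only (2,3). Item 1 still keeps three similar items, because (1,3) and (1,4) are the best pairs of items 3 and 4. That is the overshoot the description warns about.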

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

