mahout-dev mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] Commented: (MAHOUT-407) Limit the number of similar items per item in the ItemSimilarityJob
Date Wed, 16 Jun 2010 17:46:24 GMT


Sean Owen commented on MAHOUT-407:

Looks fine in principle, but could I ask you to bring it up to date with head? It's not your
fault; I should have reviewed and submitted it earlier, but it now conflicts with other recent
changes. I think you're in the best position to bring it up to date.

> Limit the number of similar items per item in the ItemSimilarityJob
> -------------------------------------------------------------------
>                 Key: MAHOUT-407
>                 URL:
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>         Attachments: MAHOUT-407.patch
> To keep the item-similarity matrix sparse, it would be a useful improvement
> to add an option like "maxSimilaritiesPerItem" to,
> which would make it try to cap the number of similar items per item.
> However, because each similarity pair is stored only once, a single item can end up with
> more than "maxSimilaritiesPerItem" similar items: some pairs can't be dropped because the
> other item in the pair might otherwise have too few similarities.
> A default value of 100 co-occurrences (similarities) will be used because this is already
> the default in the distributed recommender.
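
The capping rule described above can be sketched in plain Java. This is only an illustration of the idea, not the code in MAHOUT-407.patch or in Mahout itself: the class SimilarityCapSketch, the Pair type and the cap method are hypothetical names, and the sketch assumes in-memory data rather than a Hadoop job. A pair is kept if it ranks among the top "maxSimilaritiesPerItem" pairs for at least one of its two items.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not taken from Mahout's ItemSimilarityJob.
final class SimilarityCapSketch {

  /** One similarity pair, stored only once for both of its items. */
  static final class Pair {
    final long itemA;
    final long itemB;
    final double similarity;

    Pair(long itemA, long itemB, double similarity) {
      this.itemA = itemA;
      this.itemB = itemB;
      this.similarity = similarity;
    }
  }

  /**
   * Keeps a pair if it is among the top maxSimilaritiesPerItem pairs
   * (by similarity) of at least one of its two items.
   * Assumes maxSimilaritiesPerItem is non-negative.
   */
  static List<Pair> cap(List<Pair> pairs, int maxSimilaritiesPerItem) {
    // Index every pair under both of its items.
    Map<Long, List<Pair>> byItem = new LinkedHashMap<>();
    for (Pair p : pairs) {
      byItem.computeIfAbsent(p.itemA, k -> new ArrayList<>()).add(p);
      byItem.computeIfAbsent(p.itemB, k -> new ArrayList<>()).add(p);
    }

    // A set, so a pair selected by both of its items is kept only once.
    Set<Pair> kept = new LinkedHashSet<>();
    for (List<Pair> candidates : byItem.values()) {
      // Highest similarity first.
      candidates.sort((x, y) -> Double.compare(y.similarity, x.similarity));
      int limit = Math.min(maxSimilaritiesPerItem, candidates.size());
      kept.addAll(candidates.subList(0, limit));
    }
    return new ArrayList<>(kept);
  }
}
```

With a cap of 1 and the pairs (1,2), (1,3), (1,4) and (2,3), where (2,3) has the lowest similarity, item 1 still ends up with three similar items: items 3 and 4 each select their pair with item 1 as their best one, so those pairs cannot be dropped. Only (2,3) is removed. Calling cap(pairs, 100) would correspond to the default proposed in the description.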

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.