mahout-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "lariven (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAHOUT-1739) maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correct
Date Sat, 13 Jun 2015 06:40:00 GMT

     [ https://issues.apache.org/jira/browse/MAHOUT-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

lariven updated MAHOUT-1739:
----------------------------
    Fix Version/s:     (was: 0.10.1)

> maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correct
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-1739
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1739
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.10.0
>            Reporter: lariven
>              Labels: easyfix, patch
>         Attachments: fix_maxSimilarItemsPerItem_incorrect_behave.patch
>
>
> the output similar items of ItemSimilarityJob for each target item may exceed the number
of similar items we set to maxSimilarItemsPerItem  parameter. the following code of ItemSimilarityJob.java
about line NO. 200 may affect:
>         if (itemID < otherItemID) {
>           ctx.write(new EntityEntityWritable(itemID, otherItemID), new DoubleWritable(similarItem.getSimilarity()));
>         } else {
>           ctx.write(new EntityEntityWritable(otherItemID, itemID), new DoubleWritable(similarItem.getSimilarity()));
>         }
> Don't know why need to switch itemID with otherItemID, but I think a single line is enough:
>           ctx.write(new EntityEntityWritable(itemID, otherItemID), new DoubleWritable(similarItem.getSimilarity()));



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message