Date: Sat, 13 Jun 2015 06:40:00 +0000 (UTC)
From: "lariven (JIRA)"
To: dev@mahout.apache.org
Subject: [jira] [Updated] (MAHOUT-1739) maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correct

     [ https://issues.apache.org/jira/browse/MAHOUT-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lariven updated MAHOUT-1739:
----------------------------
    Fix Version/s:     (was: 0.10.1)

> maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correct
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-1739
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1739
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.10.0
>            Reporter: lariven
>              Labels: easyfix, patch
>         Attachments: fix_maxSimilarItemsPerItem_incorrect_behave.patch
>
>
> The similar items output by ItemSimilarityJob
> for each target item may exceed the number set by the maxSimilarItemsPerItem parameter. The following code in ItemSimilarityJob.java, at about line 200, may be the cause:
>           if (itemID < otherItemID) {
>             ctx.write(new EntityEntityWritable(itemID, otherItemID), new DoubleWritable(similarItem.getSimilarity()));
>           } else {
>             ctx.write(new EntityEntityWritable(otherItemID, itemID), new DoubleWritable(similarItem.getSimilarity()));
>           }
> I don't know why itemID needs to be swapped with otherItemID, but I think a single line is enough:
>           ctx.write(new EntityEntityWritable(itemID, otherItemID), new DoubleWritable(similarItem.getSimilarity()));



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
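
The effect described in the report can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: it does not use any Mahout or Hadoop classes, the class name MaxSimilarItemsSketch, the helper methods, and the toy data are all hypothetical, and it assumes that each input row has already been truncated to the K most similar items, that duplicate pairs collapse into one output record, and that the output is read with the first ID of each pair as the target item.

```java
import java.util.*;

class MaxSimilarItemsSketch {

    // Hypothetical cap, standing in for maxSimilarItemsPerItem.
    static final int K = 2;

    // Hypothetical toy data: each item mapped to its K most similar items.
    static Map<Long, List<Long>> topK() {
        Map<Long, List<Long>> rows = new TreeMap<>();
        rows.put(1L, Arrays.asList(2L, 3L));
        rows.put(2L, Arrays.asList(1L, 3L));
        rows.put(3L, Arrays.asList(1L, 2L));
        rows.put(4L, Arrays.asList(1L, 2L));
        rows.put(5L, Arrays.asList(1L, 2L));
        return rows;
    }

    // Emits (first, second) pairs and groups them by the first ID.
    // With swapToAscending = true the smaller ID is always written first,
    // mirroring the if/else quoted in the report; with false the pair is
    // written as (itemID, otherItemID), mirroring the proposed single line.
    static Map<Long, Set<Long>> emit(Map<Long, List<Long>> rows, boolean swapToAscending) {
        Map<Long, Set<Long>> out = new TreeMap<>();
        for (Map.Entry<Long, List<Long>> row : rows.entrySet()) {
            long itemID = row.getKey();
            for (long otherItemID : row.getValue()) {
                long first = itemID;
                long second = otherItemID;
                if (swapToAscending && itemID > otherItemID) {
                    first = otherItemID;
                    second = itemID;
                }
                // A set models the assumption that duplicate pairs collapse.
                out.computeIfAbsent(first, k -> new TreeSet<>()).add(second);
            }
        }
        return out;
    }

    // Largest number of entries recorded under any single first ID.
    static int maxPerItem(Map<Long, Set<Long>> out) {
        int max = 0;
        for (Set<Long> similar : out.values()) {
            max = Math.max(max, similar.size());
        }
        return max;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> swapped = emit(topK(), true);
        Map<Long, Set<Long>> asIs = emit(topK(), false);
        System.out.println("cap K = " + K);
        System.out.println("swapped: " + swapped + " max = " + maxPerItem(swapped));
        System.out.println("as-is:   " + asIs + " max = " + maxPerItem(asIs));
    }
}
```

With this toy data, the swapped variant records four entries under item 1 (items 2, 3, 4 and 5) even though the cap is 2, because items 4 and 5 each list item 1 among their own top two. The as-is variant keeps every item at exactly two entries. Whether the ordering in the real job serves another purpose, such as de-duplicating symmetric pairs, is not something this sketch can answer.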