mahout-dev mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] Resolved: (MAHOUT-423) Optimize getNumUsersWithPreferenceFor(long... itemIDs)
Date Tue, 22 Jun 2010 17:18:55 GMT


Sean Owen resolved MAHOUT-423.

    Resolution: Fixed

I committed a variant of this patch that also makes the two methods consistent in behavior
and implementation, and updates the Javadoc.

> Optimize getNumUsersWithPreferenceFor(long... itemIDs)
> ------------------------------------------------------
>                 Key: MAHOUT-423
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.3
>            Reporter: Jonathan Young
>            Assignee: Sean Owen
>            Priority: Minor
>             Fix For: 0.4
>         Attachments: MAHOUT-423.patch
> I ran a simple collaborative filtering application using a GenericBooleanPrefDataModel
> built from (a subset of) the Netflix data, Tanimoto similarity, and the GenericItemBasedRecommender,
> and then called recommender.mostSimilarItems() (a lot).
> Profiling indicated that the majority of the time was spent in
> GenericBooleanPrefDataModel.getNumUsersWithPreferenceFor(long... itemIDs). The version in
> GenericDataModel is optimized for the cases of one and two itemIDs, but the version in
> GenericBooleanPrefDataModel always computes the intersection set.
> I can create a patch which optimizes the two cases of itemIDs.length == 1 and
> itemIDs.length == 2 (similar to the version in GenericDataModel), but perhaps the code
> should be refactored if these are really the most common cases.
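
For readers who want to see the shape of the optimization described in the report, here is a
minimal sketch. It is not the committed Mahout code or the attached patch. The class
PreferenceCounts, its usersByItem map, and the addPreference helper are hypothetical stand-ins
for the data model's internal item-to-users index, and throwing on an empty argument list is an
assumption of this sketch. The idea is to answer the one-item case from the set size, count
the two-item overlap without allocating a new set, and build an intersection set only for
three or more item IDs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a boolean-preference data model; not a Mahout class.
public class PreferenceCounts {

    // itemID -> IDs of users who have expressed a preference for that item
    private final Map<Long, Set<Long>> usersByItem = new HashMap<>();

    public void addPreference(long userID, long itemID) {
        usersByItem.computeIfAbsent(itemID, k -> new HashSet<>()).add(userID);
    }

    public int getNumUsersWithPreferenceFor(long... itemIDs) {
        if (itemIDs.length == 0) {
            // Assumption of this sketch: an empty argument list is rejected.
            throw new IllegalArgumentException("At least one item ID is required");
        }
        Set<Long> first = usersFor(itemIDs[0]);
        if (itemIDs.length == 1) {
            // Fast path: no intersection needed at all.
            return first.size();
        }
        if (itemIDs.length == 2) {
            // Fast path: iterate the smaller set and probe the larger one,
            // counting matches without allocating an intersection set.
            Set<Long> second = usersFor(itemIDs[1]);
            Set<Long> smaller = first.size() <= second.size() ? first : second;
            Set<Long> larger = smaller == first ? second : first;
            int count = 0;
            for (Long userID : smaller) {
                if (larger.contains(userID)) {
                    count++;
                }
            }
            return count;
        }
        // General case: copy the first set and narrow it item by item,
        // stopping early once no users remain.
        Set<Long> intersection = new HashSet<>(first);
        for (int i = 1; i < itemIDs.length && !intersection.isEmpty(); i++) {
            intersection.retainAll(usersFor(itemIDs[i]));
        }
        return intersection.size();
    }

    private Set<Long> usersFor(long itemID) {
        return usersByItem.getOrDefault(itemID, Collections.emptySet());
    }
}
```

Item-to-item similarity measures such as Tanimoto ask for the one-item and two-item counts
repeatedly, which is why those two paths are worth special-casing in a workload like the one
reported.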

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
