mahout-dev mailing list archives

From "Chris Newell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-667) Persistent storage of factorizations in SVDRecommender
Date Tue, 17 May 2011 10:59:47 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034705#comment-13034705 ]

Chris Newell commented on MAHOUT-667:
-------------------------------------

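Found a bug in AbstractFactorizer, which I introduced because I misunderstood how FastByIDMap
behaves: put() returns the value previously associated with the key (null for a new ID), not
the value just stored, so the methods below return null the first time an ID is seen.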

These two methods:

{code} 
  protected Integer userIndex(long userID) {
    Integer userIndex = userIDMapping.get(userID);
    if (userIndex == null) {
      userIndex = userIDMapping.put(userID, userIDMapping.size());
    }
    return userIndex;
  }

  protected Integer itemIndex(long itemID) {
    Integer itemIndex = itemIDMapping.get(itemID);
    if (itemIndex == null) {
      itemIndex = itemIDMapping.put(itemID, itemIDMapping.size());
    }
    return itemIndex;
  }
{code} 

Should be replaced by:

{code}
  protected Integer getUserIndex(long userID) {
    Integer userIndex = userIDMapping.get(userID);
    if (userIndex == null) {
      userIndex = userIDMapping.size();
      userIDMapping.put(userID, userIndex);
    }
    return userIndex;
  }

  protected Integer getItemIndex(long itemID) {
    Integer itemIndex = itemIDMapping.get(itemID);
    if (itemIndex == null) {
      itemIndex = itemIDMapping.size();
      itemIDMapping.put(itemID, itemIndex);
    }
    return itemIndex;
  }
{code}
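
To illustrate the difference between the two versions, here is a minimal, self-contained sketch. It uses java.util.HashMap as a stand-in for FastByIDMap, on the assumption that both follow the same contract for put() (returning the previous value, or null if there was none). The class name IndexMappingDemo and the helper names buggyIndex and fixedIndex are hypothetical and are not part of Mahout.

```java
import java.util.HashMap;
import java.util.Map;

public class IndexMappingDemo {

  // Mirrors the original methods: put() returns the value previously stored
  // under the key, which is null for a new key, so the first lookup of any
  // ID returns null instead of the newly assigned index.
  static Integer buggyIndex(Map<Long, Integer> mapping, long id) {
    Integer index = mapping.get(id);
    if (index == null) {
      index = mapping.put(id, mapping.size());
    }
    return index;
  }

  // Mirrors the proposed replacement: compute the new index first, store it,
  // and return it, ignoring the return value of put().
  static Integer fixedIndex(Map<Long, Integer> mapping, long id) {
    Integer index = mapping.get(id);
    if (index == null) {
      index = mapping.size();
      mapping.put(id, index);
    }
    return index;
  }

  public static void main(String[] args) {
    Map<Long, Integer> buggy = new HashMap<>();
    System.out.println(buggyIndex(buggy, 42L)); // null on first sight of the ID
    System.out.println(buggyIndex(buggy, 42L)); // 0 once the ID is stored

    Map<Long, Integer> fixed = new HashMap<>();
    System.out.println(fixedIndex(fixed, 42L)); // 0
    System.out.println(fixedIndex(fixed, 7L));  // 1
    System.out.println(fixedIndex(fixed, 42L)); // 0 again, mapping is stable
  }
}
```

With the buggy variant, a caller that unboxes the result to an int on the first call for a given ID would hit a NullPointerException; the fixed variant returns the assigned index on every call.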

> Persistent storage of factorizations in SVDRecommender
> ------------------------------------------------------
>
>                 Key: MAHOUT-667
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-667
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5
>            Reporter: Chris Newell
>            Assignee: Sebastian Schelter
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: persistent_svd.patch, persistent_svd_v2.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> As discussed previously (https://issues.apache.org/jira/browse/MAHOUT-640), it would be
> beneficial to provide a persistent storage mechanism for factorizations created by SVDRecommender
> (in package org.apache.mahout.cf.taste.impl.recommender.svd), as these can be time-consuming
> to produce. It would also allow factorizations to be computed on one machine and then distributed
> to other machines providing predictions, improving efficiency and scalability.
> A "persistence strategy" interface that could be implemented as required has been suggested.
> I'll try to post an outline proposal for discussion in the next few days,
> but any comments or suggestions would be very welcome.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
