mahout-user mailing list archives

From 胡仲义 <sunnydd...@gmail.com>
Subject The performance of mahout's recommender.
Date Sun, 30 Sep 2012 12:07:11 GMT
Hi, I am a Mahout user and I am confused by the performance of Mahout's
recommender.

I have a preference data set from an e-commerce platform. Each line of the
data file represents a single preference in the form
userID,itemID,rating value. The input is a 7.8 GB text file containing
370,250,381 lines of user-item-preference associations, from 132,598,906
users to 35,920,654 distinct items. I use Mahout to recommend 10 items for
each user with org.apache.mahout.cf.taste.hadoop.item.RecommenderJob on a
Hadoop cluster of 250 Linux servers. The command is as follows:

$ ./mahout org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
    -i input/input.txt -o output -s SIMILARITY_LOGLIKELIHOOD \
    --usersFile input/users.txt --numRecommendations 10 --tempDir temp
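
To make the input format concrete, here is a minimal Python sketch. It is
illustrative only and not part of the actual job: the sample lines and the
helper name parse_preference are made up, and the IDs are assumed to be
integers. It parses userID,itemID,rating lines and counts users, items, and
preferences per user:

```python
from collections import Counter


def parse_preference(line):
    """Parse one 'userID,itemID,rating' line into (int, int, float)."""
    user_id, item_id, rating = line.strip().split(",")
    return int(user_id), int(item_id), float(rating)


# Hypothetical sample lines; the real input file has the same shape.
sample_lines = ["1,101,5.0", "1,102,3.0", "2,101,4.0"]

prefs = [parse_preference(line) for line in sample_lines]

# Number of preferences recorded for each user.
prefs_per_user = Counter(user for user, _, _ in prefs)

users = len(prefs_per_user)
items = len({item for _, item, _ in prefs})

# Users, distinct items, and average preferences per user.
print(users, items, len(prefs) / users)
```

Over the full data set the same ratio is 370,250,381 / 132,598,906, which is
roughly 2.8 preferences per user.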


However, the performance disappointed me: the job took 23 hours to produce
the result. I would like to know whether this is normal, or whether there
are ways to improve the performance.

Thanks.

--Hu Zhy