mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Run ItemSimilarityJob Problem
Date Tue, 14 Apr 2015 18:59:57 GMT
Use

mahout itemsimilarity …

But be aware that you have to convert all your user and item IDs into non-negative ints. Inside
Mahout-MapReduce they are assumed to be row and column numbers in one big matrix of all
input.
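
A minimal sketch of that ID conversion, for illustration only. It assumes the input rows are (userID, itemID, optional preference) tuples, as you would get from comma-separated lines; the function name remap_ids and the sample IDs are made up here and are not part of Mahout. Users and items get separate numbering because they index rows and columns respectively.

```python
def remap_ids(rows):
    """Replace arbitrary user/item IDs with non-negative ints.

    rows: iterable of (user_id, item_id, *rest) tuples.
    Returns (mapped_rows, user_index, item_index); the two dicts map each
    original ID to the int assigned in first-seen order, starting at 0.
    """
    user_index, item_index = {}, {}
    mapped = []
    for user, item, *rest in rows:
        # setdefault keeps an existing number or assigns the next free one.
        u = user_index.setdefault(user, len(user_index))
        i = item_index.setdefault(item, len(item_index))
        mapped.append((u, i, *rest))
    return mapped, user_index, item_index


def invert(index):
    """Build the int -> original ID lookup needed to read the job output."""
    return {number: original for original, number in index.items()}


# Hypothetical sample data, not taken from the thread.
sample = [("alice", "book-9", "1"), ("bob", "book-9", "1"), ("alice", "dvd-2", "1")]
mapped, users, items = remap_ids(sample)
```

Keep the two dictionaries around: the similarity output will contain the int item IDs, and invert(items) translates them back to the original ones.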

BTW, there is no need to move data: Mahout-Spark reads anything Mahout-MapReduce can read, without the
ID restrictions.

On Apr 12, 2015, at 8:04 PM, lastarsenal <lastarsenal@163.com> wrote:

Hi, Pat,
   I think it would be better to stay with the existing system instead of doing a large-scale data
transfer.


  So I would really appreciate it if somebody could give advice based on Hadoop. Thank you.





At 2015-04-13 00:33:48, "Pat Ferrel" <pat@occamsmachete.com> wrote:
> You are invoking it incorrectly, but I’d suggest using the newer Spark version. It’s easier to use and about 10x faster.
> 
> You’ll need to install Spark alongside Mahout then invoke with:
> 
> mahout spark-itemsimilarity -i input -o output ….
> 
> The driver is documented here: http://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html
> 
> 
> On Apr 11, 2015, at 12:34 AM, lastarsenal <lastarsenal@163.com> wrote:
> 
> Hi,
> 
> I'm new to Mahout. Recently, when I tried to run ItemSimilarityJob on my own Hadoop cluster, I ran into a problem. The command is:
> 
> hadoop jar mahout-core-0.9-job.jar org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob -i /home/hadoop/itembased/user_item -o /home/hadoop/itembased/output -s SIMILARITY_EUCLIDEAN_DISTANCE -mp 0 -b true --startPhase 0 --endPhase 0
> 
> 
> There is 1 error:
> 15/04/10 15:06:02 ERROR common.AbstractJob: Unexpected 0 while processing Job-Specific Options:
> Unexpected 0 while processing Job-Specific Options:
> Usage:
> [--input <input> --output <output> --similarityClassname <similarityClassname>
> --maxSimilaritiesPerItem <maxSimilaritiesPerItem> --maxPrefs <maxPrefs>
> --minPrefsPerUser <minPrefsPerUser> --booleanData <booleanData> --threshold
> <threshold> --randomSeed <randomSeed> --help --tempDir <tempDir> --startPhase
> <startPhase> --endPhase <endPhase>]
> 
> 
> What's the reason for this error? Thank you!
> 
> 
> Best Regards,
> lastarsenal
> 

