mahout-user mailing list archives

From 许春玲
Subject 40 hours to run 1/2 Netflix Data?
Date Mon, 14 May 2012 01:44:06 GMT

I am running the item recommender on the Netflix data, but it always fails
because there is not enough local disk space. To reduce the temporary data, I cut
the set of user IDs in half (the user IDs, not the user accounts). Now the job
finishes, but it takes 40 hours. The command is as follows:

hadoop jar /app/mahout-distribution-0.5/core/target/mahout-core-0.5-job.jar -Dmapred.reduce.tasks=196 -Dmapred.input.dir=NetFlix_data_new -Dmapred.output.dir=output_netflix8

My Hadoop cluster:

28 nodes
16 GB memory per node
8 cores per node
250 GB local disk per node
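
For a rough sense of scale, the aggregate capacity implied by the figures above can be worked out with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, the disk total is raw capacity that ignores HDFS replication and any space already in use, and the per-node reducer count assumes the 196 reduce tasks are spread evenly across nodes, which Hadoop does not guarantee.

```python
# Cluster figures as stated in this message.
NODES = 28
MEMORY_GB_PER_NODE = 16
CORES_PER_NODE = 8
LOCAL_DISK_GB_PER_NODE = 250
REDUCE_TASKS = 196  # from -Dmapred.reduce.tasks=196 in the command


def cluster_totals():
    """Return aggregate capacity derived from the per-node figures."""
    return {
        "total_cores": NODES * CORES_PER_NODE,
        "total_memory_gb": NODES * MEMORY_GB_PER_NODE,
        "total_local_disk_gb": NODES * LOCAL_DISK_GB_PER_NODE,
        # Assumes an even spread of reducers, which is an idealization.
        "reduce_tasks_per_node": REDUCE_TASKS / NODES,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the cluster has 224 cores, 448 GB of memory and 7000 GB of raw local disk in total, with 7 reduce tasks per node.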
