mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: Not all Mapper/Reducer slots are taken when running K-Means cluster
Date Sun, 11 Mar 2012 05:31:44 GMT
What's your Hadoop config in terms of the maximum number of reducers?
That maximum is a function of the RAM available on each node and the number of nodes.
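
As a rough illustration of that relationship, here is a minimal sketch, assuming a Hadoop 1.x cluster where each node's reduce slot count is set by mapred.tasktracker.reduce.tasks.maximum. The node count (6) and slots per node (7) are hypothetical values chosen only so the total matches the 42 reduce slots reported in this thread; the 1.75 factor is the one used in the original message to arrive at 73 reducers (mapred.reduce.tasks). The class and method names are made up for the example.

```java
public class ReduceSlotEstimate {

    // Cluster-wide reduce slots: number of nodes times the per-node maximum
    // (mapred.tasktracker.reduce.tasks.maximum in Hadoop 1.x).
    static int totalReduceSlots(int nodes, int reduceSlotsPerNode) {
        return nodes * reduceSlotsPerNode;
    }

    // Reducer count per job as a multiple of the available slots,
    // rounded down (the thread uses a factor of 1.75).
    static int suggestedReducers(int totalSlots, double factor) {
        return (int) Math.floor(totalSlots * factor);
    }

    public static void main(String[] args) {
        int nodes = 6;        // hypothetical: not stated in the thread
        int slotsPerNode = 7; // hypothetical: not stated in the thread

        int slots = totalReduceSlots(nodes, slotsPerNode);
        int reducers = suggestedReducers(slots, 1.75);

        System.out.println("reduce slots = " + slots);
        System.out.println("reducers per job = " + reducers);
    }
}
```

This only models how many slots exist and how many reduce tasks get created. It says nothing about how many of them the scheduler actually runs at once, which is the open question in this thread.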

On 3/10/12 8:55 PM, WangRamon wrote:
> Hi Paritosh,
>
> I ran the tests with 1 job and with 5 jobs, and both show the same problem.
> The job I'm running is buildClusters. In the monitoring GUI I can see that
> 73 reduce tasks are created, but only 12 of them are running at any time
> (the rest are pending). The tasks finish very quickly, no more than 18
> seconds per reduce task, so maybe that's the cause?
>
> Thanks,
> Cheers,
> Ramon
>
>> Date: Sun, 11 Mar 2012 09:14:15 +0530
>> From: pranjan@xebia.com
>> To: user@mahout.apache.org
>> Subject: Re: Not all Mapper/Reducer slots are taken when running K-Means cluster
>>
>> And to answer the question about the KMeans configuration:
>>
>> KMeans has two jobs:
>> 1) buildClusters: has a reducer, with no limitation on the number of
>> reducer tasks
>> 2) clusterData: executes only if runClustering = true; has no reducer tasks
>>
>> On 11-03-2012 09:10, Paritosh Ranjan wrote:
>>> Can you run the K-Means jobs again (all with the same block size) and give
>>> the same statistics for:
>>>
>>> a) only 1 job running
>>> b) 2 jobs running simultaneously
>>> c) 5 jobs running simultaneously
>>>
>>> On 10-03-2012 21:08, WangRamon wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I submitted 5 K-Means jobs simultaneously. My Hadoop cluster has 42 map
>>>> and 42 reduce slots configured, and I set the default number of reduce
>>>> tasks per job to 73 (42 * 1.75). Only about 12 reduce tasks are running
>>>> at any time, although 73 reduce tasks are created for each K-Means job
>>>> and I do have 42 reduce slots. That means about 30 reduce slots are free
>>>> at any time.
>>>>
>>>> So I tried RecommenderJob from Mahout again, because I remembered that it
>>>> used all my slots in a previous test. And yes, this time
>>>> "RowSimilarityJob-CooccurrencesMapper-Reducer" does use all the slots,
>>>> 42 reduce and 42 map. So I'm wondering: is something configured in Mahout
>>>> causing this strange behavior? Any suggestions? Thanks in advance.
>>>>
>>>> By the way, I'm using the mahout-0.6 release.
>>>>
>>>> Cheers,
>>>> Ramon

