mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Fewer reducers leading to better distributed k-means job performance?
Date Tue, 06 Sep 2011 16:40:26 GMT
It could also have to do with context switch rate, working set size, or memory
bandwidth contention.  Having too many threads of certain kinds can cause
contention on all kinds of resources.

Without detailed internal diagnostics, these can be very hard to tease out.
Looking at the load average is a good first step.  If you can get at some
memory diagnostics, such as cache miss rates, you might be able to get further.
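
For a quick look at the context switch rate without installing anything, a
rough sketch like the following works on Linux.  It samples the ctxt counter
in /proc/stat (total context switches since boot) over a five second window
and echoes /proc/loadavg; the class name and window length are arbitrary:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class CtxtRate {
  // Read the cumulative context switch count from the ctxt line of /proc/stat.
  private static long readCtxt() throws IOException {
    BufferedReader r = new BufferedReader(new FileReader("/proc/stat"));
    try {
      String line;
      while ((line = r.readLine()) != null) {
        if (line.startsWith("ctxt ")) {
          return Long.parseLong(line.split("\\s+")[1]);
        }
      }
    } finally {
      r.close();
    }
    throw new IOException("no ctxt line in /proc/stat");
  }

  public static void main(String[] args) throws Exception {
    long before = readCtxt();
    Thread.sleep(5000);                      // sample over a 5 second window
    long after = readCtxt();
    System.out.println("context switches/sec: " + (after - before) / 5.0);

    BufferedReader r = new BufferedReader(new FileReader("/proc/loadavg"));
    try {
      System.out.println("loadavg: " + r.readLine());
    } finally {
      r.close();
    }
  }
}

A sharp jump in that rate when going from 2 to 3 reducers per node would
point toward scheduling contention.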

On Tue, Sep 6, 2011 at 10:50 AM, Timothy Potter <thelabdude@gmail.com> wrote:

> Hi,
>
> I'm running a distributed k-means clustering job on a 16-node EC2 cluster
> (xlarge instances). I've experimented with 3 reducers per node
> (mapred.reduce.tasks=48) and 2 reducers per node (mapred.reduce.tasks=32).
> In one of my runs (k=120) on 6m vectors with roughly 20k dimensions, I've
> seen a 25% improvement in job performance using 2 reducers per node instead
> of 3 (~45 mins to do 10 iterations with 32 reducers vs. ~1 hour with 48
> reducers). The input data and initial clusters are the same in both cases.
>
> My sense was that maybe I was over-utilizing resources with 3 reducers per
> node, but in fact the load average remains healthy (< 4 on xlarge instances
> with 4 virtual cores) and the nodes do not swap or show anything obvious
> like that. So the only other variable I can think of is that the number and
> size of the output files passed from one iteration as input to the next may
> have something to do with the performance difference. As further evidence
> for this hunch, when running the job on vectors with 11k dimensions, the
> improvement was only about 10% -- so the performance gains grow with the
> amount of data.
>
> Does anyone else have any insights into what might be leading to this result?
>
> Cheers,
> Tim
>
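
One way to check the hunch about output sizes is to total what each iteration
actually writes.  Here is a minimal sketch against the Hadoop FileSystem API;
the clusters-N directory layout is what Mahout's k-means driver writes per
iteration, but the output root here is hypothetical, so adjust it to your run:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class IterationSizes {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    for (int i = 0; i <= 10; i++) {
      // Hypothetical output root; Mahout writes one clusters-N dir per iteration.
      Path dir = new Path("output/clusters-" + i);
      if (!fs.exists(dir)) {
        continue;
      }
      long totalBytes = 0;
      int fileCount = 0;
      for (FileStatus status : fs.listStatus(dir)) {
        totalBytes += status.getLen();
        fileCount++;
      }
      System.out.println(dir + ": " + fileCount + " files, " + totalBytes + " bytes");
    }
  }
}

If the 48-reducer runs show many more, smaller part files per iteration, that
would fit the theory that the next iteration pays for the extra file handling.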

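For completeness, since mapred.reduce.tasks came up: the same knob is
available programmatically when driving the iterations from code.  A minimal
sketch, assuming the classic org.apache.hadoop.mapred API that the
mapred.reduce.tasks property belongs to:

import org.apache.hadoop.mapred.JobConf;

public class ReducerConfig {
  public static void main(String[] args) {
    // Equivalent to passing -Dmapred.reduce.tasks=32 on the command line.
    JobConf conf = new JobConf();
    conf.setNumReduceTasks(32);  // 2 reducers per node on a 16-node cluster
    System.out.println(conf.get("mapred.reduce.tasks"));  // prints 32
  }
}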