mahout-dev mailing list archives

From Gmail <mansequ...@gmail.com>
Subject Re: kMeans Implementation
Date Tue, 04 Mar 2014 11:02:27 GMT
I used the KMeansDriver class in the clustering.kmeans package.
Yes, I know that the use of MapReduce is mandatory, but I think that an
easier and, above all, more MapReduce-oriented implementation exists.

Anyway, I thought the choice was driven by performance.

Thank you.


On 03/04/2014 11:48 AM, Sean Owen wrote:
> Although I don't know exactly what you're referring to, in general,
> nothing about Map/Reduce means you always use a reducer. There are
> plenty of tasks that are much more appropriate as a map-only or
> reduce-only job. So this assertion doesn't fly to start with. But if
> you see two jobs that might be merged into one, that could be a useful
> suggestion.
>
> On Tue, Mar 4, 2014 at 10:43 AM, Gmail <mansequino@gmail.com> wrote:
>> Hello,
>> I was studying the Mahout libraries and I found something strange in your
>> kMeans implementation.
>>
>> Looking inside it, I noticed that kMeans uses only map functions and
>> omits the reducers. Why did you make this choice?
>> It does not follow the MapReduce programming model, even though Mahout's
>> core is declared to be Hadoop.
>> Is this choice driven by performance considerations?
>>
>> Best regards
>> Manuel Sequino
>>
>>
> .
>
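
For readers following the thread, here is a minimal sketch of how one k-means iteration is conventionally split into a map step (assign each point to its nearest centroid) and a reduce step (average the points of each cluster), next to a map-only pass that merely labels points. It is plain Python with no Hadoop involved, it is not Mahout's code, and every function name in it is hypothetical; it only illustrates the general point that some passes need a reducer and others do not.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_assign(point, centroids):
    """Map step: return (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_mean(points):
    """Reduce step: average all points that share one cluster index."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One map + shuffle + reduce pass; returns the updated centroids."""
    groups = defaultdict(list)
    for p in points:                   # map phase
        key, value = map_assign(p, centroids)
        groups[key].append(value)      # shuffle: group values by key
    # Reduce phase; a centroid with no assigned points is kept unchanged.
    return [reduce_mean(groups[i]) if groups[i] else centroids[i]
            for i in range(len(centroids))]


def classify(points, centroids):
    """Map-only pass: label each point; nothing needs to be aggregated."""
    return [map_assign(p, centroids)[0] for p in points]
```

Calling kmeans_iteration repeatedly until the centroids stop moving gives the usual iterative algorithm; classify is the kind of step that fits a map-only job, because each point is handled independently.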

