hadoop-common-user mailing list archives

From Owen O'Malley <omal...@apache.org>
Subject Re: Coordination between Mapper tasks
Date Thu, 19 Mar 2009 17:03:10 GMT

On Mar 18, 2009, at 10:26 AM, Stuart White wrote:

> I'd like to implement some coordination between Mapper tasks running
> on the same node.  I was thinking of using ZooKeeper to provide this
> coordination.

This is a very bad idea in the general case. It can be made to work,
but only on a dedicated cluster where you are sure that every map of
the job is active at the same time. Otherwise, you have no guarantee
that all of the maps are running at once, and a map that waits on one
that has not been scheduled yet can stall the job.
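
A minimal sketch of that failure mode, assuming a toy model rather
than Hadoop itself: worker threads stand in for map slots, and a
barrier stands in for whatever rendezvous the maps would do through
ZooKeeper. The function name, the slot counts, and the timeout are
all made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def maps_past_barrier(slots, num_maps, timeout=1.0):
    """Run num_maps tasks on `slots` worker threads (hypothetical map slots).

    Every task waits at a barrier that opens only when all num_maps tasks
    have arrived. Returns how many tasks got past the barrier.
    """
    barrier = threading.Barrier(num_maps)

    def map_task(_task_id):
        try:
            # Wait for every other map; give up after `timeout` seconds.
            barrier.wait(timeout=timeout)
            return True
        except threading.BrokenBarrierError:
            # Raised on timeout, or when another waiter already timed out.
            return False

    with ThreadPoolExecutor(max_workers=slots) as pool:
        return sum(pool.map(map_task, range(num_maps)))
```

With 2 slots and 4 maps, the first two maps occupy both slots while
waiting for maps that cannot start, so maps_past_barrier(2, 4)
returns 0. With 4 slots, maps_past_barrier(4, 4) returns 4. A real
job without the timeout would simply hang in the first case.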

In most cases, you are much better off using the standard
communication between the maps and reduces (the shuffle) and chaining
multiple jobs, each pass reading the output of the one before it.
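
As a rough illustration of the multi-pass approach, here is a small
in-memory simulation. It does not use the Hadoop API; run_job and the
mapper and reducer names are hypothetical. The second pass needs a
global total, which is the kind of thing that tempts people into
cross-map coordination, and it gets that total by sending every
record to one reduce key instead.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """One simulated pass: map each record, group by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stand-in for the shuffle
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Pass 1: count occurrences of each word.
def count_map(line):
    for word in line.split():
        yield word, 1


def count_reduce(word, ones):
    yield word, sum(ones)


# Pass 2: route every (word, count) pair to a single key so that one
# reducer sees all counts and can compute each word's share of the total.
def share_map(pair):
    yield "all", pair


def share_reduce(_key, pairs):
    total = sum(count for _, count in pairs)
    for word, count in pairs:
        yield word, count / total


lines = ["a b a", "b a c"]
counts = run_job(lines, count_map, count_reduce)
shares = run_job(counts, share_map, share_reduce)
```

No map ever needs to know whether another map is running. The only
hand-off points are the shuffle inside a pass and the output that one
pass leaves for the next.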

> I think I remember hearing that MapReduce and/or HDFS use ZooKeeper
> under-the-covers.

Not currently. ZooKeeper would matter for HA (high availability), and
there are no immediate plans to implement that.

-- Owen
