hadoop-common-user mailing list archives

From Y G <gymi...@gmail.com>
Subject Re: How to call method after all map jobs on slaves nodes are done
Date Sat, 14 Nov 2009 02:43:19 GMT
Is your job a map-only job, with no reduce phase? If so, I think you
could set the number of reduce tasks to 0. The map output is then
written directly to HDFS instead of staying on local disk as
intermediate data.
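
A rough sketch of what I mean, using the old org.apache.hadoop.mapred
API. The class name MapOnlyDriver, the postProcess() helper and the
use of IdentityMapper are placeholders I made up for illustration, not
anything from your code. Since JobClient.runJob() blocks until the job
has finished, any code placed after it runs once, on the client, after
all map tasks are done:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;

// Hypothetical driver class; the name is a placeholder.
public class MapOnlyDriver {

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MapOnlyDriver.class);
        conf.setJobName("map-only-example");

        // Placeholder mapper. With a custom MapRunnable, call
        // conf.setMapRunnerClass(YourMapRunner.class) instead.
        conf.setMapperClass(IdentityMapper.class);

        // Zero reduce tasks: map output goes straight to the output path on HDFS.
        conf.setNumReduceTasks(0);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Submits the job and waits for it to complete;
        // throws an IOException if the job fails.
        JobClient.runJob(conf);

        // Reached only after every map task has finished.
        postProcess(new Path(args[1]));
    }

    // Hypothetical hook for the "call one method after all maps" step.
    private static void postProcess(Path outputDir) {
        System.out.println("All maps done, output is in " + outputDir);
    }
}
```

Note that postProcess() here runs on the machine that submitted the
job, not on each slave node, so it may or may not fit your case.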

2009/11/14, Hrishikesh Agashe <hrishikesh_agashe@persistent.co.in>:
> Hi,
> I am implementing the MapRunnable interface to create the Map jobs.
> I have a large data set to process (around 10 GB).
> I have a cluster with 1 master and 10 slaves.
> When I run my program, Hadoop processes the data successfully.
> After processing, I collect all the data (all of it files) in the Hadoop
> temporary directory.
> Now my requirement is: when all maps are completed on each node, I want to
> call one method which will process the data from the temporary directory
> and finally copy those files to HDFS.
> Is there any way to do this?
> --Hrishi