hadoop-common-user mailing list archives

From bonito <bonito.pe...@gmail.com>
Subject combine two map tasks
Date Sun, 28 Jun 2009 11:58:12 GMT

I am a new Hadoop user, so my question may sound naive.
However, I would like to ask whether there is a way to combine the results of two
map tasks that may run simultaneously.
I use the MultipleInputs class, so I have two different mappers.
I want the output of one map (associated with one input file) to
be used while processing the second map (associated with the second input
file).
I have thought of storing the map1 output in HDFS and retrieving it
from map2.
However, I have no clue whether this is possible. What about
timing issues? map2 would have to wait until map1 has completed.
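
One possible alternative to staging the map1 output in HDFS is to let both
mappers run in parallel, have each tag its records with its source, and do the
combining in the reducer (often called a reduce-side join). The sketch below
only simulates that idea in plain Java, without any Hadoop classes. The class
name, the "A"/"B" tags, and the tab-separated "key<TAB>value" record format are
all assumptions made for illustration, not part of the setup described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free simulation of a reduce-side join.
final class ReduceSideJoinSketch {
    // Assumed one-character tags marking which input a record came from.
    static final String FROM_FIRST = "A";
    static final String FROM_SECOND = "B";

    // Stand-in for map1: emits (key, "A:value").
    static String[] mapFirst(String line) {
        return tag(line, FROM_FIRST);
    }

    // Stand-in for map2: emits (key, "B:value").
    static String[] mapSecond(String line) {
        return tag(line, FROM_SECOND);
    }

    // Assumes every line has the form "key<TAB>value".
    private static String[] tag(String line, String sourceTag) {
        String[] parts = line.split("\t", 2);
        return new String[] { parts[0], sourceTag + ":" + parts[1] };
    }

    // Simulates shuffle (group by key) and reduce (pair the two sides per key).
    static List<String> join(List<String> firstInput, List<String> secondInput) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : firstInput) {
            String[] kv = mapFirst(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (String line : secondInput) {
            String[] kv = mapSecond(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }

        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            List<String> fromFirst = new ArrayList<>();
            List<String> fromSecond = new ArrayList<>();
            for (String tagged : entry.getValue()) {
                // Strip the one-character tag and the ':' separator.
                String payload = tagged.substring(FROM_FIRST.length() + 1);
                if (tagged.startsWith(FROM_FIRST + ":")) {
                    fromFirst.add(payload);
                } else {
                    fromSecond.add(payload);
                }
            }
            // Keys present in only one input produce no output (inner join).
            for (String a : fromFirst) {
                for (String b : fromSecond) {
                    joined.add(entry.getKey() + "\t" + a + "\t" + b);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("k1\ta", "k2\tb");
        List<String> second = Arrays.asList("k1\tx", "k3\ty");
        for (String row : join(first, second)) {
            System.out.println(row);
        }
    }
}
```

In this arrangement neither mapper waits for the other; the combining is
deferred until both sets of records for a key arrive at the same reducer.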

Executing them serially is not what I really want.

Any suggestion would be appreciated.
Thank you in advance :)

View this message in context: http://www.nabble.com/combine-two-map-tasks-tp24240928p24240928.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
