hadoop-common-user mailing list archives

From jagaran das <jagaran_...@yahoo.co.in>
Subject Re: Hadoop project - help needed
Date Tue, 31 May 2011 17:40:06 GMT

To be very precise,
the input to the mapper should be the records you want to filter, on the basis of
which you want to do the aggregation.
The reducer is where you aggregate the output from the mapper.

Check the WordCount example in Hadoop; it can help you understand the basics.
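To make the mapper/reducer split concrete, here is a minimal plain-Java sketch of the WordCount idea mentioned above. It deliberately avoids the real Hadoop Mapper/Reducer classes so it runs standalone; the class and method names (WordCountSketch, map, reduce, run) are illustrative, not Hadoop API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {

    // "Map" phase: for each input line, emit a (word, 1) pair per word.
    // This is the filtering/extraction step the reply describes.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: aggregate the mapper output by summing values per key.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Drive both phases over a list of input lines
    // (in real Hadoop, the framework does this shuffle for you).
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("hello world", "hello hadoop")));
    }
}
```

In real Hadoop code the map and reduce methods would live in Mapper and Reducer subclasses, and the framework handles splitting the input and grouping keys between the two phases.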


From: parismav <paok_gate_4_@hotmail.com>
To: core-user@hadoop.apache.org
Sent: Tue, 31 May, 2011 8:35:27 AM
Subject: Hadoop project - help needed

Hello dear forum,
I am working on a project on Apache Hadoop. I am totally new to this
software and I need some help understanding the basic features!

To sum up, for my project I have configured Hadoop so that it runs 3
datanodes on one machine.
The project's main goal is to use both the Flickr API (flickr.com) libraries
and the Hadoop libraries in Java, so that each of the 3 datanodes chooses a
Flickr group and returns photo info from that group.

In order to do that, I have 3 Flickr accounts, each one with a different API
key.
I don't need any help on the Flickr side of the code, of course. But what I
don't understand is how to use the Mapper and Reducer parts of the code.
What input do I have to give the map() function?
Do I have to contain this whole "info downloading" process in the map()
function?
In a few words, how do I convert my code so that it runs distributedly on
Hadoop? Thank you!
Sent from the Hadoop core-user mailing list archive at Nabble.com.