hadoop-user mailing list archives

From "Bejoy KS" <bejoy.had...@gmail.com>
Subject Re: fundamental doubt
Date Wed, 21 Nov 2012 20:03:42 GMT
Hi Jamal

It is performed at the framework level. The map tasks emit key-value pairs, and during the shuffle and sort phase the framework collects and groups all the values corresponding to each key from all the map tasks. The reducer then takes as input a key together with the collection of values for that key; the reduce method signature reflects this.
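To make the shuffle-and-group step concrete, here is a minimal Python sketch (not Hadoop itself, and the sample data is hypothetical, matching the example in the question) of what the framework does between map and reduce: it sorts the mapper's (key, value) pairs by key, then hands each reduce call one key plus all of that key's values, so the reducer needs no dictionary of its own.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical mapper output, mirroring the question's example:
# key1 -> 1.0, key2 -> 2.0, key1 -> 3.0
mapped = [("key1", 1.0), ("key2", 2.0), ("key1", 3.0)]

# Shuffle/sort phase: the framework sorts the pairs by key so that
# all values for the same key become adjacent.
mapped.sort(key=itemgetter(0))

# Reduce phase: one call per key, receiving the full collection of
# values for that key, so no defaultdict is needed inside the reducer.
averages = {}
for key, group in groupby(mapped, key=itemgetter(0)):
    values = [v for _, v in group]
    averages[key] = sum(values) / len(values)

print(averages)  # {'key1': 2.0, 'key2': 2.0}
```

The same idea applies in Hadoop Streaming: the lines arriving at a reducer script are already sorted by key, so the reducer can process one key's values at a time instead of buffering everything in a dictionary.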

Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: jamal sasha <jamalshasha@gmail.com>
Date: Wed, 21 Nov 2012 14:50:51 
To: user@hadoop.apache.org<user@hadoop.apache.org>
Reply-To: user@hadoop.apache.org
Subject: fundamental doubt

I guess I am asking a lot of fundamental questions, but I thank you guys for
taking the time out to explain my doubts.
So I am able to write MapReduce jobs, but here is my doubt.
As of now I am writing mappers which emit a key and a value.
These key-value pairs are then captured at the reducer end, and I process the key
and value there.
Let's say I want to calculate the average...
key1 value1
key2 value2
key1 value3

So the output is something like:
key1: average of value1 and value3
key2: value2

Right now in the reducer I have to create a dictionary with the original
keys as keys and a list as the value:
Data = defaultdict(list)  // python user
But I thought that
the mapper takes in the key-value pairs and outputs key: (v1, v2, ...), and
the reducer takes in this key and list of values and returns
key, new value.

So why is the input of the reducer the simple output of the mapper and not the list
of all the values for a particular key, or did I misunderstand something?
Am I making any sense??
