hadoop-mapreduce-dev mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: regarding to sort and reducer
Date Wed, 01 Feb 2012 13:40:01 GMT
Samaneh,

Sorry for the late response. Some of what I can offer is inline below.

On Sat, Jan 21, 2012 at 10:51 PM, Samaneh Shokuhi
<samaneh.shokuhi@gmail.com> wrote:
> Hi All,

Welcome!

> I am very new to Hadoop and am going to do some research on it for my
> master's thesis. First of all, I want to understand the functionality
> of sort and shuffle, and to run an application with Hadoop's sort phase
> included and then without it.
> I need to know which class in Hadoop takes care of the sort.

Are you looking for the sort mechanism or the algorithm?

This is an excellent presentation on the MR sort/shuffle/merge layers
that I recommend reading:
http://www.slideshare.net/hadoopusergroup/ordered-record-collection
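
As a rough illustration of what those layers do, here is a toy model in
plain Python. It is not Hadoop code, and every name in it (partition_for,
map_side_sort, reduce_side_merge) is made up for this sketch. The
assumption it encodes is the general shape of the pipeline: each map task
sorts its output by (partition, key), and each reducer merges the
already-sorted runs it fetches from the maps and groups values by key.

```python
from collections import defaultdict
from heapq import merge
from itertools import groupby
from operator import itemgetter


def partition_for(key, num_reducers):
    # Toy deterministic partitioner (hypothetical; not Hadoop's HashPartitioner).
    return sum(key.encode("utf-8")) % num_reducers


def map_side_sort(records, num_reducers):
    """Sort one map task's (key, value) output by (partition, key), split per partition."""
    ordered = sorted(
        records,
        key=lambda kv: (partition_for(kv[0], num_reducers), kv[0]),
    )
    runs = defaultdict(list)
    for key, value in ordered:
        runs[partition_for(key, num_reducers)].append((key, value))
    return dict(runs)


def reduce_side_merge(sorted_runs):
    """Merge sorted runs from several map tasks, then group the values by key."""
    merged = merge(*sorted_runs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]


if __name__ == "__main__":
    num_reducers = 2
    map1 = map_side_sort([("b", 1), ("a", 1), ("c", 1)], num_reducers)
    map2 = map_side_sort([("a", 1), ("c", 1)], num_reducers)
    for p in range(num_reducers):
        # Each reducer p sees only the runs for its own partition.
        print(p, reduce_side_merge([map1.get(p, []), map2.get(p, [])]))
```

Note that the reducer in this model never re-sorts anything; it only
merges runs that are already sorted, which is why removing the map-side
sort would also change what the reduce side has to do.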

> Another thing I need to understand is the functionality of the reducer, and
> to find out whether it is possible to send messages from one reducer to
> another and do a kind of work stealing between reducers.

You probably want to read the ReduceTask class, but this functionality
is not present today. It would perhaps be easier to do with the new MR2
framework, detailed in
http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/

> Since I am very new to Hadoop and it has a lot of modules, I need to know
> which project I should look at.
> I would also appreciate any comments you have on this
> idea.

You need to look at the hadoop-mapreduce-project in trunk for all
things MR today. It also uses some generic components from the
hadoop-common project. See
http://wiki.apache.org/hadoop/HowToContribute for more details.

Please feel free to mail the lists with any specific questions you
have as you go ahead!

-- 
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about
