From: Harsh J
Date: Wed, 1 Feb 2012 19:10:01 +0530
Subject: Re: regarding to sort and reducer
To: mapreduce-dev@hadoop.apache.org

Samaneh,

Sorry for the late response. Inline, some of what I can offer.

On Sat, Jan 21, 2012 at 10:51 PM, Samaneh Shokuhi wrote:
> Hi All,

Welcome!

> I am very new to Hadoop and am going to do some research on it for my
> master's thesis. First of all, what I want to do is to understand the
> functionality of sort and shuffle, and to run an application with Hadoop
> both including and not including the sort part.
> I need to know which class in Hadoop takes care of sort?

Are you looking for the sort mechanism or the algorithm? This is an
excellent presentation on the MR sort/shuffle/merge layers that I
recommend reading:
http://www.slideshare.net/hadoopusergroup/ordered-record-collection

> Another thing I need to know is the functionality of the reducer, and to
> find out the possibility of sending messages from one reducer to another
> and doing a kind of work stealing between reducers.

You probably want to read the ReduceTask class, but this functionality is
not present today. It is perhaps easier to do with the new MR2 framework,
detailed in
http://developer.yahoo.com/blogs/hadoop/posts/2011/03/mapreduce-nextgen-scheduler/

> Since I am very new to Hadoop and it has a lot of modules, I need to know
> which project I should look at.
> Also, I would appreciate it if you let me know of any comments you have on
> this idea.

You need to look at the hadoop-mapreduce-project in trunk for all
things MR today. It also uses some generic components from the
hadoop-common project. See http://wiki.apache.org/hadoop/HowToContribute
for more details.

Please feel free to mail the lists with any specific questions you
have as you go ahead!

-- 
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about