hadoop-common-user mailing list archives

From Gang Luo <lgpub...@yahoo.com.cn>
Subject Re: Some information on Hadoop Sort
Date Fri, 19 Feb 2010 22:06:15 GMT
The sorting is done by the MapReduce framework. On the map side, each output record first
goes to a sort buffer, where sorting, partitioning and combining (if a combiner is configured)
happen. If necessary, the sort runs in multiple phases to produce a single sorted result for
each map task. On the reduce side, the data from all the map tasks is merged; because each
map output is already sorted on the map side, only a merge is needed here. The merge takes
multiple rounds if necessary.
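
To make the two sides concrete, here is a rough sketch in plain Python. It is not Hadoop code: the names partition, map_side and reduce_side, the buffer_size parameter and the toy partitioner are all made up for illustration. The combiner is left out, and each merge is done in one pass rather than in multiple rounds.

```python
import heapq

def partition(key, num_reducers):
    # Hypothetical stand-in for a hash partitioner; deterministic for the demo.
    return sum(key.encode("utf-8")) % num_reducers

def map_side(records, num_reducers, buffer_size):
    """Return {partition: sorted list of (key, value)} for one map task."""
    order = lambda kv: (partition(kv[0], num_reducers), kv[0])
    # 1. Each time the buffer fills, sort it by (partition, key) and keep
    #    the result as one sorted run.
    runs = []
    for i in range(0, len(records), buffer_size):
        runs.append(sorted(records[i:i + buffer_size], key=order))
    # 2. Merge the sorted runs into a single sorted output for this map task.
    out = {p: [] for p in range(num_reducers)}
    for key, value in heapq.merge(*runs, key=order):
        out[partition(key, num_reducers)].append((key, value))
    return out

def reduce_side(map_outputs, p):
    """Merge the already-sorted segments for partition p from every map task."""
    segments = [m[p] for m in map_outputs]
    # No re-sorting needed: a k-way merge of sorted inputs is enough.
    return list(heapq.merge(*segments, key=lambda kv: kv[0]))

if __name__ == "__main__":
    m1 = map_side([("d", 1), ("a", 1), ("c", 1), ("b", 1)], 2, 2)
    m2 = map_side([("b", 1), ("e", 1), ("a", 1)], 2, 2)
    for p in range(2):
        print(p, reduce_side([m1, m2], p))
```

The parallelism comes from the fact that every map task sorts its own output independently, and every reducer merges only its own partition.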


----- Original Message ----
From: "aa225@buffalo.edu" <aa225@buffalo.edu>
To: common-user@hadoop.apache.org
Sent: 2010/2/19 (Fri) 2:25:50 PM
Subject: Some information on Hadoop Sort

I was wondering if someone could give me some information on how Hadoop does the
sorting. From what I have read, there does not seem to be a map class or a reduce
class. Where and how is the sorting parallelized?

Best Regards from Buffalo

Abhishek Agrawal

SUNY- Buffalo

