hadoop-common-user mailing list archives

From "Barry, Sean F" <sean.f.ba...@intel.com>
Subject Shuffle/sort
Date Tue, 05 Jun 2012 22:46:10 GMT
"I have always wondered how, after mapping, each reduce task gets its input. Google's
paper and Hadoop's documentation say that a sort is done to group identical keys in
the map output. But there is no detailed explanation of how it is
implemented, and my intuition is that a global hash might work better
than sorting. So I really want to know the details and see whether my intuition
is right. If I can find that out in the source code, where should I start?"

I saw this question online and no one had replied to it. Does anyone know where I should
go to study the source code for the shuffle and sort?
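
To make the question concrete, here is a minimal sketch of the two steps being compared, written as plain Java with no Hadoop dependencies. It is an illustration under stated assumptions, not Hadoop's actual shuffle code: the class and method names (ShuffleSketch, partitionFor, shuffle) are hypothetical, and a TreeMap stands in for the real spill-and-merge sort. The partition formula mirrors what Hadoop's default HashPartitioner does. The point it tries to show is that hashing decides which reducer receives a key, while sorting groups equal keys inside each reducer's input.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from the Hadoop source tree.
class ShuffleSketch {

    // Choose a reducer for a key by hashing. Masking with Integer.MAX_VALUE
    // clears the sign bit so the result is never negative.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Small helper to build a (key, value) pair.
    static Map.Entry<String, Integer> kv(String key, int value) {
        return new AbstractMap.SimpleEntry<String, Integer>(key, value);
    }

    // Route every map output pair to one partition. Each partition is a
    // TreeMap, so its keys come out sorted and equal keys are grouped.
    // (Assumption: TreeMap replaces the real in-memory sort and merge.)
    static List<TreeMap<String, List<Integer>>> shuffle(
            List<Map.Entry<String, Integer>> mapOutput, int numReducers) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<String, List<Integer>>());
        }
        for (Map.Entry<String, Integer> pair : mapOutput) {
            int p = partitionFor(pair.getKey(), numReducers);
            partitions.get(p)
                      .computeIfAbsent(pair.getKey(), k -> new ArrayList<Integer>())
                      .add(pair.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        mapOutput.add(kv("b", 1));
        mapOutput.add(kv("a", 1));
        mapOutput.add(kv("b", 1));

        List<TreeMap<String, List<Integer>>> partitions = shuffle(mapOutput, 2);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("reducer " + i + " input: " + partitions.get(i));
        }
    }
}
```

In this sketch each simulated reducer sees its keys in sorted order with all values for a key collected together, which is the property the reduce function relies on.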

