hadoop-common-dev mailing list archives

From Pavan Kulkarni <pavan.babu...@gmail.com>
Subject Where are the Map-output files produced?
Date Tue, 17 Jul 2012 00:20:12 GMT

I am trying to create hard links between the files produced by the map
phase and the reducer nodes, which are backed by Lustre, so that the entire
copy phase of the shuffle is eliminated.
To create these hard links I need the exact fully qualified filenames of
the partitioned map outputs, and also the path of the file on the reducer
node that each output is copied to.
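
To make the goal concrete, here is a minimal sketch of the linking step itself, not of anything that exists in Hadoop. It assumes Java 7 or later for java.nio.file.Files.createLink (hadoop-1.0.2 itself may be built against an older JDK), and it assumes both paths are on the same file system, which a hard link requires. The class name HardLinkSketch, the method linkMapOutput, and both path arguments are hypothetical; the real map-side path is exactly what this message is asking about.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: not part of Hadoop, names chosen for illustration only.
public class HardLinkSketch {

    /**
     * Makes reducerSide a hard link to mapOutput instead of copying the bytes.
     * Both paths must live on the same file system (assumed here to be the
     * shared Lustre mount), otherwise createLink fails.
     */
    public static Path linkMapOutput(Path mapOutput, Path reducerSide) throws IOException {
        if (!Files.isRegularFile(mapOutput)) {
            throw new IOException("Map output not found: " + mapOutput);
        }
        // Create the reducer-side directory (e.g. an "output" directory) if needed.
        Path parent = reducerSide.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // createLink refuses to overwrite, so drop any stale file first.
        Files.deleteIfExists(reducerSide);
        // Signature is createLink(link, existing): the new name comes first.
        return Files.createLink(reducerSide, mapOutput);
    }
}
```

A call would look like HardLinkSketch.linkMapOutput(mapSidePath, reducerDir.resolve("map_1.out-2")), where mapSidePath stands in for the still-unknown map-side file. Note that a hard link covers a whole file, so this only helps if each partition's data is available as its own file.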
I am working on hadoop-1.0.2, where this entire process happens in the
ReduceTask.java class.
I see that the file on the reduce node into which the data is read is
named output/map_1.out-2.
Is this correct?
Also, I couldn't find the fully specified paths of the files on the map side,
i.e. the names of the partitioned map-output files.
Does anyone have any idea how to find the fully qualified pathnames of these
files? Any help is highly appreciated. Thanks


--With Regards
Pavan Kulkarni
