hadoop-common-dev mailing list archives

From Pavan Kulkarni <pavan.babu...@gmail.com>
Subject Where are the Map-output files produced ?
Date Tue, 17 Jul 2012 00:20:12 GMT
Hi,

  I am trying to create hardlinks between the files produced by the Map
phase and the Reducer nodes, which are behind Lustre, so that the entire
copy phase of the shuffle can be eliminated.
  To create these hardlinks I need the exact fully qualified filenames of
the partitioned Map outputs, and also the path of the file on the Reducer
node that each output is copied to.
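
For illustration, the hardlink step itself could look roughly like the
sketch below. It assumes Java 7 or later and uses the standard
java.nio.file API rather than anything from Hadoop; the class name
MapOutputLinker, the method linkMapOutput, and the Map-side file name are
hypothetical placeholders, since the real Map-side path is exactly what is
unknown here. Only the Reduce-side name output/map_1.out-2 is taken from
what I observed. Both paths must live on the same filesystem for a hard
link to be possible.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MapOutputLinker {

    /**
     * Creates a hard link at reduceSidePath that refers to the same data as
     * mapOutputPath, so no bytes are copied. Both paths are assumptions
     * supplied by the caller; this sketch does not discover them.
     */
    static Path linkMapOutput(Path mapOutputPath, Path reduceSidePath) throws IOException {
        Path parent = reduceSidePath.getParent();
        if (parent != null) {
            // Make sure the directory that will hold the link exists.
            Files.createDirectories(parent);
        }
        // Files.createLink(link, existing) creates a new directory entry
        // for an existing file.
        return Files.createLink(reduceSidePath, mapOutputPath);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a shared mount; a temp directory keeps the demo self-contained.
        Path base = Files.createTempDirectory("shuffle-sketch");

        // Hypothetical Map-side file name, used only for this demo.
        Path mapOutput = base.resolve("partitioned-map-output");
        Files.write(mapOutput, "partition data".getBytes(StandardCharsets.UTF_8));

        // Reduce-side name as observed in ReduceTask.
        Path reduceSide = base.resolve("output").resolve("map_1.out-2");

        Path link = linkMapOutput(mapOutput, reduceSide);
        System.out.println("Same file: " + Files.isSameFile(mapOutput, link));
    }
}
```

Files.createLink throws FileAlreadyExistsException if the target name is
already taken, and UnsupportedOperationException if the filesystem does
not support hard links, so a real integration would need to handle both.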
  I am working on hadoop-1.0.2, and the entire process happens in the
ReduceTask.java class.
I see that the file on the Reduce node into which the data is read is
named output/map_1.out-2.
Is this correct?
Also, I couldn't find the fully qualified paths of the files on the Map
side, i.e. the names of the partitioned Map-output files.
Does anyone have any idea how to find the fully qualified pathnames of
these files?
Any help is highly appreciated. Thanks.

-- 

--With Regards
Pavan Kulkarni
