hadoop-common-user mailing list archives

From "Andy Sautins" <andy.saut...@returnpath.net>
Subject Can mapper get access to filename being processed?
Date Sun, 07 Dec 2008 18:02:08 GMT

   I'm having trouble finding a way to do what I want, so I'm wondering
if I'm just not looking in the right place or if I'm thinking about the
problem in the wrong way.  Any insight would be appreciated.


   Let's say I have a directory of files that contains a combination of
different file types.  The MapReduce job needs to process all files in
the directory but generates different key/value pairs depending on the
file being processed.  What I'd like to do is use the filename to
identify the file type being processed and use that information in the
map job.  It seems that what I want is for the map job to have access
to the filename of the input file split being processed.  I haven't been
able to find out if that is available to a derived class of


   Does what I'm trying to do make sense or is there a better way of
processing a job like the one I'm describing?
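
   A minimal sketch of the filename-based dispatch described above, in
plain Java with no Hadoop dependency.  The class name, the file types and
the "access-"/"error-" naming convention are all hypothetical and chosen
only for illustration.  How the path reaches the mapper is left open
here: in the old org.apache.hadoop.mapred API it has commonly been read
from the "map.input.file" job property inside configure(JobConf), but
treat that as an assumption to check against the Hadoop version in use.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper: none of these names come from Hadoop or from the
// original message. It only shows dispatching on the input file's name.
final class FileTypeDispatch {

    // Illustrative file types; a real job would define its own.
    enum FileType { ACCESS_LOG, ERROR_LOG, UNKNOWN }

    // Classify by the last path component. The naming convention
    // ("access-*", "error-*") is an assumption made for this sketch.
    static FileType classify(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        if (name.startsWith("access-")) return FileType.ACCESS_LOG;
        if (name.startsWith("error-")) return FileType.ERROR_LOG;
        return FileType.UNKNOWN;
    }

    // Produce a different key/value pair depending on the file type,
    // which is what the map function would emit for each input line.
    static Map.Entry<String, String> toPair(String path, String line) {
        switch (classify(path)) {
            case ACCESS_LOG: return new SimpleEntry<>("access", line);
            case ERROR_LOG:  return new SimpleEntry<>("error", line);
            default:         return new SimpleEntry<>("unknown", line);
        }
    }
}
```

   In a mapper, classify() would be called once per task with the split's
path and the result cached, so that each map() call only does the cheap
switch in toPair().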


   Thank you





