hadoop-common-user mailing list archives

From Pramy Bhats <pramybh...@googlemail.com>
Subject Re: Intermediate files generated.
Date Thu, 08 Jul 2010 23:29:58 GMT
Correct me if I am wrong: the output of the mappers goes to the local file
system, and the reducers later fetch the output of the mappers.

If the above is correct, can we specify files of our choice so that the
mapper output is written to a desired location?
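
(For reference, a minimal old-API sketch of the knobs involved, assuming the
description above is right: the per-job output path points at HDFS but only
covers the final reduce output, while the intermediate map output lands under
the node-local directories listed in mapred.local.dir, which is a
TaskTracker-level setting rather than a per-job one. All paths below are
purely illustrative.)

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

public class OutputPaths {
  public static void main(String[] args) {
    JobConf conf = new JobConf(OutputPaths.class);
    // Per-job and on HDFS: where the *final* (reduce) output goes.
    FileInputFormat.setInputPaths(conf, new Path("/user/pb/input"));
    FileOutputFormat.setOutputPath(conf, new Path("/user/pb/output"));
    // Node-level and on local disk: where intermediate map output is spilled
    // before the reducers fetch it; configured in each node's mapred-site.xml,
    // not per job.
    System.out.println("spill dirs: " + conf.get("mapred.local.dir"));
  }
}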

thanks,
--Paul

On Fri, Jul 2, 2010 at 10:19 PM, Ken Goodhope <kengoodhope@gmail.com> wrote:

> You could also use MultipleOutputs from the old API.  This will allow you
> to create multiple output collectors.  One collector could be used at the
> beginning of the reduce call to write the key-value pair unaltered, and
> another collector to write the results of your processing.
>
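
A rough sketch of what Ken describes, using MultipleOutputs from the old API
(org.apache.hadoop.mapred.lib). The named output "raw", the Text types, and the
process() helper are placeholders to adapt; the named output also has to be
registered in the driver with MultipleOutputs.addNamedOutput(conf, "raw",
TextOutputFormat.class, Text.class, Text.class).

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

public class TwoWayReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private MultipleOutputs mos;

  public void configure(JobConf job) {
    mos = new MultipleOutputs(job);
  }

  @SuppressWarnings("unchecked")
  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      Text value = values.next();
      // One collector keeps the incoming key-value pair unaltered for reuse.
      mos.getCollector("raw", reporter).collect(key, value);
      // The regular collector still receives the processed result.
      output.collect(key, process(value));
    }
  }

  public void close() throws IOException {
    mos.close();
  }

  private Text process(Text value) {
    return value;   // placeholder for the real reduce-side processing
  }
}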
> On Fri, Jul 2, 2010 at 5:17 AM, Pramy Bhats <pramybhats@googlemail.com>
> wrote:
>
> > Hi,
> >
> > Isn't it possible to hack into the intermediate files generated?
> >
> > I am writing a compilation framework, so I don't want to mess with the
> > existing programming framework. The upper layer, or the programmer, should
> > write the program the way he normally would, and I want to leverage the
> > intermediate files generated for my analysis.
> >
> > thanks,
> > --PB.
> >
> > On Fri, Jul 2, 2010 at 1:05 PM, Jones, Nick <nick.jones@amd.com> wrote:
> >
> > > Hi Pramy,
> > > I would set up one M/R job to just map (setNumReduceTasks(0)) and chain
> > > another job that uses an identity mapper to pass the intermediate data
> > > to the reduce step.
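
A minimal sketch of the two-job chain Nick outlines (old API). MyMapper and
MyReducer stand in for the job's existing map and reduce classes, and the
intermediate HDFS path is just an example; the point is that the map output is
materialized on HDFS between the two jobs.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class ChainedJobs {
  public static void main(String[] args) throws Exception {
    Path intermediate = new Path("/tmp/myjob/map-output");  // HDFS path of your choice

    // Job 1: map only, so the map output is written straight to HDFS.
    JobConf mapJob = new JobConf(ChainedJobs.class);
    mapJob.setJobName("map-only");
    mapJob.setMapperClass(MyMapper.class);       // placeholder for the real mapper
    mapJob.setNumReduceTasks(0);                 // no reduce phase
    mapJob.setOutputKeyClass(Text.class);
    mapJob.setOutputValueClass(Text.class);
    mapJob.setOutputFormat(SequenceFileOutputFormat.class);
    FileInputFormat.setInputPaths(mapJob, new Path("/user/pb/input"));
    FileOutputFormat.setOutputPath(mapJob, intermediate);
    JobClient.runJob(mapJob);

    // Job 2: an identity mapper just hands the saved records to the reducer.
    JobConf reduceJob = new JobConf(ChainedJobs.class);
    reduceJob.setJobName("reduce-only");
    reduceJob.setMapperClass(IdentityMapper.class);
    reduceJob.setReducerClass(MyReducer.class);  // placeholder for the real reducer
    reduceJob.setOutputKeyClass(Text.class);
    reduceJob.setOutputValueClass(Text.class);
    reduceJob.setInputFormat(SequenceFileInputFormat.class);
    FileInputFormat.setInputPaths(reduceJob, intermediate);
    FileOutputFormat.setOutputPath(reduceJob, new Path("/user/pb/output"));
    JobClient.runJob(reduceJob);
  }
}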
> > >
> > > Nick
> > > Sent by radiation.
> > >
> > > ----- Original Message -----
> > > From: Pramy Bhats <pramybhats@googlemail.com>
> > > To: common-user@hadoop.apache.org <common-user@hadoop.apache.org>
> > > Sent: Fri Jul 02 01:05:25 2010
> > > Subject: Re: Intermediate files generated.
> > >
> > > Hi Hemanth,
> > >
> > > I need to use the output of the mapper for some other application. As a
> > > result, if I can redirect the output of the map into temp files of my
> > > choice (which are stored on HDFS), then I can reuse the output later. At
> > > the same time, the succeeding reducer can read its input from these temp
> > > files without any overhead.
> > >
> > > thanks,
> > > --PB
> > >
> > > On Fri, Jul 2, 2010 at 3:52 AM, Hemanth Yamijala <yhemanth@gmail.com>
> > > wrote:
> > >
> > > > Alex,
> > > >
> > > > > I don't think this is what I am looking for. Essentially, I wish to
> > > > > run both the mapper and the reducer, but at the same time I wish to
> > > > > make sure that the temp files used between the mappers and reducers
> > > > > are of my choice. Here, the choice means that I can specify the files
> > > > > in HDFS that can be used as temp files.
> > > >
> > > > Could you explain why you want to do this?
> > > >
> > > > >
> > > > > thanks,
> > > > > --PB.
> > > > >
> > > > > On Fri, Jul 2, 2010 at 12:14 AM, Alex Loddengaard
> > > > > <alex@cloudera.com> wrote:
> > > > >
> > > > >> You could use the HDFS API from within your mapper, and run with
> > > > >> 0 reducers.
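
A rough sketch of Alex's suggestion (old API): the mapper writes wherever it
likes on HDFS through the FileSystem API, and the driver sets
setNumReduceTasks(0) so there is no reduce phase at all. The /tmp/map-dumps
path is only an example, and one file per task attempt avoids collisions
between parallel map tasks.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class HdfsWritingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private FSDataOutputStream out;

  public void configure(JobConf job) {
    try {
      FileSystem fs = FileSystem.get(job);
      // One output file per task attempt, so concurrent map tasks don't clash.
      String taskId = job.get("mapred.task.id");
      out = fs.create(new Path("/tmp/map-dumps/" + taskId));
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Write the record to the HDFS file of our choice instead of (or in
    // addition to) collecting it.
    out.writeBytes(value.toString() + "\n");
  }

  public void close() throws IOException {
    out.close();
  }
}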
> > > > >>
> > > > >> Alex
> > > > >>
> > > > >> On Thu, Jul 1, 2010 at 3:07 PM, Pramy Bhats
> > > > >> <pramybhats@googlemail.com> wrote:
> > > > >>
> > > > >> > Hi,
> > > > >> >
> > > > >> > I am using the Hadoop framework for writing MapReduce jobs. I
> > > > >> > want to redirect the output of the Map phase into files of my
> > > > >> > choice and later use those files as input for the Reduce phase.
> > > > >> >
> > > > >> >
> > > > >> > Could you please suggest how to proceed with it?
> > > > >> >
> > > > >> > thanks,
> > > > >> > --PB.
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> > >
> >
>
