hadoop-common-user mailing list archives

From Ken Goodhope <kengoodh...@gmail.com>
Subject Re: Intermediate files generated.
Date Fri, 02 Jul 2010 20:19:11 GMT
You could also use MultipleOutputs from the old API.  This will allow you to
create multiple output collectors.  One collector could be used at the
beginning of the reduce call to write the key-value pairs unaltered, and
another collector to write the results of your processing.
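A minimal sketch of that approach with the old `org.apache.hadoop.mapred` API might look like the following. Class and named-output names are illustrative, not from the thread; the named output must also be registered in the driver via `MultipleOutputs.addNamedOutput(conf, "raw", TextOutputFormat.class, Text.class, Text.class)`.

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

// Illustrative reducer: one named output ("raw") receives each
// key-value pair unaltered; the normal collector receives the
// processed result.
public class PassThroughReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  private MultipleOutputs mos;

  @Override
  public void configure(JobConf conf) {
    mos = new MultipleOutputs(conf);
  }

  @Override
  @SuppressWarnings("unchecked")
  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      Text value = values.next();
      // Write the pair unaltered to the named output "raw".
      mos.getCollector("raw", reporter).collect(key, value);
      // ... process the value here, then emit the real result.
      output.collect(key, value);
    }
  }

  @Override
  public void close() throws IOException {
    mos.close();  // flushes and closes all named-output collectors
  }
}
```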

On Fri, Jul 2, 2010 at 5:17 AM, Pramy Bhats <pramybhats@googlemail.com> wrote:

> Hi,
>
> Isn't it possible to hack into the intermediate files generated?
>
> I am writing a compilation framework, so I don't want to interfere with the
> existing programming framework. The upper layer, or the programmer, should
> write the program the way they normally would, and I want to leverage the
> intermediate files generated for my analysis.
>
> thanks,
> --PB.
>
> On Fri, Jul 2, 2010 at 1:05 PM, Jones, Nick <nick.jones@amd.com> wrote:
>
> > Hi Pramy,
> > I would set up one M/R job to just map (setNumReduceTasks(0)) and chain
> > another job that uses an identity mapper to pass the intermediate data to
> > the reduce step.
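As a sketch of that two-job chain (old mapred API; the mapper/reducer class names and HDFS paths below are illustrative): job 1 runs the real mapper with zero reducers so its output lands directly in HDFS, and job 2 replays that output through an identity mapper into the reducer.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class ChainedDriver {
  public static void main(String[] args) throws Exception {
    // Job 1: map only. SequenceFile output preserves the key/value
    // types for the second job.
    JobConf job1 = new JobConf(ChainedDriver.class);
    job1.setJobName("map-only");
    job1.setMapperClass(MyMapper.class);          // your real mapper
    job1.setNumReduceTasks(0);                    // map output goes to HDFS
    job1.setOutputKeyClass(Text.class);
    job1.setOutputValueClass(Text.class);
    job1.setOutputFormat(SequenceFileOutputFormat.class);
    FileInputFormat.setInputPaths(job1, new Path("/user/pb/input"));
    FileOutputFormat.setOutputPath(job1, new Path("/user/pb/intermediate"));
    JobClient.runJob(job1);

    // Job 2: identity map, then the real reducer.
    JobConf job2 = new JobConf(ChainedDriver.class);
    job2.setJobName("reduce-phase");
    job2.setMapperClass(IdentityMapper.class);    // pass-through
    job2.setReducerClass(MyReducer.class);        // your real reducer
    job2.setOutputKeyClass(Text.class);
    job2.setOutputValueClass(Text.class);
    job2.setInputFormat(SequenceFileInputFormat.class);
    FileInputFormat.setInputPaths(job2, new Path("/user/pb/intermediate"));
    FileOutputFormat.setOutputPath(job2, new Path("/user/pb/output"));
    JobClient.runJob(job2);
  }
}
```

The files under the intermediate path then remain on HDFS after both jobs finish, available for reuse by other applications.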
> >
> > Nick
> > Sent by radiation.
> >
> > ----- Original Message -----
> > From: Pramy Bhats <pramybhats@googlemail.com>
> > To: common-user@hadoop.apache.org <common-user@hadoop.apache.org>
> > Sent: Fri Jul 02 01:05:25 2010
> > Subject: Re: Intermediate files generated.
> >
> > Hi Hemanth,
> >
> > I need to use the output of the mapper for some other application. As a
> > result, if I can redirect the output of the map into temp files of my
> > choice (which are stored on HDFS), then I can reuse the output later. At
> > the same time, the succeeding reducer can read its input from these temp
> > files without any overhead.
> >
> > thanks,
> > --PB
> >
> > On Fri, Jul 2, 2010 at 3:52 AM, Hemanth Yamijala <yhemanth@gmail.com>
> > wrote:
> >
> > > Alex,
> > >
> > > > I don't think this is what I am looking for. Essentially, I wish to
> > > > run both the mapper as well as the reducer. But at the same time, I
> > > > wish to make sure that the temp files used between mappers and
> > > > reducers are of my choice. Here, "choice" means that I can specify
> > > > the files in HDFS that can be used as temp files.
> > >
> > > Could you explain why you want to do this?
> > >
> > > >
> > > > thanks,
> > > > --PB.
> > > >
> > > > On Fri, Jul 2, 2010 at 12:14 AM, Alex Loddengaard <alex@cloudera.com>
> > > > wrote:
> > > >
> > > >> You could use the HDFS API from within your mapper, and run with 0
> > > >> reducers.
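A rough sketch of that idea (old mapred API; the output path and class names are illustrative): the mapper opens an HDFS file of its own choosing via the FileSystem API and writes to it directly, while the job runs with zero reducers.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Illustrative mapper that writes each record to an HDFS file it
// chooses itself, bypassing the normal output collector.
public class HdfsWritingMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private FSDataOutputStream out;

  @Override
  public void configure(JobConf conf) {
    try {
      FileSystem fs = FileSystem.get(conf);
      // One file per task attempt avoids collisions between map tasks
      // (base path is illustrative).
      String taskId = conf.get("mapred.task.id");
      out = fs.create(new Path("/tmp/map-side-output/" + taskId));
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  @Override
  public void map(LongWritable key, Text value,
      OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    out.writeBytes(value.toString() + "\n");
  }

  @Override
  public void close() throws IOException {
    out.close();
  }
}
```

One caveat with side files written this way: speculative execution and task retries can leave duplicate or partial files, since the framework's output-committer cleanup only covers the normal output path.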
> > > >>
> > > >> Alex
> > > >>
> > > >> On Thu, Jul 1, 2010 at 3:07 PM, Pramy Bhats <pramybhats@googlemail.com>
> > > >> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > I am using the Hadoop framework for writing MapReduce jobs. I want
> > > >> > to redirect the output of Map into files of my choice and later use
> > > >> > those files as input for the Reduce phase.
> > > >> >
> > > >> >
> > > >> > Could you please suggest how to proceed with this?
> > > >> >
> > > >> > thanks,
> > > >> > --PB.
> > > >> >
> > > >>
> > > >
> > >
> >
> >
>
