hadoop-hdfs-user mailing list archives

From Fatih Haltas <fatih.hal...@nyu.edu>
Subject Re: 2 Reduce method in one Job
Date Sun, 24 Mar 2013 13:53:56 GMT
Thank you very much.

You are right, Harsh, that is exactly what I am trying to do.

I want to process my result according to the keys, and I do not want to spend
time writing this data to HDFS. I want to pass the data as input to another reduce.

One more question, then:
If I create 2 different jobs, where the second one has only a reduce, for
example, is it possible to pass the first job's output as an argument to the
second job?


On Sun, Mar 24, 2013 at 5:44 PM, Harsh J <harsh@cloudera.com> wrote:

> You seem to want to re-sort/partition your data without materializing
> it onto HDFS.
>
> Azuryy is right: There isn't a way right now and a second job (with an
> identity mapper) is necessary. With YARN it becomes more feasible to
> implement this in the project, though.
>
> The newly inducted incubator project Tez sorta targets this. It's in
> its nascent stages though (for general user use), and the website
> should hopefully appear at
> http://incubator.apache.org/projects/tez.html soon. Meanwhile, you can
> read the proposal behind this project at
> http://wiki.apache.org/incubator/TezProposal. Initial sources are at
> https://svn.apache.org/repos/asf/incubator/tez/trunk/.
>
> On Sun, Mar 24, 2013 at 6:33 PM, Fatih Haltas <fatih.haltas@nyu.edu>
> wrote:
> > I want to get reduce output as key and value, then I want to pass them
> > to a new reduce as input key and input value.
> >
> > So is there any Map-Reduce-Reduce kind of method?
> >
> > Thanks to all.
>
>
>
> --
> Harsh J
>
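
To make the "second job with an identity mapper" idea from the thread concrete, here is a minimal in-memory sketch in plain Java. It does not use any Hadoop classes: the class name, method names, and the word-count example are all hypothetical and only illustrate why the data has to be regrouped (shuffled) by the new keys before a second reduce can run.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only; no Hadoop API is used and all names are hypothetical.
class TwoStageReduce {

    // Stand-in for the shuffle/sort phase: group values by key, with keys sorted.
    static <K extends Comparable<K>, V> TreeMap<K, List<V>> shuffle(List<Map.Entry<K, V>> pairs) {
        TreeMap<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // First reduce: sum the counts per word, then emit (total, word)
    // so that the next stage is keyed by the total instead of the word.
    static List<Map.Entry<Integer, String>> firstReduce(TreeMap<String, List<Integer>> grouped) {
        List<Map.Entry<Integer, String>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            out.add(new SimpleEntry<Integer, String>(sum, e.getKey()));
        }
        return out;
    }

    // Identity map of the second "job": every record passes through unchanged.
    static <K, V> List<Map.Entry<K, V>> identityMap(List<Map.Entry<K, V>> pairs) {
        return new ArrayList<>(pairs);
    }

    // Second reduce: join all words that share the same total.
    static TreeMap<Integer, String> secondReduce(TreeMap<Integer, List<String>> grouped) {
        TreeMap<Integer, String> out = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> e : grouped.entrySet()) {
            out.put(e.getKey(), String.join(",", e.getValue()));
        }
        return out;
    }

    // Job 1 (shuffle + reduce) feeds job 2 (identity map + shuffle + reduce).
    static TreeMap<Integer, String> run(List<Map.Entry<String, Integer>> mapOutput) {
        List<Map.Entry<Integer, String>> job1Output = firstReduce(shuffle(mapOutput));
        return secondReduce(shuffle(identityMap(job1Output)));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        for (String w : "a b a c b a d".split(" ")) {
            mapOutput.add(new SimpleEntry<String, Integer>(w, 1));
        }
        System.out.println(run(mapOutput)); // prints {1=c,d, 2=b, 3=a}
    }
}
```

In Hadoop itself the hand-off between the two stages would go through HDFS: the second job's input path would point at the first job's output directory, which is the materialization step the thread says cannot be avoided at the moment.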
