hadoop-mapreduce-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: [RT] map reduce "pipelines"
Date Wed, 09 Jun 2010 17:52:39 GMT
On chaining, please refer to ChainReducer:
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/ChainReducer.html
Also check out ChainMapper in the same package.

See also
http://www.mail-archive.com/common-user@hadoop.apache.org/msg00541.html
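
To make the chaining idea concrete, here is a minimal in-memory sketch of the M -> M -> R shape in plain Java. It deliberately does not use the Hadoop API: ChainSketch, Mapper, Reducer, chain, run and wordCount are hypothetical names made up for this illustration, and the shuffle is simulated with a sorted map. In Hadoop itself, ChainMapper and ChainReducer do the equivalent wiring inside a single job.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class ChainSketch {
    // Hypothetical stand-ins for Hadoop's Mapper and Reducer (not the real interfaces).
    interface Mapper<K1, V1, K2, V2> {
        List<Map.Entry<K2, V2>> map(K1 key, V1 value);
    }

    interface Reducer<K, V, R> {
        R reduce(K key, List<V> values);
    }

    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // M -> M: feed every record emitted by the first mapper into the second one.
    static <K1, V1, K2, V2, K3, V3> Mapper<K1, V1, K3, V3> chain(
            Mapper<K1, V1, K2, V2> first, Mapper<K2, V2, K3, V3> second) {
        return (key, value) -> {
            List<Map.Entry<K3, V3>> out = new ArrayList<>();
            for (Map.Entry<K2, V2> e : first.map(key, value)) {
                out.addAll(second.map(e.getKey(), e.getValue()));
            }
            return out;
        };
    }

    // Map phase, simulated shuffle (group values by sorted key), then reduce phase.
    static <K1, V1, K2 extends Comparable<K2>, V2, R> Map<K2, R> run(
            List<Map.Entry<K1, V1>> input,
            Mapper<K1, V1, K2, V2> mapper,
            Reducer<K2, V2, R> reducer) {
        Map<K2, List<V2>> groups = new TreeMap<>();
        for (Map.Entry<K1, V1> record : input) {
            for (Map.Entry<K2, V2> e : mapper.map(record.getKey(), record.getValue())) {
                groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }
        Map<K2, R> result = new TreeMap<>();
        for (Map.Entry<K2, List<V2>> g : groups.entrySet()) {
            result.put(g.getKey(), reducer.reduce(g.getKey(), g.getValue()));
        }
        return result;
    }

    // Word count as M (tokenize) -> M (lower-case) -> R (sum).
    static Map<String, Integer> wordCount(List<String> lines) {
        Mapper<Integer, String, String, Integer> tokenize = (lineNo, line) -> {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(entry(word, 1));
                }
            }
            return out;
        };
        Mapper<String, Integer, String, Integer> lowerCase = (word, n) -> {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            out.add(entry(word.toLowerCase(Locale.ROOT), n));
            return out;
        };
        Reducer<String, Integer, Integer> sum = (word, counts) -> {
            int total = 0;
            for (int c : counts) {
                total += c;
            }
            return total;
        };

        List<Map.Entry<Integer, String>> input = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            input.add(entry(i, lines.get(i)));
        }
        Mapper<Integer, String, String, Integer> chained = chain(tokenize, lowerCase);
        return run(input, chained, sum);
    }

    public static void main(String[] args) {
        // prints {chain=1, hadoop=1, hello=2}
        System.out.println(wordCount(Arrays.asList("Hello hadoop", "hello Chain")));
    }
}
```

The chain classes cover the pattern [MAP+ / REDUCE MAP*] within one job. A shape with a second reduce step, such as M -> R -> R, would as far as I know still need a second job fed by the output of the first.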

On Tue, Jun 8, 2010 at 6:02 PM, Torsten Curdt <tcurdt@apache.org> wrote:

> At Cocoon we have a construct that we called pipelines.
> And frankly speaking I am currently missing something similar in hadoop.
>
> It would be so great if the API were to allow things like this
>
> M -> M -> R
>
> M -> R -> R
>
> M --> R
>    \-> R
>
> M --> R
> M -/
>
> Of course, also supporting multiple inputs and outputs.
>
> The current hadoop processing model feels so overly restrictive to me.
> But it could just be me not knowing better.
>
> Any comments?
>
> cheers
> --
> Torsten
>
