hadoop-mapreduce-user mailing list archives

From Jeff Bean <jwfb...@cloudera.com>
Subject Re: When reduce tasks start in MapReduce Streaming?
Date Wed, 16 Jan 2013 09:20:26 GMT
It's called Hadoop Streaming because keys and values are streamed into the
stdin of the script you specify for Hadoop Streaming, and the script's output
is then captured from its stdout.
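
To make the stdin/stdout contract concrete, here is a minimal word-count sketch of a streaming script. The function names and the "map"/"reduce" command-line argument are made up for illustration, and it assumes the default tab-separated key/value lines. The reduce side also assumes its input arrives sorted by key, which is how the framework hands it over once the maps are done.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record per word, as a mapper would write to stdout."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Sum the counts per key. Assumes the input lines are already sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


if __name__ == "__main__":
    # Hypothetical convention: pass "map" or "reduce" as the first argument.
    role = sys.argv[1] if len(sys.argv) > 1 else "map"
    step = map_lines if role == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved as, say, wc.py (a hypothetical name), the same flow can be imitated locally with `cat input.txt | python wc.py map | sort | python wc.py reduce`, where `sort` stands in for the shuffle between the two phases.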

On Wed, Jan 16, 2013 at 1:04 AM, Pedro Sá da Costa <psdc1978@gmail.com> wrote:

> So why is it called Hadoop Streaming if it doesn't behave like a
> streaming application (the reducers don't receive data as soon as it is
> produced by the map tasks)?
>
>
> On 16 January 2013 05:41, Jeff Bean <jwfbean@cloudera.com> wrote:
> > me property. The reduce method is not called until the mappers are done, and
> > the reducers are not scheduled before the threshold set by
> > mapred.reduce.slowstart.completed.maps is reached.
>
>
>
>
> --
> Best regards,
>
