hadoop-mapreduce-user mailing list archives

From brisk <mylinq...@gmail.com>
Subject Re: output/input ratio > 1 for map tasks?
Date Mon, 30 Jul 2012 20:33:43 GMT
Thanks, Niels.

So do you mean that in this case I could expect the map output size (in
bytes) to be larger than the input size (e.g. 64MB by default)? I will
also run a test later...

Best,
Ethan

On Mon, Jul 30, 2012 at 1:15 PM, Niels Basjes <Niels@basjes.nl> wrote:

> Hi,
>
> On Mon, Jul 30, 2012 at 8:47 PM, brisk <mylinqiao@gmail.com> wrote:
> > Does anybody know of cases where the output/input ratio for map tasks
> > is larger than 1? I can only think of sort, where it's 1, and search
> > jobs, where it's usually smaller than 1...
>
> For a simple example: Have a look at the WordCount example.
>
> Input of a single map call is 1 record: "This is a line"
> Output is 4 records:
> This    1
> is      1
> a       1
> line    1
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>
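
To make the byte-size question concrete, the WordCount example above can be sketched in a few lines of Python. This is only a rough illustration, not Hadoop code: `wordcount_map` and `record_bytes` are hypothetical helpers, and the sketch assumes each output record is encoded as "key<TAB>value<NEWLINE>" in UTF-8. Hadoop's actual intermediate records use Writable serialization, so the real byte counts will differ, but the direction of the result should be the same.

```python
def wordcount_map(line):
    """Emit one (word, 1) pair per whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def record_bytes(key, value):
    # Assumption: a record is stored as "key<TAB>value<NEWLINE>" in UTF-8.
    # Hadoop's real Writable serialization produces different sizes.
    return len(f"{key}\t{value}\n".encode("utf-8"))


line = "This is a line"
pairs = wordcount_map(line)

input_bytes = len(line.encode("utf-8"))
output_bytes = sum(record_bytes(k, v) for k, v in pairs)

print(len(pairs), "output records")
print(f"output/input byte ratio: {output_bytes}/{input_bytes} = "
      f"{output_bytes / input_bytes:.2f}")
```

Under these assumptions the map output is larger than the input in both records and bytes, because every word carries its own count and record delimiters. Short words make the effect stronger, since the fixed per-record overhead is a larger share of each record.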
