hadoop-mapreduce-user mailing list archives

From Sudip Sinha <sudipsinha.ba...@gmail.com>
Subject Re: Required condition for the combiner to run
Date Fri, 20 Apr 2012 09:23:06 GMT
Code, input and output - http://pastebin.com/AG2DyZ22
userlogs - http://pastebin.com/bLv2Ad3J, http://pastebin.com/RzzEre1R


On Thu, Apr 19, 2012 at 7:04 PM, Harsh J <harsh@cloudera.com> wrote:

> Can you pastebin and provide your specific mapper userlog
> (syslogs/stderr/stdout)?
>
> On Thu, Apr 19, 2012 at 6:05 PM, Sudip Sinha <sudipsinha.bappa@gmail.com> wrote:
> > Hi,
> >
> > I'm reposting this as I've not received any reply to my earlier post on the
> > same issue.
> >
> > I've read that the combiner only works if it is specified AND the sort
> > memory buffer overflows in the mapper.
> >
> > http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201107.mbox/%3C374D8F3F-B8B1-499F-BEDB-BFEE3219010C@hortonworks.com%3E
> >
> > But when I run a Hadoop streaming job in R using RHadoop, the combiner
> > always runs when specified. This is on a very small dataset.
> >
> > Is this the expected behaviour?
> >
> > More on this: https://github.com/RevolutionAnalytics/RHadoop/issues/70
> >
> > Thanks,
> > Sudip Sinha
>
>
>
> --
> Harsh J
>
