hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Required condition for the combiner to run
Date Thu, 19 Apr 2012 13:34:58 GMT
Could you post your mapper's task userlogs (syslog, stderr, stdout) to a
pastebin and share the link?

On Thu, Apr 19, 2012 at 6:05 PM, Sudip Sinha <sudipsinha.bappa@gmail.com> wrote:
> Hi,
>
> I'm reposting this as I've not received any reply to my earlier post on the
> same issue.
>
> I've read that the combiner runs only if it is specified AND the sort
> memory buffer overflows in the mapper.
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201107.mbox/%3C374D8F3F-B8B1-499F-BEDB-BFEE3219010C@hortonworks.com%3E
>
> But when I run a Hadoop streaming job in R using RHadoop, the combiner
> always runs when specified. This is on a very small dataset.
>
> Is this the desired behaviour?
>
> More on this: https://github.com/RevolutionAnalytics/RHadoop/issues/70
>
> Thanks,
> Sudip Sinha



-- 
Harsh J
