hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HADOOP-2960) A mapper should use some heuristics to decide whether to run the combiner during spills
Date Thu, 17 Jul 2014 21:36:06 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HADOOP-2960.
--------------------------------------

    Resolution: Won't Fix

Closing as won't fix, given the -1.

> A mapper should use some heuristics to decide whether to run the combiner during spills
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2960
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2960
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Runping Qi
>
> Right now, the combiner, if set, will be called for each spill, no matter whether the combiner can actually reduce the values.
> The mapper should use some heuristics to decide whether to run the combiner during spills.
> One such heuristic is to check the ratio of the number of keys to the number of unique keys in the spill.
> The combiner will be called only if that ratio exceeds a certain threshold (say 2).
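
The heuristic proposed in the quoted description can be sketched roughly as follows. This is only an illustration, not Hadoop code: the class name CombinerHeuristic and the method shouldRunCombiner are hypothetical, the spill is modeled as a plain in-memory list of string keys, and "exceeds" is read as a strict comparison. The threshold of 2 is the value suggested in the issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; nothing here is part of the Hadoop API.
class CombinerHeuristic {

    // Returns true when (number of keys) / (number of unique keys)
    // in the spill is strictly greater than the threshold.
    static boolean shouldRunCombiner(List<String> spillKeys, double threshold) {
        if (spillKeys.isEmpty()) {
            return false; // nothing to combine
        }
        Set<String> uniqueKeys = new HashSet<>(spillKeys);
        double ratio = (double) spillKeys.size() / uniqueKeys.size();
        return ratio > threshold;
    }

    public static void main(String[] args) {
        // 6 keys, 2 unique: ratio 3.0, so the combiner would run.
        List<String> repetitive = Arrays.asList("a", "a", "a", "b", "b", "b");
        // 4 keys, 4 unique: ratio 1.0, so the combiner would be skipped.
        List<String> distinct = Arrays.asList("a", "b", "c", "d");

        System.out.println(shouldRunCombiner(repetitive, 2.0)); // true
        System.out.println(shouldRunCombiner(distinct, 2.0));   // false
    }
}
```

With a ratio close to 1, almost every key is unique and the combiner has little to merge, which is the case the issue wanted to skip.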



--
This message was sent by Atlassian JIRA
(v6.2#6252)
