pig-dev mailing list archives

From "Haley Thrapp (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PIG-5356) Add keyword on reducing tasks to allow combiner hints
Date Thu, 30 Aug 2018 17:43:00 GMT
Haley Thrapp created PIG-5356:
---------------------------------

             Summary: Add keyword on reducing tasks to allow combiner hints 
                 Key: PIG-5356
                 URL: https://issues.apache.org/jira/browse/PIG-5356
             Project: Pig
          Issue Type: Wish
            Reporter: Haley Thrapp


Create a keyword that would allow the programmer to tell Pig to disable the combiner for a
particular DISTINCT/GROUP BY statement.

 

Often we have pieces of code that ensure data meets expectations, even if we are pretty
sure it already does. One example is when we do a DISTINCT on data to ensure we do not have
duplicates, or when we re-GROUP data to a certain grain before a join. In these cases, the
combiner takes extra time but does not actually give us a benefit. We can gain a significant
performance improvement in these cases if the combiner is simply not run. In some cases we can
do this at the job level, but in others we may only want the combiner shut off for particular
statements.
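
To illustrate why the combiner gives no benefit on data that is already unique, here is a minimal Python sketch. It is not Pig code and does not reflect how Pig implements its combiner; the function name `shuffle_size` and the sample splits are hypothetical, and the combiner is modeled simply as a per-split de-duplication step before the shuffle.

```python
from typing import Iterable, List


def shuffle_size(splits: Iterable[List[str]], use_combiner: bool) -> int:
    """Count the records that leave the map side for a DISTINCT job.

    Assumption: the combiner is modeled as removing duplicates
    within a single map split before records are shuffled.
    """
    total = 0
    for split in splits:
        if use_combiner:
            # Combiner pass: duplicates inside this split are dropped.
            total += len(set(split))
        else:
            # No combiner: every record is sent to the reducers.
            total += len(split)
    return total


# Hypothetical input that is already unique (the "safety check" case).
unique_splits = [["a", "b", "c"], ["d", "e", "f"]]
# Hypothetical input with many duplicates, for contrast.
dup_splits = [["a", "a", "a"], ["b", "b", "a"]]

unique_with = shuffle_size(unique_splits, use_combiner=True)
unique_without = shuffle_size(unique_splits, use_combiner=False)
dup_with = shuffle_size(dup_splits, use_combiner=True)
dup_without = shuffle_size(dup_splits, use_combiner=False)

print("unique data:", unique_with, "vs", unique_without)
print("duplicate-heavy data:", dup_with, "vs", dup_without)
```

In this model the unique input shuffles the same number of records with or without the combiner, so the combiner pass is pure overhead, while the duplicate-heavy input is where it pays off. A per-statement hint would let the programmer skip the combiner only for the first kind of statement.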



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
