hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-7) Optimize execution of algebraic functions
Date Mon, 03 Dec 2007 22:38:43 GMT

    [ https://issues.apache.org/jira/browse/PIG-7?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548028 ]

Alan Gates commented on PIG-7:
------------------------------

Checked in patch combiner3.patch.  I'm leaving the bug open for a couple of reasons.

1) We need to address the issues Utkarsh identified in this comment: https://issues.apache.org/jira/browse/PIG-7#action_12548021
2) We need to expand the cases in which the combiner is used.  In particular, we need to deal
with the case where the group projection is somewhere other than the 0th position of the projection,
or where it is not projected at all.  This should not be too difficult, but we don't have
time to fix it now.

> Optimize execution of algebraic functions
> -----------------------------------------
>
>                 Key: PIG-7
>                 URL: https://issues.apache.org/jira/browse/PIG-7
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Olga Natkovich
>            Assignee: Alan Gates
>         Attachments: combiner.patch, combiner2.patch, combiner3.patch
>
>
> Algebraic functions are functions that can be computed incrementally, like COUNT(X), SUM(X), etc.
> They can be computed efficiently by doing the first-level computation using the Hadoop combiner.
> This can give a significant (2-3x) speedup for many aggregation queries.
> Several users have asked us for this feature, so it is a fairly high priority.
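
To illustrate the idea in the description above, here is a minimal sketch in plain Python of how an algebraic COUNT can be split into per-map partial results that a combiner collapses before the reduce step. It does not use Pig or Hadoop; the function names (count_initial, count_intermediate, count_final, grouped_count) and the in-memory "splits" are hypothetical and chosen only for illustration. They are not taken from the patch.

```python
from collections import defaultdict


def count_initial(values):
    # First-level computation: turn the raw values seen for one key
    # in one input split into a partial count.
    return len(values)


def count_intermediate(partials):
    # Combiner step: merge partial counts into one partial count.
    return sum(partials)


def count_final(partials):
    # Reduce step: merge the remaining partial counts into the answer.
    return sum(partials)


def grouped_count(splits):
    """Simulate COUNT per group key over several input splits.

    splits is a list of lists of (key, value) records, one inner list
    per simulated map task (an assumption made for this sketch).
    """
    combined = []
    for split in splits:
        per_key = defaultdict(list)
        for key, value in split:
            per_key[key].append(value)
        # The combiner sends one partial per key instead of every record.
        combined.append(
            {k: count_intermediate([count_initial(v)]) for k, v in per_key.items()}
        )

    # Simulated shuffle: collect the partials for each key.
    shuffled = defaultdict(list)
    for partials in combined:
        for key, partial in partials.items():
            shuffled[key].append(partial)

    return {k: count_final(p) for k, p in shuffled.items()}


if __name__ == "__main__":
    splits = [
        [("a", 1), ("b", 2), ("a", 3)],
        [("a", 4), ("b", 5)],
    ]
    print(grouped_count(splits))
```

In this sketch the five input records shrink to four partial counts before the shuffle. On real data with many records per key, that reduction in shuffled data is the kind of saving the combiner is meant to provide.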

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

