Date: Mon, 27 Aug 2012 16:31:08 +1100 (NCT)
From: "Dmitriy V. Ryaboy (JIRA)"
To: pig-dev@hadoop.apache.org
Reply-To: dev@pig.apache.org
Message-ID: <706454205.587.1346045468221.JavaMail.jiratomcat@arcas>
In-Reply-To: <1529296595.2889.1345673202721.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (PIG-2888) Improve performance of POPartialAgg

     [ https://issues.apache.org/jira/browse/PIG-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2888:
-----------------------------------

    Attachment: partialagg_patch_3.patch

Minor logging and spill perf improvements (reusing the iterator, forcing an agg if any list gets too big, being slightly more clever about hashmap sizing).
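
For readers unfamiliar with the idea, here is a rough sketch of what "forcing an agg if any list gets too big" and presizing the hash map could look like. It is not taken from the attached patch and is not Pig's POPartialAgg. The class name PartialAggSketch, the maxListSize threshold, and the use of a plain sum as the aggregate are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of hash-based partial aggregation; not the code from the patch. */
class PartialAggSketch {
    private final int maxListSize;                  // assumed threshold for forcing an aggregation
    private final Map<String, List<Long>> buffered; // raw or partially aggregated values per key
    private int forcedAggs = 0;

    PartialAggSketch(int expectedKeys, int maxListSize) {
        this.maxListSize = maxListSize;
        // Presize the map so expectedKeys entries fit without a rehash
        // at HashMap's default load factor of 0.75.
        int capacity = (int) (expectedKeys / 0.75f) + 1;
        this.buffered = new HashMap<>(capacity);
    }

    /** Buffers a value; collapses the key's list to one partial sum if it grows too big. */
    void add(String key, long value) {
        List<Long> values = buffered.computeIfAbsent(key, k -> new ArrayList<>());
        values.add(value);
        if (values.size() > maxListSize) {
            aggregateInPlace(values);
            forcedAggs++;
        }
    }

    /** Replaces the list contents with a single element holding their sum. */
    private static void aggregateInPlace(List<Long> values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        values.clear();
        values.add(sum);
    }

    /** Aggregates everything still buffered, returns one value per key, and empties the buffer. */
    Map<String, Long> flush() {
        Map<String, Long> out = new HashMap<>();
        for (Map.Entry<String, List<Long>> e : buffered.entrySet()) {
            aggregateInPlace(e.getValue());
            out.put(e.getKey(), e.getValue().get(0));
        }
        buffered.clear();
        return out;
    }

    int forcedAggs() {
        return forcedAggs;
    }
}
```

In this sketch a key's buffered list never holds more than maxListSize + 1 entries, so memory per key stays bounded even when one key dominates the input. A real operator would work on tuples and Algebraic UDF intermediate results rather than longs.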

> Improve performance of POPartialAgg
> -----------------------------------
>
>                 Key: PIG-2888
>                 URL: https://issues.apache.org/jira/browse/PIG-2888
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: partialagg_patch_1.patch, partialagg_patch_2.patch, partialagg_patch_3.patch
>
>
> During performance testing, we found that POPartialAgg can cause performance degradation for Pig jobs when the Algebraic UDFs it's being applied to aren't well suited to the operator's assumptions. Changing the implementation to a more flexible hash-based model can provide significant performance improvements.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira