Message-Id: <3D784BEE-780F-4581-A522-BA9B8E08BE8C@yahoo-inc.com>
From: Chris Olston
Subject: Re: [jira] Updated: (PIG-7) Optimize execution of algebraic functions
Date: Thu, 29 Nov 2007 13:32:41 -0800
To: pig-dev@incubator.apache.org

Awesome!!

On Nov 29, 2007, at 12:25 PM, Alan Gates (JIRA) wrote:

>
>      [ https://issues.apache.org/jira/browse/PIG-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Alan Gates updated PIG-7:
> -------------------------
>
>     Patch Info: [Patch Available]
>
> Attaching patch that implements use of combiner for algebraic
> functions in limited situations.  Algebraic is only applied when
> all functions to be evaluated in a given generate line are
> algebraic and when there is one and only one relation being grouped
> (i.e., it is not applied in cogroup situations).
>
> Initial, very simple, performance tests show a speed up of ~40%
> (13m -> 7.5m for 4G on 10 machines) with the following script:
>
> a = load '/user/pig/tests/data/perf/studenttab200M';
> b = group a by $0;
> c = foreach b generate group, COUNT($1), SUM($1.$2), AVG($1.$2), MIN($1.$1), MAX($1.$2);
> store c into 'bla';
>
>> Optimize execution of algebraic functions
>> -----------------------------------------
>>
>>                 Key: PIG-7
>>                 URL: https://issues.apache.org/jira/browse/PIG-7
>>             Project: Pig
>>          Issue Type: Improvement
>>          Components: impl
>>            Reporter: Olga Natkovich
>>            Assignee: Alan Gates
>>         Attachments: combiner.patch
>>
>>
>> Algebraic are functions that can be computed incrementally like
>> count(X), SUM(X), etc. They can be computed efficiently by doing
>> the first level computation using Hadoop combiner. This can give a
>> significant (2-3x) speedup for many aggregation queries.
>> Several users asked us for this feature so it is pretty high
>> priority.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>

--
Christopher Olston, Ph.D.
Sr. Research Scientist
Yahoo! Research
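
The quoted issue description calls a function algebraic when it can be computed incrementally, so that a combiner can do the first level of the work. A minimal Python sketch of that idea for AVG follows. The stage names initial, intermediate, and final and the simulated map splits are assumptions made for illustration; they are not taken from combiner.patch or from Pig's code.

```python
# Hypothetical three-stage decomposition of AVG, for illustration only.
# The function names below are assumptions, not names from the patch.

def initial(value):
    """Map side: turn one input value into a partial (sum, count) pair."""
    return (value, 1)

def intermediate(partials):
    """Combiner: merge partial (sum, count) pairs into one pair.

    Merging is associative, so this stage can run any number of times.
    """
    total, count = 0, 0
    for s, c in partials:
        total += s
        count += c
    return (total, count)

def final(partials):
    """Reduce side: merge the remaining partials and emit the average."""
    total, count = intermediate(partials)
    return total / count

def avg_with_combiner(splits):
    """Simulate one group whose values are spread over several map splits."""
    # Each non-empty split is combined locally before anything is "shuffled".
    combined = [intermediate(initial(v) for v in split)
                for split in splits if split]
    return final(combined)

if __name__ == "__main__":
    splits = [[1, 2, 3], [4, 5], [6]]
    print(avg_with_combiner(splits))
```

The partial result carries (sum, count) rather than a running average because averages of unequal-sized splits cannot simply be averaged again. COUNT, SUM, MIN, and MAX from the quoted script fit the same pattern with a simpler partial value.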