Date: Mon, 4 Jun 2012 18:01:09 +0530
Subject: Re: Multi-GroupBy-Insert optimization
From: shan s
To: user@hive.apache.org

Anyone?
Thanks.

On Fri, Jun 1, 2012 at 5:25 PM, shan s wrote:

> I am using Multi-GroupBy-Insert. I was expecting a single map-reduce job
> that would club the group-bys together.
> However, it is scheduling n jobs, where n = number of group-bys.
> Could you please explain this behaviour?
>
> FROM X
> INSERT OVERWRITE LOCAL DIRECTORY 'output/y1'
> SELECT a, b, c, count(*)
> GROUP BY a, b, c
> INSERT OVERWRITE LOCAL DIRECTORY 'output/y2'
> SELECT a, count(*)
> GROUP BY a
> INSERT OVERWRITE LOCAL DIRECTORY 'output/y3'
> SELECT b, count(*)
> GROUP BY b
> …..
> …..
> ……