MIME-Version: 1.0
Date: Thu, 2 May 2013 12:24:50 +0530
Subject: Re: Can a bucket be added to a partition?
From: Nitin Pawar
To: user@hive.apache.org
Content-Type: multipart/alternative; boundary=001a11c37c44f5402d04dbb6b563

--001a11c37c44f5402d04dbb6b563
Content-Type: text/plain; charset=ISO-8859-1

You can add buckets to a partition; there is no problem with that.

But for a bucketed map join, both tables need to be bucketed, and their bucket counts need to be multiples of each other: if table A has X buckets, then table B needs N*X buckets, where N >= 1.

There is no condition on the partition keys for the join. Hive only supports equi-joins, so it is always a good idea to have the tables partitioned on the same column. That way you don't have to scan the entire table to match the column values, and you can restrict the data read from each table in the WHERE clause.

On Thu, May 2, 2013 at 10:08 AM, Jie Li wrote:

> I tried this interesting idea but also found it a little confusing.
>
> I guess you'll need to change the table schema so that it has both buckets
> and partitions.
>
> And to take advantage of the buckets inside the partitions, for example
> using the bucket map join, you'll need to specify one particular partition
> of the table. It seems HIVE-3171 has fixed this problem, but I'm still not
> very clear on how two partitioned tables can be joined using bucket map join.
> Do they need the same partition keys and bucket keys, and then Hive will do
> a partition-wise join as well as a bucket-wise join?
>
> Jie
>
>
> On Tue, Apr 30, 2013 at 12:03 PM, Babe Ruth wrote:
>
>> Hello,
>> I have a table that is already created and is partitioned dynamically by
>> day. I would like all future partitions to be bucketed on two columns.
>>
>> Can I add a bucket to a partition in an already existing table?
>>
>>
>>
>> Thanks,
>> George
>>
>
>

--
Nitin Pawar

--001a11c37c44f5402d04dbb6b563
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--001a11c37c44f5402d04dbb6b563--
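
To illustrate the bucket-count condition from the reply, here is a minimal sketch of why the two counts need to be multiples of each other. It assumes rows are assigned to buckets by a non-negative hash of the bucketing column modulo the number of buckets; the function names `bucket_of` and `buckets_align` are hypothetical and are not part of Hive.

```python
def bucket_of(key_hash: int, num_buckets: int) -> int:
    """Assumed assignment rule: non-negative hash modulo the bucket count."""
    return key_hash % num_buckets


def buckets_align(small: int, large: int, sample_hashes=range(10_000)) -> bool:
    """Check the pairing rule 'large bucket j pairs with small bucket j % small'.

    Returns True if, for every sampled hash, the row's bucket in the smaller
    table equals (its bucket in the larger table) % small.
    """
    for h in sample_hashes:
        large_bucket = bucket_of(h, large)
        if large_bucket % small != bucket_of(h, small):
            return False
    return True


if __name__ == "__main__":
    # 12 is a multiple of 4, so every bucket of the 12-bucket table
    # lines up with exactly one bucket of the 4-bucket table.
    print(buckets_align(4, 12))
    # 6 is not a multiple of 4, so matching keys can land in buckets
    # that the pairing rule would never bring together.
    print(buckets_align(4, 6))
```

Under this assumption, a mapper reading one bucket of the larger table only needs the single matching bucket of the smaller table in memory, which is the point of a bucketed map join.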