Date: Thu, 15 Dec 2011 04:13:42 -0800 (PST)
From: Bejoy Ks
Subject: Re: bucketing in hive
To: "user@hive.apache.org", hive dev list

Hi Ranjith,

I'm not aware of any dynamic bucketing in Hive, whereas dynamic partitions are definitely available. With dynamic partitions, your partitions and sub-partitions are generated on the fly based on the value of a particular column. Records with the same value for that column go into the same partition.
However, a dynamic partition load can't be done with a LOAD DATA statement, because it requires running a MapReduce job. For delimited files, you can use dynamic partitions in two steps:

- Load the delimited file into a non-partitioned table in Hive using LOAD DATA.
- Load the data from that source table into the destination table using INSERT OVERWRITE. This triggers a MapReduce job that does the partitioning for you.

I have written something up on this; check whether it's useful for you.
http://kickstarthadoop.blogspot.com/2011/06/how-to-speed-up-your-hive-queries-in.html

Regards
Bejoy.K.S

________________________________
From: "Raghunath, Ranjith"
To: "user@hive.apache.org"; hive dev list
Sent: Thursday, December 15, 2011 7:53 AM
Subject: bucketing in hive

Can one use bucketing in Hive to emulate hash partitions on a database? Is there also a way to segment data into buckets dynamically based on the values in a column? For example:

Col1      Col2
Apple     1
Orange    2
Apple     2
Banana    1

If the file above were inserted into a table with Col1 as the bucket column, can we dynamically place all of the rows with “Apple” in one file, “Orange” in another file, and so on? Is there a way to do this without specifying the bucket size to be 3?

Thank you,
Ranjith
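
For anyone reading this thread later, here is a rough sketch of the two ideas discussed above, using the four sample rows from the question. It is only an illustration under assumptions, not how Hive is implemented: the table names (fruit_staging, fruit_by_name), the file path, and the tab delimiter in the HiveQL string are hypothetical, and zlib.crc32 merely stands in for Hive's own hash function, which is different. The Python part shows why dynamic partitions give one group per distinct Col1 value, while a fixed bucket count only guarantees that equal values land together, not that each value gets its own file.

```python
from collections import defaultdict
import zlib

# Sample rows from the question: (Col1, Col2).
ROWS = [("Apple", 1), ("Orange", 2), ("Apple", 2), ("Banana", 1)]

# HiveQL sketch of the two-step load. Table names, the input path and the
# delimiter are hypothetical; adjust them to your own data.
TWO_STEP_LOAD_HQL = """
-- Step 1: load the delimited file into a non-partitioned staging table.
CREATE TABLE fruit_staging (col1 STRING, col2 INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t';
LOAD DATA LOCAL INPATH '/tmp/fruit.tsv' INTO TABLE fruit_staging;

-- Step 2: INSERT OVERWRITE into the partitioned table (runs a MapReduce job).
CREATE TABLE fruit_by_name (col2 INT) PARTITIONED BY (col1 STRING);
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE fruit_by_name PARTITION (col1)
SELECT col2, col1 FROM fruit_staging;  -- partition column goes last
"""


def dynamic_partitions(rows):
    """Group rows by Col1: one partition per distinct value."""
    parts = defaultdict(list)
    for col1, col2 in rows:
        parts["col1=" + col1].append(col2)
    return dict(parts)


def buckets(rows, num_buckets):
    """Assign rows to a fixed number of buckets by hashing Col1.

    crc32 is a stand-in hash, so the bucket numbers will not match Hive's.
    """
    out = defaultdict(list)
    for col1, col2 in rows:
        idx = zlib.crc32(col1.encode("utf-8")) % num_buckets
        out[idx].append((col1, col2))
    return dict(out)


if __name__ == "__main__":
    print(dynamic_partitions(ROWS))
    print(buckets(ROWS, 3))
```

With the sample rows, dynamic_partitions(ROWS) yields three groups (col1=Apple, col1=Orange, col1=Banana). With buckets(ROWS, 3), both "Apple" rows always share a bucket, but two different fruits may hash to the same bucket, so asking for 3 buckets does not by itself give one fruit per file.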