Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 381BCCB2F for ; Tue, 4 Jun 2013 07:53:19 +0000 (UTC) Received: (qmail 17948 invoked by uid 500); 4 Jun 2013 07:53:17 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 17897 invoked by uid 500); 4 Jun 2013 07:53:17 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 17889 invoked by uid 99); 4 Jun 2013 07:53:16 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 04 Jun 2013 07:53:16 +0000 X-ASF-Spam-Status: No, hits=1.7 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of hamza.asad13@gmail.com designates 209.85.223.173 as permitted sender) Received: from [209.85.223.173] (HELO mail-ie0-f173.google.com) (209.85.223.173) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 04 Jun 2013 07:53:09 +0000 Received: by mail-ie0-f173.google.com with SMTP id k13so13275839iea.18 for ; Tue, 04 Jun 2013 00:52:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=5BI4cZoXOAywKhJ1XjqsvDKVFGxeLNfXz4wg143pgLM=; b=OTxVGV9jUmNcNEN/e31V0aaxi2Jox3OGzv6iJBb0fCg/VU8PQ7heTWkjeSwVSKfeua VgR3aPRbtzwCJlg91eqJBfx+0fvi4fs2iy89fJM/0u/Ie9MLQhthLhZ84f/nOwQjCyBf ca87A7uWdLpMocimUvq/c9cCtcjpRV5QICTOYLn9/AcBcjntMBqKoVW2waC4KBjKDa7G gwWMnLQYIG7RUemT1RMDVRGMQSVG/YCTMrC8U1dmu7YLyfC+7IR4X0yYhE8h3MDlHUtL fUbkhWtva8mkaR2/ZTz3ryJf2N6ldvrrMKagEHafvgZZ3nDfojPje6AtEUs3j3GSA0nB 
Hw3w==
MIME-Version: 1.0
X-Received: by 10.50.44.17 with SMTP id a17mr126386igm.1.1370332369303; Tue, 04 Jun 2013 00:52:49 -0700 (PDT)
Received: by 10.43.3.74 with HTTP; Tue, 4 Jun 2013 00:52:49 -0700 (PDT)
In-Reply-To:
References:
Date: Tue, 4 Jun 2013 12:52:49 +0500
Message-ID:
Subject: Re: How to delete Specific date data using hive QL?
From: Hamza Asad
To: user@hive.apache.org
Content-Type: multipart/alternative; boundary=047d7bdc118418c02404de4f5edf
X-Virus-Checked: Checked by ClamAV on apache.org

--047d7bdc118418c02404de4f5edf
Content-Type: text/plain; charset=ISO-8859-1

Thank u soooo much nitin for your help.. :)

On Tue, Jun 4, 2013 at 12:18 PM, Nitin Pawar wrote:

> 1- Does partitioning improve performance?
> --Only if you make use of partitions in your queries (mostly in where
> clause to limit data to your query for a specific value of partitioned
> column)
>
> 2- Do i have to create partition table new or i can create partition on
> existing table by renaming that date column and add partition column
> event_date (the actual column name) ?
> you can not create partitions on already existing data unless the data is
> in partitioned directories on hdfs.
> I would recommend create a new table with partitioned columns.
> load data from old table into partitioned table
> dump old table
>
> 3- can i import data directly into partition table using sqoop command?
> you can import data directly into a partition.
>
> for exported data, you don't have to worry. it remains as it is
>
>
> On Tue, Jun 4, 2013 at 12:41 PM, Hamza Asad wrote:
>
>> No i don't want to change my queries. I want that my queries work on same
>> table and partition does not change its schema.
>> and from schema i means schema on mysql (exported data).
>>
>> Few more things
>> 1- Does partitioning improve performance?
>> 2- Do i have to create partition table new or i can create partition on
>> existing table by renaming that date column and add partition column
>> event_date (the actual column name) ?
>> 3- can i import data directly into partition table using sqoop command?
>>
>>
>>
>>
>> On Tue, Jun 4, 2013 at 11:40 AM, Nitin Pawar wrote:
>>
>>> partitioning of data in hive is more for the reasons on how you layout
>>> data in a well defined manner so that when you access your data , you
>>> request only for specific data by specifying the partition columns in where
>>> clause.
>>>
>>> to answer your question,
>>> do you have to change your queries? out of the box the queries should
>>> work as it is unless and until you are changing the table schema by
>>> removing/adding new columns.
>>> does the format change when you export data? if your select statement is
>>> not changing it will not change
>>> will table schema change? do you mean schema on hive or mysql ?
>>>
>>>
>>> On Tue, Jun 4, 2013 at 11:37 AM, Hamza Asad wrote:
>>>
>>>> thats far more better :) ..
>>>> Please tell me few more things. Do i have to change my query if i
>>>> create table with partition on date? rest of the columns would be same as
>>>> it is? Also if i export that partitioned table to mysql, does schema of
>>>> that table would same as it was before partition?
>>>>
>>>>
>>>> On Tue, Jun 4, 2013 at 12:09 AM, Stephen Sprague wrote:
>>>>
>>>>> there is no delete semantic.
>>>>>
>>>>> you either partition on the data you want to drop and use drop
>>>>> partition (or drop table for the whole shebang) or you can do as Nitin
>>>>> suggests by selecting the inverse of the data you want to delete and store
>>>>> it back into the table itself. Not ideal but maybe it could work for your
>>>>> situation.
>>>>>
>>>>> Now here's another idea. This was just _recently_ discussed on this
>>>>> group as coincidence would have it. if you were to have scanned just a
>>>>> little of the groups messages you would have seen that and could then have
>>>>> added to the discussion! :)
>>>>>
>>>>>
>>>>> On Mon, Jun 3, 2013 at 2:19 AM, Hamza Asad wrote:
>>>>>
>>>>>> Thanx for your response nitin. Anybody else have any better solution?
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 3, 2013 at 1:27 PM, Nitin Pawar wrote:
>>>>>>
>>>>>>> hive does not give you a record level deletion as of now.
>>>>>>>
>>>>>>> so unless you have partitioned, other option is you overwrite the
>>>>>>> table with data which you want
>>>>>>> please wait for others to suggest you more options. this one is just
>>>>>>> mine and can be costly too
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 3, 2013 at 12:36 PM, Hamza Asad wrote:
>>>>>>>
>>>>>>>> no, its not partitioned by date.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 3, 2013 at 11:19 AM, Nitin Pawar <
>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> how is the data laid out?
>>>>>>>>> is it partitioned data by the date?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 3, 2013 at 11:20 AM, Hamza Asad <
>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Dear all,
>>>>>>>>>> How can i remove data of specific dates from HDFS
>>>>>>>>>> using hive query language?
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nitin Pawar
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nitin Pawar
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Muhammad Hamza Asad*
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Muhammad Hamza Asad*
>>>>
>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>>
>> --
>> *Muhammad Hamza Asad*
>>
>
>
>
> --
> Nitin Pawar
>

--
*Muhammad Hamza Asad*

--047d7bdc118418c02404de4f5edf
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Thank u soooo much nitin for your help.. :)


On Tue, Jun 4, 2013 at 12:18 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
1- Does partitioning improve performance?
--Only if you make use of partitions in your queries (mostly in where clause to limit data to your query for a specific value of partitioned column)

2- Do i have to create partition table new or i can create partition on existing table by renaming that date column and add partition column event_date (the actual column name) ?
you can not create partitions on already existing data unless the data is in partitioned directories on hdfs.
I would recommend create a new table with partitioned columns.
load data from old table into partitioned table
dump old table
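
The "new partitioned table, then load from the old one" route above could be sketched roughly as follows. This is only an illustration under assumptions: the Python helper just builds HiveQL statement text, the table names `events` and `events_part` and the non-partition columns are hypothetical (the thread names only `event_date`), and the partition column is assumed to be a STRING. It relies on Hive's dynamic-partition insert, where the partition column must come last in the SELECT list.

```python
def migrate_to_partitioned_sql(old, new, columns, part_col, part_type="STRING"):
    """Build HiveQL that copies `old` into a new table partitioned on `part_col`.

    `columns` lists (name, type) pairs for every column EXCEPT the partition
    column. All names are illustrative assumptions, not taken from the thread.
    """
    col_defs = ", ".join(f"{name} {typ}" for name, typ in columns)
    col_names = ", ".join(name for name, _ in columns)
    return [
        # The partition column is declared in PARTITIONED BY, not in the column list.
        f"CREATE TABLE {new} ({col_defs}) PARTITIONED BY ({part_col} {part_type});",
        # Let Hive derive one partition per distinct value of part_col.
        "SET hive.exec.dynamic.partition=true;",
        "SET hive.exec.dynamic.partition.mode=nonstrict;",
        # Dynamic-partition insert: the partition column goes last in the SELECT.
        f"INSERT OVERWRITE TABLE {new} PARTITION ({part_col}) "
        f"SELECT {col_names}, {part_col} FROM {old};",
    ]


if __name__ == "__main__":
    for stmt in migrate_to_partitioned_sql(
        "events", "events_part", [("id", "INT"), ("payload", "STRING")], "event_date"
    ):
        print(stmt)
```

Removing the old table is deliberately left out of the sketch; it is safer to do that by hand once the row counts of the two tables have been compared.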

3- can i import data directly into partition table using sqoop command?
you can import data directly into a partition.

for exported data, you don't have to worry. it remains as it is


On Tue, Jun 4, 2013 at 12:41 PM, Hamza Asad <hamza.asad13@gmail.com> wrote:
No i don't want to change my queries. I want that my queries work on same table and partition does not change its schema.
and from schema i means schema on mysql (exported data).

Few more things
1- Does partitioning improve performance?
2- Do i have to create partition table new or i can create partition on existing table by renaming that date column and add partition column event_date (the actual column name) ?
3- can i import data directly into partition table using sqoop command?



On Tue, Jun 4, 2013 at 11:40 AM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
partitioning of data in hive is more for the reasons on how you layout data in a well defined manner so that when you access your data , you request only for specific data by specifying the partition columns in where clause.

to answer your question,
do you have to change your queries? out of the box the queries should work as it is unless and until you are changing the table schema by removing/adding new columns.
does the format change when you export data? if your select statement is not changing it will not change
will table schema change? do you mean schema on hive or mysql ?


On Tue, Jun 4, 2013 at 11:37 AM, Hamza Asad <hamza.asad13@gmail.com> wrote:
thats far more better :) ..
Please tell me few more things. Do i have to change my query if i create table with partition on date? rest of the columns would be same as it is? Also if i export that partitioned table to mysql, does schema of that table would same as it was before partition?


On Tue, Jun 4, 2013 at 12:09 AM, Stephen Sprague <spragues@gmail.com> wrote:
there is no delete semantic.

you either partition on the data you want to drop and use drop partition (or drop table for the whole shebang) or you can do as Nitin suggests by selecting the inverse of the data you want to delete and store it back into the table itself.  Not ideal but maybe it could work for your situation.
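
The two options described above might look roughly like this in HiveQL. The sketch is illustrative only: the Python helpers just build statement text, the table name `events` is hypothetical (the thread names only the `event_date` column), and the date values are assumed to be stored as strings. The values are interpolated without escaping, so this is not meant for untrusted input.

```python
def drop_partition_sql(table, column, day):
    """HiveQL that drops one date partition.

    Only applies when `table` is partitioned on `column`.
    """
    return f"ALTER TABLE {table} DROP IF EXISTS PARTITION ({column}='{day}');"


def overwrite_without_dates_sql(table, column, days):
    """HiveQL that rewrites an unpartitioned table, keeping every row except `days`.

    This is the "select the inverse and store it back" approach; it rescans
    and rewrites the whole table, so it can be costly.
    """
    quoted = ", ".join(f"'{d}'" for d in days)
    # Rows with a NULL date are kept explicitly; NOT IN alone would drop them.
    return (
        f"INSERT OVERWRITE TABLE {table} "
        f"SELECT * FROM {table} "
        f"WHERE {column} IS NULL OR {column} NOT IN ({quoted});"
    )


if __name__ == "__main__":
    print(drop_partition_sql("events", "event_date", "2013-06-01"))
    print(overwrite_without_dates_sql("events", "event_date", ["2013-06-01", "2013-06-02"]))
```

Dropping a partition only removes that partition's directory, while the overwrite variant has to read and rewrite everything that is kept, which is why partitioning on the date is the cheaper choice when deletes by date are routine.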

Now here's another idea.  This was just _recently_ discussed on this group as coincidence would have it.  if you were to have scanned just a little of the groups messages you would have seen that and could then have added to the discussion! :)


On Mon, Jun 3, 2013 at 2:19 AM, Hamza Asad <hamza.asad13@gmail.com> wrote:
Thanx for your response nitin. Anybody else have any better solution?


On Mon, Jun 3, 2013 at 1:27 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
hive does not give you a record level deletion as of now.

so unless you have partitioned, other option is you overwrite the table with data which you want
please wait for others to suggest you more options. this one is just mine and can be costly too


On Mon, Jun 3, 2013 at 12:36 PM, Hamza Asad <hamza.asad13@gmail.com> wrote:
no, its not partitioned by date.


On Mon, Jun 3, 2013 at 11:19 AM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
how is the data laid out?
is it partitioned data by the date?


On Mon, Jun 3, 2013 at 11:20 AM, Hamza Asad <hamza.asad13@gmail.com> wrote:
Dear all,
How can i remove data of specific dates from HDFS using hive query language?

--
Muhammad Hamza Asad



--
Nitin Pawar



--
Muhammad Hamza Asad



--
Nitin Pawar



--
Muhammad Hamza Asad




--
Muhammad Hamza Asad



--
Nitin Pawar



--
Muhammad Hamza Asad



--
Nitin Pawar



--
Muhammad Hamza Asad
--047d7bdc118418c02404de4f5edf--