hive-user mailing list archives

From Suhail Doshi <digitalwarf...@gmail.com>
Subject Re: Hive Partitioning
Date Thu, 02 Apr 2009 21:23:04 GMT
Interesting. That's semi-problematic; hopefully it gets done at some point =)

Is there a way to partition existing data that was not partitioned
previously or even re-partition data that was incorrectly partitioned?

Suhail
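
A minimal HiveQL sketch of the per-partition approach discussed in this thread, assuming the existing data sits in an unpartitioned table. The table name page_view_raw, its dt and country columns, the HDFS path, and the partition values are all hypothetical and used only for illustration. Because rows are not routed automatically, each target partition is named explicitly:

```sql
-- Loading a file: the partition values are spelled out for each load.
-- LOAD DATA only moves the file, so it must already be in the table's
-- storage format (SEQUENCEFILE here). The path is hypothetical.
LOAD DATA INPATH '/user/hypothetical/page_view/2009-04-01_US'
INTO TABLE page_view PARTITION (dt='2009-04-01', country='US');

-- Repartitioning existing data: one INSERT clause per target partition,
-- each with a WHERE filter that selects the rows belonging to it.
-- page_view_raw is an assumed unpartitioned table that holds dt and
-- country as ordinary columns.
FROM page_view_raw pvr
INSERT OVERWRITE TABLE page_view PARTITION (dt='2009-04-01', country='US')
  SELECT pvr.viewTime, pvr.userid, pvr.page_url, pvr.referrer_url, pvr.ip
  WHERE pvr.dt = '2009-04-01' AND pvr.country = 'US'
INSERT OVERWRITE TABLE page_view PARTITION (dt='2009-04-01', country='CA')
  SELECT pvr.viewTime, pvr.userid, pvr.page_url, pvr.referrer_url, pvr.ip
  WHERE pvr.dt = '2009-04-01' AND pvr.country = 'CA';
```

The partition columns are left out of the SELECT list because their values come from the PARTITION clause. With many distinct (dt, country) pairs this means generating one clause or statement per pair.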

On Thu, Apr 2, 2009 at 1:23 PM, Raghu Murthy <rmurthy@facebook.com> wrote:

> That is correct. You can currently only load to a single partition at a
> time. We don't route rows to different partitions based on values in
> partition key columns. This feature should be a JIRA if there isn't one
> already.
>
>
> On 4/2/09 1:21 PM, "Suhail Doshi" <digitalwarfare@gmail.com> wrote:
>
> > So do you have to explicitly say what the partition values are when you
> > load data into a table?
> >
> > I am guessing there's no way to dynamically set a value upon LOAD DATA
> > for each row since you said functions aren't supported yet.
> >
> > On Thu, Apr 2, 2009 at 11:41 AM, Ashish Thusoo <athusoo@facebook.com> wrote:
> >> dt is just the name of the partitioning column. PARTITIONED BY just
> >> contains the schema information of the partitioning columns.
> >>
> >> Currently, I don't think we support functions while inserting into
> >> partitions, though there is a JIRA open for this...
> >>
> >> https://issues.apache.org/jira/browse/HIVE-50
> >>
> >> Ashish
> >>
> >>
> >> From: digitalwarfare@gmail.com [mailto:digitalwarfare@gmail.com] On Behalf Of Suhail Doshi
> >> Sent: Thursday, April 02, 2009 11:31 AM
> >> To: hive-user@hadoop.apache.org
> >> Subject: Hive Partitioning
> >>
> >> I need some clarification with regard to partitioning.
> >>
> >> CREATE TABLE page_view(viewTime INT, userid BIGINT,
> >>      page_url STRING, referrer_url STRING,
> >>      ip STRING COMMENT 'IP Address of the User')
> >>  COMMENT 'This is the page view table'
> >>  PARTITIONED BY(dt STRING, country STRING)
> >>  STORED AS SEQUENCEFILE;
> >>
> >>
> >> For this statement, what exactly is "dt" in the PARTITIONED BY clause?
> >> Is it possible to partition using a DATE() function on, say, a unix
> >> timestamp?
> >>
> >> --
> >> http://mixpanel.com
> >> Blog: http://blog.mixpanel.com
> >
> >
>
>


-- 
http://mixpanel.com
Blog: http://blog.mixpanel.com
