hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Viraj Bhat" <vi...@yahoo-inc.com>
Subject RE: Creating partitions causes Error in semantic analysis
Date Thu, 17 Jun 2010 06:11:35 GMT
Hi Yongqiang,
 I am using the trunk code. I figured out what the problem was 
INSERT OVERWRITE TABLE newtable
PARTITION (newdatestamp, myregion, myproperty)
SELECT
name,
age,
datestamp as newdatestamp,
region as myregion,
property as myproperty,
from oldtable where datestamp='20100525';

I need to specify the last 3 columns in the order of partitions, which I did not. 

Meanwhile the dynamic partitioning produced a partition which was named "__HIVE_DEFAULT_PARTITION__".
Is this created by default? 

Thanks again for your help.
Viraj

-----Original Message-----
From: yongqiang he [mailto:heyongqiangict@gmail.com] 
Sent: Wednesday, June 16, 2010 5:46 PM
To: hive-user@hadoop.apache.org
Subject: Re: Creating partitions causes Error in semantic analysis

Hive supports dynamic partition ( i think you need to use trunk code
for this feature.?).

here is an example:

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;

create table if not exists nzhang_part1 like srcpart;
create table if not exists nzhang_part2 like srcpart;
describe extended nzhang_part1;

from srcpart
insert overwrite table nzhang_part1 partition (ds, hr) select key,
value, ds, hr where ds <= '2008-04-08'
insert overwrite table nzhang_part2 partition(ds='2008-12-31', hr)
select key, value, hr where ds > '2008-04-08';

On Wed, Jun 16, 2010 at 4:07 PM, Viraj Bhat <viraj@yahoo-inc.com> wrote:
> Hi all,
>
>   I have a table known as "oldtable" which is partitioned by datestamp.
>
>
>
> The schema of the "oldtable" is:
>
>
>
> name string
>
> age bigint
>
> property string
>
> region string
>
> datestamp string
>
>
>
>
>
> I now need to create a new table which is based of this old table and
> partitioned by (datestamp, region, property)
>
>
>
> The DDL for the new table looks like:
>
>
>
> CREATE EXTERNAL TABLE newtable
>
> (
>
> newname string,
>
> newage bigint,
>
> )
>
>
>
> PARTITIONED BY (newdatestamp STRING, myregion STRING, myproperty STRING)
>
>
>
> STORED AS RCFILE
>
> LOCATION '/user/viraj/rcfile;
>
>
>
>
>
> When I try to populate this new table from my old table, I try to use
> partitioning which uses values of old columns.
>
>
>
> INSERT OVERWRITE TABLE newtable
>
> PARTITION (newdatestamp='20100525', region, property)
>
> SELECT
>
>   name,
>
>   age
>
> from oldtable where datestamp='20100525';
>
>
>
> The above statement causes an error and expects hardcoded values for region
> and property.
>
>
>
> FAILED: Error in semantic analysis: Partition column in the partition
> specification does not exist.
>
>
>
> How do I specify the partition information such that the new tables, takes
> values from "property" and "region" from the old table and uses it as
> partitions.
>
>
>
> Is there a better way to achieve the above instead of hard coding values for
> each and every partition?
>
>
>
> =======================================================
>
> Addendum: If the above is possible, how can I define some conditions where I
> need to say, If region is not "us" or "asia", put it in another partition
> known as misc?
>
> =======================================================
>
>
>
>
>
> Thanks Viraj

Mime
View raw message