hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hamza Asad <hamza.asa...@gmail.com>
Subject Re: dynamic Partition not splitting properly
Date Fri, 14 Jun 2013 07:51:18 GMT
i 'm executing following command but it fail to recognize partition column
*INSERT INTO TABLE rc_partition_table
PARTITION (to_date(event_date))
SELECT * FROM events_details*


On Fri, Jun 14, 2013 at 12:41 PM, Nitin Pawar <nitinpawar432@gmail.com>wrote:

> just use hive split function for strings and get the value extracted.
> by the way I am not sure why the to_date function is failing
> stringto_date(string timestamp)Returns the date part of a timestamp
> string: to_date("1970-01-01 00:00:00") = "1970-01-01"
> also, I am sorry i might have misread your question
> what do you mean by "partition column does not accepts
> to_date(event_date) form "
>
>
>
> On Fri, Jun 14, 2013 at 1:04 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>
>> sample row of my data is
>> *591269735,1,1022,2012-06-24
>> 11:08:10.9,null,2,null,null,null,null,null,null,null,null,12,null,null,2,null,null,null,null,4,1,null,null,null,null,null,null
>> *
>>
>> and i want to partition it according to date i.e *2012-06-24 *skiping
>> the hh:mm:ss.ff part
>>
>>
>> On Fri, Jun 14, 2013 at 12:27 PM, Nitin Pawar <nitinpawar432@gmail.com>wrote:
>>
>>> can you provide whats your data and what you want it to look like ?
>>>
>>>
>>> On Fri, Jun 14, 2013 at 12:31 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>>>
>>>> which UDF? it does not take to_date(event_date) column
>>>>
>>>>
>>>> On Fri, Jun 14, 2013 at 11:54 AM, Nitin Pawar <nitinpawar432@gmail.com>wrote:
>>>>
>>>>> use already existing UDFs to split or transform your values the way
>>>>> you want
>>>>>
>>>>>
>>>>> On Fri, Jun 14, 2013 at 12:09 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>>>>>
>>>>>> OIC. I got it. Thanx alot nitin :). One more thing i want to ask
>>>>>> related this issue, if old table contains event_date in format "2012-06-24
>>>>>> 06:04:11.9" then how can i partition it according to date part only?
As
>>>>>> partition column does not accepts to_date(event_date) form.
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 13, 2013 at 5:07 PM, Nitin Pawar <nitinpawar432@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> If the input column value is NULL or empty string, the row will
be put into a special partition, whose name is controlled by the hive parameter hive.exec.default.dynamic.partition.name.
The default value is `__HIVE_DEFAULT_PARTITION__`. Basically this partition will contain all
>>>>>>> "bad" rows whose value are not valid partition names.
>>>>>>>
>>>>>>> so basically you do following things
>>>>>>>
>>>>>>> when you create a partitioned table, your partitioned column
is normally at the end of the table, so when you are inserting data into this partitioned
table, I would recommend using the column names in place select * from
>>>>>>>
>>>>>>> so your insert query should look like
>>>>>>>
>>>>>>> set hive.exec.dynamic.partition=true;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> insert overwrite table new_table partition(event_date) select
col1, col2 .... coln, event_date from old_table;
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>>>>>>>
>>>>>>>> when i browse it in browser, all the data is in *
>>>>>>>> event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070>
>>>>>>>> *, rest of the files does not contains data
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 13, 2013 at 4:52 PM, Nitin Pawar <
>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> what do you mean when you say "it wont split correctly"
?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad <
>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> what if i have data of more then 500 days then how
can i create
>>>>>>>>>> partition on date column by specifying each and every
date? (i knw that
>>>>>>>>>> does not happens in dynamic partition but on dynamic
partition, it wont
>>>>>>>>>> splits correctly).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar <
>>>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> you can partition existing table unless the hdfs
data is laid
>>>>>>>>>>> out in partitioned fashion.
>>>>>>>>>>> your best bet is create a new partitioned table
>>>>>>>>>>> enable dynamic paritionining
>>>>>>>>>>> read from old table and write into new table
>>>>>>>>>>>
>>>>>>>>>>> you can verify the new partitions by using command
"show
>>>>>>>>>>> partitions"
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 13, 2013 at 4:40 PM, Hamza Asad <
>>>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> now i created partition table like
>>>>>>>>>>>> *CREATE TABLE new_rc_partition_cluster_table(
>>>>>>>>>>>>
>>>>>>>>>>>>   id int,
>>>>>>>>>>>>   event_id int,
>>>>>>>>>>>>   user_id BIGINT,
>>>>>>>>>>>>
>>>>>>>>>>>>   intval_1 int ,
>>>>>>>>>>>>   intval_2 int,
>>>>>>>>>>>>   intval_3 int,
>>>>>>>>>>>>   intval_4 int,
>>>>>>>>>>>>   intval_5 int,
>>>>>>>>>>>>   intval_6 int,
>>>>>>>>>>>>   intval_7 int,
>>>>>>>>>>>>   intval_8 int,
>>>>>>>>>>>>   intval_9 int,
>>>>>>>>>>>>   intval_10 int,
>>>>>>>>>>>>   intval_11 int,
>>>>>>>>>>>>   intval_12 int,
>>>>>>>>>>>>   intval_13 int,
>>>>>>>>>>>>   intval_14 int,
>>>>>>>>>>>>   intval_15 int,
>>>>>>>>>>>>   intval_16 int,
>>>>>>>>>>>>   intval_17 int,
>>>>>>>>>>>>   intval_18 int,
>>>>>>>>>>>>   intval_19 int,
>>>>>>>>>>>>   intval_20 int,
>>>>>>>>>>>>   intval_21 int,
>>>>>>>>>>>>   intval_22 int,
>>>>>>>>>>>>   intval_23 int,
>>>>>>>>>>>>   intval_24 int,
>>>>>>>>>>>>   intval_25 int,
>>>>>>>>>>>>   intval_26 int)
>>>>>>>>>>>>   PARTITIONED BY (event_date string)
>>>>>>>>>>>>
>>>>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS
>>>>>>>>>>>> ROW FORMAT DELIMITED
>>>>>>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>>>>>>> STORED AS RCFile;
>>>>>>>>>>>> *
>>>>>>>>>>>>
>>>>>>>>>>>> rest of the commands are same. But this time
INSERT OVERWRITE
>>>>>>>>>>>> query executed and took time but when i queries
from that table, it results
>>>>>>>>>>>> none as it does not contains data. why is
this so? and also please tell me
>>>>>>>>>>>> how can i partition my existing table dynamicaly
on date so that data
>>>>>>>>>>>> splits equally without mentioning date explicitly?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 12, 2013 at 6:50 PM, Nitin Pawar
<
>>>>>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> you did not create partitioned table.
You just created a
>>>>>>>>>>>>> bucketed table.
>>>>>>>>>>>>>
>>>>>>>>>>>>> refer to partitioned table created
>>>>>>>>>>>>> something like
>>>>>>>>>>>>> partitioned by (event_date string)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jun 12, 2013 at 7:17 PM, Hamza
Asad <
>>>>>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> i have created table after enabling
dynamic partition. i
>>>>>>>>>>>>>> partitioned it on date but it is
not splitting data datewise. Below is the
>>>>>>>>>>>>>> query of table created and data insert
>>>>>>>>>>>>>> CREATE TABLE rc_partition_cluster_table(
>>>>>>>>>>>>>>   id int,
>>>>>>>>>>>>>>   event_id int,
>>>>>>>>>>>>>>   user_id BIGINT,
>>>>>>>>>>>>>>   event_date string,
>>>>>>>>>>>>>>   intval_1 int )
>>>>>>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS
>>>>>>>>>>>>>> ROW FORMAT DELIMITED
>>>>>>>>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>>>>>>>>> STORED AS RCFile;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> set hive.exec.dynamic.partition=true;
>>>>>>>>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict;
>>>>>>>>>>>>>> set hive.exec.max.dynamic.partitions=1000;
>>>>>>>>>>>>>> set hive.exec.max.dynamic.partitions.pernode=1000;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> INSERT OVERWRITE TABLE rc_partition_cluster_table
Partition
>>>>>>>>>>>>>> (event_date)
>>>>>>>>>>>>>> SELECT * FROM events_details;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> why it is not working fine?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nitin Pawar
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nitin Pawar
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Muhammad Hamza Asad*
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nitin Pawar
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Muhammad Hamza Asad*
>>>>
>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>>
>> --
>> *Muhammad Hamza Asad*
>>
>
>
>
> --
> Nitin Pawar
>



-- 
*Muhammad Hamza Asad*

Mime
View raw message