hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hamza Asad <hamza.asa...@gmail.com>
Subject Re: dynamic Partition not splitting properly
Date Fri, 14 Jun 2013 08:38:52 GMT
ok .. got it.. Thanx :)
p.s Nitin, have u any idea of indexes? i have emailed issue with
subject "*index
not working*", can u see it please?


On Fri, Jun 14, 2013 at 1:30 PM, Nitin Pawar <nitinpawar432@gmail.com>wrote:

> thats a wrong query
>
> insert into table table_name partition (partition_names) select cols,
> do_data(event_date) from table
>
> this is how it should look like
> hive will take care of inserting into respective partitions after you
> enable dynamic partitions
>
>
> On Fri, Jun 14, 2013 at 1:21 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>
>> i 'm executing following command but it fail to recognize partition column
>> *INSERT INTO TABLE rc_partition_table
>> PARTITION (to_date(event_date))
>> SELECT * FROM events_details*
>>
>>
>> On Fri, Jun 14, 2013 at 12:41 PM, Nitin Pawar <nitinpawar432@gmail.com>wrote:
>>
>>> just use hive split function for strings and get the value extracted.
>>> by the way I am not sure why the to_date function is failing
>>> stringto_date(string timestamp)Returns the date part of a timestamp
>>> string: to_date("1970-01-01 00:00:00") = "1970-01-01"
>>> also, I am sorry i might have misread your question
>>> what do you mean by "partition column does not accepts
>>> to_date(event_date) form "
>>>
>>>
>>>
>>> On Fri, Jun 14, 2013 at 1:04 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>>>
>>>> sample row of my data is
>>>> *591269735,1,1022,2012-06-24
>>>> 11:08:10.9,null,2,null,null,null,null,null,null,null,null,12,null,null,2,null,null,null,null,4,1,null,null,null,null,null,null
>>>> *
>>>>
>>>> and i want to partition it according to date i.e *2012-06-24 *skiping
>>>> the hh:mm:ss.ff part
>>>>
>>>>
>>>> On Fri, Jun 14, 2013 at 12:27 PM, Nitin Pawar <nitinpawar432@gmail.com>wrote:
>>>>
>>>>> can you provide whats your data and what you want it to look like ?
>>>>>
>>>>>
>>>>> On Fri, Jun 14, 2013 at 12:31 PM, Hamza Asad <hamza.asad13@gmail.com>wrote:
>>>>>
>>>>>> which UDF? it does not take to_date(event_date) column
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 14, 2013 at 11:54 AM, Nitin Pawar <
>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>
>>>>>>> use already existing UDFs to split or transform your values the
way
>>>>>>> you want
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 14, 2013 at 12:09 PM, Hamza Asad <hamza.asad13@gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> OIC. I got it. Thanx alot nitin :). One more thing i want
to ask
>>>>>>>> related this issue, if old table contains event_date in format
"2012-06-24
>>>>>>>> 06:04:11.9" then how can i partition it according to date
part only? As
>>>>>>>> partition column does not accepts to_date(event_date) form.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 13, 2013 at 5:07 PM, Nitin Pawar <
>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> If the input column value is NULL or empty string, the
row will be put into a special partition, whose name is controlled by the hive parameter hive.exec.default.dynamic.partition.name.
The default value is `__HIVE_DEFAULT_PARTITION__`. Basically this partition will contain all
>>>>>>>>> "bad" rows whose value are not valid partition names.
>>>>>>>>>
>>>>>>>>> so basically you do following things
>>>>>>>>>
>>>>>>>>> when you create a partitioned table, your partitioned
column is normally at the end of the table, so when you are inserting data into this partitioned
table, I would recommend using the column names in place select * from
>>>>>>>>>
>>>>>>>>> so your insert query should look like
>>>>>>>>>
>>>>>>>>> set hive.exec.dynamic.partition=true;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> insert overwrite table new_table partition(event_date)
select col1, col2 .... coln, event_date from old_table;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad <
>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> when i browse it in browser, all the data is in *
>>>>>>>>>> event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070>
>>>>>>>>>> *, rest of the files does not contains data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 13, 2013 at 4:52 PM, Nitin Pawar <
>>>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> what do you mean when you say "it wont split
correctly" ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad <
>>>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> what if i have data of more then 500 days
then how can i create
>>>>>>>>>>>> partition on date column by specifying each
and every date? (i knw that
>>>>>>>>>>>> does not happens in dynamic partition but
on dynamic partition, it wont
>>>>>>>>>>>> splits correctly).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar
<
>>>>>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> you can partition existing table unless
the hdfs data is laid
>>>>>>>>>>>>> out in partitioned fashion.
>>>>>>>>>>>>> your best bet is create a new partitioned
table
>>>>>>>>>>>>> enable dynamic paritionining
>>>>>>>>>>>>> read from old table and write into new
table
>>>>>>>>>>>>>
>>>>>>>>>>>>> you can verify the new partitions by
using command "show
>>>>>>>>>>>>> partitions"
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jun 13, 2013 at 4:40 PM, Hamza
Asad <
>>>>>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> now i created partition table like
>>>>>>>>>>>>>> *CREATE TABLE new_rc_partition_cluster_table(
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   id int,
>>>>>>>>>>>>>>   event_id int,
>>>>>>>>>>>>>>   user_id BIGINT,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   intval_1 int ,
>>>>>>>>>>>>>>   intval_2 int,
>>>>>>>>>>>>>>   intval_3 int,
>>>>>>>>>>>>>>   intval_4 int,
>>>>>>>>>>>>>>   intval_5 int,
>>>>>>>>>>>>>>   intval_6 int,
>>>>>>>>>>>>>>   intval_7 int,
>>>>>>>>>>>>>>   intval_8 int,
>>>>>>>>>>>>>>   intval_9 int,
>>>>>>>>>>>>>>   intval_10 int,
>>>>>>>>>>>>>>   intval_11 int,
>>>>>>>>>>>>>>   intval_12 int,
>>>>>>>>>>>>>>   intval_13 int,
>>>>>>>>>>>>>>   intval_14 int,
>>>>>>>>>>>>>>   intval_15 int,
>>>>>>>>>>>>>>   intval_16 int,
>>>>>>>>>>>>>>   intval_17 int,
>>>>>>>>>>>>>>   intval_18 int,
>>>>>>>>>>>>>>   intval_19 int,
>>>>>>>>>>>>>>   intval_20 int,
>>>>>>>>>>>>>>   intval_21 int,
>>>>>>>>>>>>>>   intval_22 int,
>>>>>>>>>>>>>>   intval_23 int,
>>>>>>>>>>>>>>   intval_24 int,
>>>>>>>>>>>>>>   intval_25 int,
>>>>>>>>>>>>>>   intval_26 int)
>>>>>>>>>>>>>>   PARTITIONED BY (event_date string)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS
>>>>>>>>>>>>>> ROW FORMAT DELIMITED
>>>>>>>>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>>>>>>>>> STORED AS RCFile;
>>>>>>>>>>>>>> *
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> rest of the commands are same. But
this time INSERT OVERWRITE
>>>>>>>>>>>>>> query executed and took time but
when i queries from that table, it results
>>>>>>>>>>>>>> none as it does not contains data.
why is this so? and also please tell me
>>>>>>>>>>>>>> how can i partition my existing table
dynamicaly on date so that data
>>>>>>>>>>>>>> splits equally without mentioning
date explicitly?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jun 12, 2013 at 6:50 PM,
Nitin Pawar <
>>>>>>>>>>>>>> nitinpawar432@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> you did not create partitioned
table. You just created a
>>>>>>>>>>>>>>> bucketed table.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> refer to partitioned table created
>>>>>>>>>>>>>>> something like
>>>>>>>>>>>>>>> partitioned by (event_date string)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jun 12, 2013 at 7:17
PM, Hamza Asad <
>>>>>>>>>>>>>>> hamza.asad13@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> i have created table after
enabling dynamic partition. i
>>>>>>>>>>>>>>>> partitioned it on date but
it is not splitting data datewise. Below is the
>>>>>>>>>>>>>>>> query of table created and
data insert
>>>>>>>>>>>>>>>> CREATE TABLE rc_partition_cluster_table(
>>>>>>>>>>>>>>>>   id int,
>>>>>>>>>>>>>>>>   event_id int,
>>>>>>>>>>>>>>>>   user_id BIGINT,
>>>>>>>>>>>>>>>>   event_date string,
>>>>>>>>>>>>>>>>   intval_1 int )
>>>>>>>>>>>>>>>> CLUSTERED BY(id) INTO 256
BUCKETS
>>>>>>>>>>>>>>>> ROW FORMAT DELIMITED
>>>>>>>>>>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>>>>>>>>>>> STORED AS RCFile;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> set hive.exec.dynamic.partition=true;
>>>>>>>>>>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict;
>>>>>>>>>>>>>>>> set hive.exec.max.dynamic.partitions=1000;
>>>>>>>>>>>>>>>> set hive.exec.max.dynamic.partitions.pernode=1000;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> INSERT OVERWRITE TABLE rc_partition_cluster_table
Partition
>>>>>>>>>>>>>>>> (event_date)
>>>>>>>>>>>>>>>> SELECT * FROM events_details;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> why it is not working fine?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nitin Pawar
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nitin Pawar
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Muhammad Hamza Asad*
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nitin Pawar
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Muhammad Hamza Asad*
>>>>
>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>>
>> --
>> *Muhammad Hamza Asad*
>>
>
>
>
> --
> Nitin Pawar
>



-- 
*Muhammad Hamza Asad*

Mime
View raw message