hive-issues mailing list archives

From "Amogh Antarkar (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-6589) Automatically add partitions for external tables
Date Mon, 05 Aug 2019 20:14:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900386#comment-16900386 ]

Amogh Antarkar edited comment on HIVE-6589 at 8/5/19 8:13 PM:
--------------------------------------------------------------

Please let us know if there are any updates on this issue or any workarounds for it. Since
Flume is used to stream data into HDFS, it would be great if we did not have to run a separate
batch job to detect partitions for the Hive table. Any help appreciated!
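A possible interim workaround (not part of this ticket; the table, column, and path names are
taken from the example DDL quoted below) is to register each new Flume directory explicitly,
since Hive cannot discover partitions under the bare YYYY/MM/DD/HH layout on its own:

{code}
-- Hedged sketch: manually map one Flume hour directory to a partition.
-- Table and partition column names follow the example in this issue.
ALTER TABLE my_data ADD IF NOT EXISTS
  PARTITION (dt='2014-03-02', hour='01')
  LOCATION '/flume/my_data/2014/03/02/01';
{code}

This still has to be run per directory (e.g. from a scheduled script), which is exactly the
batch job the comment above hopes to avoid.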

 

 


was (Author: amoghantarkar):
Please let me know if there are any updates on this issue or any workarounds for it. Since
Flume is used to stream data into HDFS, it would be great if we did not have to run a separate
batch job to detect partitions for the Hive table. Any help appreciated!

 

 

> Automatically add partitions for external tables
> ------------------------------------------------
>
>                 Key: HIVE-6589
>                 URL: https://issues.apache.org/jira/browse/HIVE-6589
>             Project: Hive
>          Issue Type: New Feature
>    Affects Versions: 0.14.0
>            Reporter: Ken Dallmeyer
>            Priority: Major
>
> I have a data stream being loaded into Hadoop via Flume. It loads into a date partition
> folder in HDFS. The path looks like this:
> {code}/flume/my_data/YYYY/MM/DD/HH
> /flume/my_data/2014/03/02/01
> /flume/my_data/2014/03/02/02
> /flume/my_data/2014/03/02/03{code}
> On top of it I create an EXTERNAL Hive table to do querying. As of now, I have to manually
> add partitions. What I want is for Hive to "discover" those partitions for EXTERNAL tables.
> Additionally, I would like to specify a partition pattern so that when I query, Hive will
> know to use the partition pattern to find the HDFS folder.
> So something like this:
> {code}CREATE EXTERNAL TABLE my_data (
>   col1 STRING,
>   col2 INT
> )
> PARTITIONED BY (
>   dt STRING,
>   hour STRING
> )
> LOCATION 
>   '/flume/my_data'
> TBLPROPERTIES (
>   'hive.partition.spec' = 'dt=$Y-$M-$D, hour=$H',
>   'hive.partition.spec.location' = '$Y/$M/$D/$H'
> );
> {code}
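Worth noting as a hedged aside: Hive's built-in MSCK REPAIR TABLE can already register new
directories, but only when they follow Hive's key=value partition layout rather than the bare
YYYY/MM/DD/HH paths above. A sketch, assuming the Flume HDFS sink could be reconfigured to
write that layout under the same table location:

{code}
-- Assumed directory layout (key=value style, not the layout in this issue):
--   /flume/my_data/dt=2014-03-02/hour=01
--   /flume/my_data/dt=2014-03-02/hour=02

-- Then all new directories can be registered in one statement:
MSCK REPAIR TABLE my_data;
{code}

The proposed 'hive.partition.spec' pattern in the DDL above would remove the need to change
the directory layout at all.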



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
