hadoop-hive-dev mailing list archives

From "Prasad Chakka (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-91) Allow external tables with different partition directory structure
Date Tue, 13 Jan 2009 22:44:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663520#action_12663520 ]

Prasad Chakka commented on HIVE-91:
-----------------------------------

Hive.java:
Can you move the logic of creating a new partition into Partition.java (as a new constructor
method)? I would like to isolate the partition creation code in a single class.
559:560 -> use the log for printing and also throw an exception back to the caller.
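
Something along these lines is what I have in mind (just a rough sketch, not the exact code in
the patch; the accessor and field names below are from memory and may not match exactly):

  // Sketch: a constructor on ql.metadata.Partition that owns creation of a
  // brand-new partition, so Hive.java never assembles the metastore-level
  // object itself. Accessor names are illustrative.
  public Partition(Table tbl, Map<String, String> partSpec, Path location)
      throws HiveException {
    this.table = tbl;
    List<String> pvals = new ArrayList<String>();
    for (FieldSchema pcol : tbl.getPartCols()) {
      String val = partSpec.get(pcol.getName());
      if (val == null) {
        throw new HiveException("partition spec has no value for column " + pcol.getName());
      }
      pvals.add(val);
    }
    org.apache.hadoop.hive.metastore.api.Partition tpart =
        new org.apache.hadoop.hive.metastore.api.Partition();
    tpart.setDbName(tbl.getDbName());
    tpart.setTableName(tbl.getName());
    tpart.setValues(pvals);
    // for the external-table case, point the partition at the caller-supplied directory
    if (location != null) {
      StorageDescriptor sd = new StorageDescriptor(tbl.getTTable().getSd());
      sd.setLocation(location.toString());
      tpart.setSd(sd);
    }
    this.tPartition = tpart;
  }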

DDLSemanticAnalyzer.java:
During semantic analysis of the query we only build up a description of the input in a temporary
structure and leave the actual creation of Partition objects to DDLTask. Look at the
analyzeCreateTable method.
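
Roughly this shape (sketch only; I am making the class name AddPartitionDesc up, the patch can
call it whatever fits; the point is that it mirrors createTableDesc):

  // Sketch: the analyzer only records *what* to create, the way
  // analyzeCreateTable fills in createTableDesc. The class name is made up.
  public class AddPartitionDesc implements Serializable {
    private String tableName;
    private Map<String, String> partSpec;  // partition column -> value
    private String location;               // optional explicit directory, may be null

    public AddPartitionDesc(String tableName, Map<String, String> partSpec, String location) {
      this.tableName = tableName;
      this.partSpec = partSpec;
      this.location = location;
    }

    public String getTableName() { return tableName; }
    public Map<String, String> getPartSpec() { return partSpec; }
    public String getLocation() { return location; }
  }

  // The analyzer then just does something like
  //   rootTasks.add(TaskFactory.get(new DDLWork(addPartDesc), conf));
  // and DDLTask picks the descriptor up at execution time.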

DDLTask.java:
Usually hive.metastore interfaces are not exposed to hive.ql except through hive.ql.metadata.
The rest of hive.ql just uses hive.ql.metadata to access metadata functionality (there are a
couple of instances where hive.metastore is used directly in hive.ql, but that shouldn't happen
unless they are simple model objects without any logic). It may be cleaner if DDLTask calls
Hive.addPartition(tbl, part_vals, location) and lets Hive.java take care of creating the
partition object and making the metastore call.
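
In other words, something like this on the Hive.java side (sketch only; method names such as
getMSC() and getTPartition() are the ones I remember, please double check against the current
code):

  // Sketch: DDLTask never touches the metastore client directly, and failures
  // are logged and rethrown rather than just printed.
  public Partition addPartition(Table tbl, Map<String, String> partSpec, String location)
      throws HiveException {
    try {
      Partition part = new Partition(tbl, partSpec,
          location == null ? null : new Path(location));
      getMSC().add_partition(part.getTPartition());
      return part;
    } catch (Exception e) {
      LOG.error("Unable to add partition to " + tbl.getName(), e);
      throw new HiveException(e);
    }
  }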

Also, the tbl.isExternal() check can be moved out of the for loop. BTW, why do we want to restrict
this to external tables only? The same code could be used when a user creates the partition data
in the location that an internal table expects but just wants to add the metadata, right?
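
For the loop, something like this in DDLTask (sketch; getAddPartitionDescs() is a made-up
accessor on DDLWork, and the isExternal() check can be dropped entirely if we agree internal
tables should be allowed too):

  // check the table once, not once per partition
  if (!tbl.isExternal()) {
    throw new HiveException("add partition with an explicit location is currently only supported for external tables");
  }
  for (AddPartitionDesc addPart : work.getAddPartitionDescs()) {
    db.addPartition(tbl, addPart.getPartSpec(), addPart.getLocation());
  }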




> Allow external tables with different partition directory structure
> ------------------------------------------------------------------
>
>                 Key: HIVE-91
>                 URL: https://issues.apache.org/jira/browse/HIVE-91
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>         Attachments: HIVE-91.patch
>
>
> A lot of users have datasets in a directory structure similar to this in HDFS: /dataset/yyyy/MM/dd/<one
> or more files>
> Instead of loading these into Hive the normal way, it would be useful to create an external
> table with the /dataset location and then one partition per yyyy/MM/dd. This would require
> the "partition naming to directory" function to be made more flexible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

