hadoop-hive-dev mailing list archives

From "Prasad Chakka (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-142) Create a metastore check command
Date Tue, 23 Dec 2008 20:44:46 GMT

    [ https://issues.apache.org/jira/browse/HIVE-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658945#action_12658945 ]

Prasad Chakka commented on HIVE-142:
------------------------------------

The code looks good. A couple of comments, though:

1) msck should throw only HiveException. MetaException and TException are not (and should
not be) exposed to code outside of the metastore package, since the outside code doesn't
know how to deal with them. Also, the error messages carried by those exceptions may not
be very informative for the user.
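
To illustrate point 1, here is a minimal sketch of wrapping a metastore-level exception
at the package boundary. The classes below are hypothetical stand-ins written so the
example compiles on its own; they are not the real Hive classes, and MetastoreChecker and
Msck are names invented for this sketch.

```java
// Hypothetical stand-in for the metastore-internal exception type.
class MetaException extends Exception {
    MetaException(String message) { super(message); }
}

// Hypothetical stand-in for the exception type exposed to callers.
class HiveException extends Exception {
    HiveException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical metastore-facing interface; the real client API may differ.
interface MetastoreChecker {
    java.util.List<String> findMissingPartitions(String table) throws MetaException;
}

class Msck {
    private final MetastoreChecker checker;

    Msck(MetastoreChecker checker) { this.checker = checker; }

    // Only HiveException escapes this method. The metastore exception is kept
    // as the cause, and the message names the table so the user sees context.
    java.util.List<String> check(String table) throws HiveException {
        try {
            return checker.findMissingPartitions(table);
        } catch (MetaException e) {
            throw new HiveException(
                "Metastore check failed for table '" + table + "': " + e.getMessage(), e);
        }
    }
}
```

Callers then need to handle a single exception type, and the original failure is still
available through getCause() for debugging.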

2) Can you add a config variable instead of completely removing the HACK that checks HDFS
for partitions? I think this is needed only temporarily, until we have checked our current
metadata and data completely.
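
A rough sketch of what such a switch could look like. The property name is made up for
illustration and is not an existing Hive setting, and java.util.Properties stands in for
the real configuration class. The default of "true" is an assumption that keeps the
current behaviour until the flag is turned off.

```java
class PartitionFallbackConfig {
    // Hypothetical property name, invented for this sketch.
    static final String KEY = "hive.metastore.hdfs.partition.fallback";

    // Returns true when the HDFS partition lookup should still be used.
    // A missing property falls back to "true", i.e. the existing behaviour.
    static boolean hdfsFallbackEnabled(java.util.Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "true"));
    }
}
```

The HDFS check would then be guarded by hdfsFallbackEnabled(conf) rather than deleted, so
it can be switched off once the metadata has been verified.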

3) Can you make the msck command take more than one partition spec? I don't see any reason
why it should be restricted to just one.
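
As a sketch of point 3, selecting partitions against several specs could look roughly
like this. It assumes a partition and a partition spec are both plain key/value maps,
which is a simplification; PartitionSpecMatcher and its methods are hypothetical names,
not part of the patch.

```java
class PartitionSpecMatcher {
    // A partition matches a spec when every key=value pair of the spec
    // also appears in the partition (a partial spec is allowed).
    static boolean matches(java.util.Map<String, String> partition,
                           java.util.Map<String, String> spec) {
        return partition.entrySet().containsAll(spec.entrySet());
    }

    // Returns the partitions that match at least one of the given specs.
    // With no specs at all, every partition is selected.
    @SafeVarargs
    static java.util.List<java.util.Map<String, String>> select(
            java.util.List<java.util.Map<String, String>> partitions,
            java.util.Map<String, String>... specs) {
        if (specs.length == 0) {
            return new java.util.ArrayList<>(partitions);
        }
        java.util.List<java.util.Map<String, String>> selected = new java.util.ArrayList<>();
        for (java.util.Map<String, String> partition : partitions) {
            for (java.util.Map<String, String> spec : specs) {
                if (matches(partition, spec)) {
                    selected.add(partition);
                    break; // avoid adding the same partition twice
                }
            }
        }
        return selected;
    }
}
```

Taking the specs as varargs means the single-spec case stays a special case of the
general one, so no separate code path is needed.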

> Create a metastore check command
> --------------------------------
>
>                 Key: HIVE-142
>                 URL: https://issues.apache.org/jira/browse/HIVE-142
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>             Fix For: 0.2.0
>
>         Attachments: HIVE-142.patch
>
>
> We need a command to verify that the information in the metastore reflects the data that
> is on hdfs. For example partitions can be deleted on hdfs but still be in the metastore.
> From Joydeep Sen Sarma, see ticket HIVE-126 for the full comment:
> for a command line interface - one might want to check the entire database or just a
> table or even just one partition. other metadata checks will also be added over time (for
> example - do the file types on disk agree with metadata records, bucketing information etc).
> So, here's a strawman proposal for a new command:
> alter table <DB>[.TABLE [PARTITION-SPEC]] check [TYPE-LIST]
> where TYPE by default is 'all' (check for all kinds of errors), but can be specified
> to a specific type. For example - in this case - we can have a type called 'partitions' (and
> then over time we can add other types like 'fileformat' etc.). for v1 - we can just drop the
> type-list altogether.
> the check command can produce a list of things that need to be done to fix the format
> (like adding any directories not in the metastore - but in hdfs - to the metastore). actually
> performing of such steps would require a user confirmation (y/n).
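
The 'partitions' check described in the issue boils down to a two-way set difference
between what the metastore lists and what exists on hdfs. A minimal sketch, assuming both
sides have already been reduced to sets of partition names; PartitionCheckResult is a
hypothetical name, and listing the metastore and the file system is left out.

```java
class PartitionCheckResult {
    // Partition directories found on hdfs that the metastore does not know about.
    final java.util.Set<String> notInMetastore;
    // Partitions recorded in the metastore whose directory is gone from hdfs.
    final java.util.Set<String> notOnHdfs;

    private PartitionCheckResult(java.util.Set<String> notInMetastore,
                                 java.util.Set<String> notOnHdfs) {
        this.notInMetastore = notInMetastore;
        this.notOnHdfs = notOnHdfs;
    }

    // Computes both differences without modifying the input sets.
    static PartitionCheckResult compare(java.util.Set<String> metastorePartitions,
                                        java.util.Set<String> hdfsPartitions) {
        java.util.Set<String> onlyOnHdfs = new java.util.TreeSet<>(hdfsPartitions);
        onlyOnHdfs.removeAll(metastorePartitions);
        java.util.Set<String> onlyInMetastore = new java.util.TreeSet<>(metastorePartitions);
        onlyInMetastore.removeAll(hdfsPartitions);
        return new PartitionCheckResult(onlyOnHdfs, onlyInMetastore);
    }

    boolean isConsistent() {
        return notInMetastore.isEmpty() && notOnHdfs.isEmpty();
    }
}
```

The two sets correspond to the "list of things that need to be done": entries in
notInMetastore are candidates for adding to the metastore, and entries in notOnHdfs are
candidates for removal, each subject to user confirmation.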

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

