hadoop-hive-dev mailing list archives

From "Johan Oskarsson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-142) Create a metastore check command
Date Wed, 10 Dec 2008 12:26:44 GMT

    [ https://issues.apache.org/jira/browse/HIVE-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655186#action_12655186 ]

Johan Oskarsson commented on HIVE-142:
--------------------------------------

In a very basic initial version I thought about implementing the following:
* Check that the following have directories on HDFS
** Tables
** The partitions in the tables (also check that they contain files)
* Check for partitions on HDFS that are unknown to the Metastore

The first version will be read only.
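
A minimal sketch of the checks listed above, not the actual Hive implementation. It assumes the metastore can be represented as a plain dict mapping table names to sets of partition directory names, and it uses the local filesystem as a stand-in for HDFS. The names check_metastore and CheckResult are hypothetical. In keeping with a read-only first version, it only reports findings and changes nothing.

```python
import os
from dataclasses import dataclass, field


@dataclass
class CheckResult:
    """Findings of one read-only check run; nothing is modified."""
    tables_missing_dir: list = field(default_factory=list)
    partitions_missing_dir: list = field(default_factory=list)
    partitions_without_files: list = field(default_factory=list)
    partitions_not_in_metastore: list = field(default_factory=list)


def check_metastore(metastore, warehouse_root):
    """Compare metastore records with the directories under warehouse_root.

    metastore is a hypothetical stand-in for the real metastore: a dict of
    table name -> set of partition directory names, e.g.
    {"logs": {"ds=2008-12-09"}}.
    """
    result = CheckResult()
    for table, known_partitions in sorted(metastore.items()):
        table_dir = os.path.join(warehouse_root, table)
        # Every table should have a directory.
        if not os.path.isdir(table_dir):
            result.tables_missing_dir.append(table)
            continue

        # Every known partition should have a directory containing files.
        for part in sorted(known_partitions):
            part_dir = os.path.join(table_dir, part)
            if not os.path.isdir(part_dir):
                result.partitions_missing_dir.append((table, part))
            elif not any(
                os.path.isfile(os.path.join(part_dir, name))
                for name in os.listdir(part_dir)
            ):
                result.partitions_without_files.append((table, part))

        # Directories on disk that the metastore does not know about.
        on_disk = {
            name
            for name in os.listdir(table_dir)
            if os.path.isdir(os.path.join(table_dir, name))
        }
        for part in sorted(on_disk - known_partitions):
            result.partitions_not_in_metastore.append((table, part))
    return result
```

Each list in the result corresponds to one of the checks above, so a later version could turn the entries into suggested fixes without changing the scan itself.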
If I understood the comments in HIVE-126 correctly, should the code that does the actual
checking be called server side, from HiveMetaStore.java?

> Create a metastore check command
> --------------------------------
>
>                 Key: HIVE-142
>                 URL: https://issues.apache.org/jira/browse/HIVE-142
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore
>    Affects Versions: 0.19.0
>            Reporter: Johan Oskarsson
>             Fix For: 0.19.0
>
>
> We need a command to verify that the information in the metastore reflects the data that
> is on hdfs. For example partitions can be deleted on hdfs but still be in the metastore.
> From Joydeep Sen Sarma, see ticket HIVE-126 for the full comment:
> for a command line interface - one might want to check the entire database or just a
> table or even just one partition. other metadata checks will also be added over time (for
> example - do the file types on disk agree with metadata records, bucketing information etc).
> So, here's a strawman proposal for a new command:
> alter table <DB>[.TABLE [PARTITION-SPEC]] check [TYPE-LIST]
> where TYPE by default is 'all' (check for all kinds of errors), but can be specified
> to a specific type. For example - in this case - we can have a type called 'partitions' (and
> then over time we can add other types like 'fileformat' etc.). for v1 - we can just drop the
> type-list altogether.
> the check command can produce a list of things that need to be done to fix the format
> (like adding any directories not in the metastore - but in hdfs - to the metastore). actually
> performing of such steps would require a user confirmation (y/n).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

