hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions
Date Thu, 30 Mar 2017 21:03:41 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vihang Karajgaonkar updated HIVE-16299:
---------------------------------------
    Affects Version/s:     (was: storage-2.2.0)
                       2.2.0

> MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-16299
>                 URL: https://issues.apache.org/jira/browse/HIVE-16299
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 2.2.0
>            Reporter: Dudu Markovitz
>            Assignee: Vihang Karajgaonkar
>            Priority: Minor
>         Attachments: HIVE-16299.01.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, Set<String> partCols)
> ------------------------------------------------------------------------------------
> MSCK REPAIR validates that every sub-directory has the form col=val and that the table has a partition column named "col".
> However, it does not validate the position of each partition column within the path. As a result, false partitions are created, along with new directories that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:	t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup          0 2017-03-26 13:07 /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:	t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup          0 2017-03-26 13:13 /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup          0 2017-03-26 13:13 /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup          0 2017-03-26 13:13 /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:12 /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:12 /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup          0 2017-03-26 13:12 /user/hive/warehouse/t/c=3/b=2/a=1
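
The check this issue asks for can be sketched as follows. This is a minimal illustration, not the code in HIVE-16299.01.patch and not the actual HiveMetaStoreChecker logic. The class PartitionOrderCheck and the method matchesPartitionOrder are hypothetical names, the partition columns are assumed to be available as an ordered List (the signature quoted above takes a Set<String>), and column names are assumed to compare case-insensitively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hive.
public class PartitionOrderCheck {

    /**
     * Returns true only if relativePath (the path below the table directory)
     * has exactly one "col=val" directory per partition column, in the
     * declared column order.
     */
    static boolean matchesPartitionOrder(String relativePath, List<String> partCols) {
        List<String> keys = new ArrayList<>();
        for (String component : relativePath.split("/")) {
            if (component.isEmpty()) {
                continue; // tolerate a leading or doubled slash
            }
            int eq = component.indexOf('=');
            if (eq <= 0) {
                return false; // not of the form col=val
            }
            keys.add(component.substring(0, eq));
        }
        // Too many or too few levels (e.g. a=1/a=2/a=3/b=4/c=5) is rejected.
        if (keys.size() != partCols.size()) {
            return false;
        }
        // Each level must name the column declared at that position.
        // Assumption: names are compared case-insensitively.
        for (int i = 0; i < keys.size(); i++) {
            if (!keys.get(i).equalsIgnoreCase(partCols.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("a", "b", "c");
        System.out.println(matchesPartitionOrder("a=1/a=2/a=3/b=4/c=5", cols)); // false
        System.out.println(matchesPartitionOrder("c=3/b=2/a=1", cols));         // false
        System.out.println(matchesPartitionOrder("a=1/b=2/c=3", cols));         // true
    }
}
```

With a check of this kind, the directories from e.g. 1 and e.g. 2 would be skipped or reported rather than added as partitions a=3/b=4/c=5 and a=1/b=2/c=3.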



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
