hive-dev mailing list archives

From "darren (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-12314) "insert overwrite" produce redundant directory while multiple execution
Date Mon, 02 Nov 2015 02:58:27 GMT
darren created HIVE-12314:
-----------------------------

             Summary: "insert overwrite" produce redundant directory while multiple execution
                 Key: HIVE-12314
                 URL: https://issues.apache.org/jira/browse/HIVE-12314
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.1.0, 0.13.0
            Reporter: darren


1) Run the following command for the first time:
INSERT OVERWRITE TABLE dest PARTITION (dt='20151026') SELECT * FROM src;

The statement fails while adding the partition to the metadata, even though the data file has
already been copied to the table directory:

hdfs dfs -ls -R /user/hive/warehouse/dest/dt=20151026
-rw------- 3 admin hive 65 2015-10-30 19:34 /user/hive/warehouse/dest/dt=20151026/000000_0
0: jdbc:hive2://ha-cluster/default> show partitions dest;
+------------+
| partition |
+------------+
+------------+
No rows selected (0.154 seconds)

2) Run the same "insert overwrite" again:
INSERT OVERWRITE TABLE dest PARTITION (dt='20151026') SELECT * FROM src;

Whether or not it succeeds this time, the partition directory ends up containing a redundant
subdirectory, as in the following example:

hdfs dfs -ls -R /user/hive/warehouse/dest/ 
drwx------ - admin hive 0 2015-10-30 19:36 /user/hive/warehouse/dest/dt=20151026
-rw------- 3 admin hive 65 2015-10-30 19:34 /user/hive/warehouse/dest/dt=20151026/000000_0
drwxrwxrwx - admin hive 0 2015-10-30 19:36 /user/hive/warehouse/dest/dt=20151026/-ext-10000
-rw------- 3 admin hive 65 2015-10-30 19:36 /user/hive/warehouse/dest/dt=20151026/-ext-10000/000000_0

3) This causes an error when selecting data from the partition:
0: jdbc:hive2://ha-cluster/default> select * from dest where dt='20151026';
Error: java.io.IOException: java.io.IOException: Not a file: hdfs://hacluster/user/hive/warehouse/dest/dt=20151026/-ext-10000
(state=,code=0)

4) The outcome differs between Hive-0.13 and Hive-1.1.0.
For Hive-0.13, it produces a redundant directory.
For Hive-1.1.0, it produces duplicated data.
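
One possible way to spot the leftover directory from step 2 is to filter the recursive listing. This is only a sketch: the function name find_leftover_ext_dirs is made up for illustration, the "-ext-<digits>" pattern is assumed from the single example in this report (other names may occur), and paths are assumed to contain no spaces.

```shell
# Hypothetical helper: reads `hdfs dfs -ls -R` output on stdin and prints
# every directory whose last path component looks like "-ext-<digits>".
find_leftover_ext_dirs() {
  # $1 is the permission string (starts with "d" for directories),
  # $NF is the last field, i.e. the full path.
  awk '$1 ~ /^d/ && $NF ~ /\/-ext-[0-9]+$/ { print $NF }'
}

# Usage, with the table path taken from this report:
#   hdfs dfs -ls -R /user/hive/warehouse/dest | find_leftover_ext_dirs
```

The helper only reports candidates; whether a reported directory can be safely removed has to be checked by hand.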



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
