hive-issues mailing list archives

From "Abhishek Somani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14633) #.of Files in a partition ! = #.Of buckets in a partitioned,bucketed table
Date Tue, 30 Aug 2016 05:48:21 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448135#comment-15448135 ]

Abhishek Somani commented on HIVE-14633:
----------------------------------------

I think this is expected. INSERT INTO just creates the copy files you see, each with the
same bucket id as the original file listed above. This is not expected to affect any
functionality, and Hive handles those copies correctly. Others can confirm.
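
To illustrate the point about copies sharing a bucket id, here is a minimal Python sketch that groups the file names from the listing by bucket. The naming pattern (six-digit bucket id, an attempt suffix, an optional `_copy_<n>` suffix) is an assumption inferred from the listing in this issue, not taken from Hive's source, and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred from the listing in this thread:
# <6-digit bucket id>_<attempt>[_copy_<n>]
BUCKET_FILE = re.compile(r"^(\d{6})_(\d+)(?:_copy_(\d+))?$")


def bucket_id(file_name):
    """Return the bucket id encoded in the leading digits of a file name."""
    match = BUCKET_FILE.match(file_name)
    if match is None:
        raise ValueError(f"unrecognized bucket file name: {file_name}")
    return int(match.group(1))


def group_by_bucket(file_names):
    """Map each bucket id to the list of files that belong to it."""
    groups = defaultdict(list)
    for name in file_names:
        groups[bucket_id(name)].append(name)
    return dict(groups)


# File names as they appear after the 2nd insert in the report.
files = [
    "000000_0", "000000_0_copy_1",
    "000001_0", "000001_0_copy_1",
    "000002_0", "000002_0_copy_1",
    "000003_0", "000003_0_copy_1",
]

groups = group_by_bucket(files)
for bucket in sorted(groups):
    print(bucket, groups[bucket])
```

Under that assumption the partition holds eight files but still only four distinct bucket ids, each backed by two files.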

Do you see any functionality broken due to this?

> #.of Files in a partition ! = #.Of buckets in a partitioned,bucketed table
> --------------------------------------------------------------------------
>
>                 Key: HIVE-14633
>                 URL: https://issues.apache.org/jira/browse/HIVE-14633
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>         Environment: HDP 2.3.2
>            Reporter: Hanu
>
> Ideally the number of files should be equal to the number of buckets declared in the table
> DDL. This holds after the initial insert and after every INSERT OVERWRITE.
> But INSERT INTO a Hive bucketed table creates extra files.
> ex:
> # of Buckets = 4
> No. of files after Initial insert --> 4
> No. of files after 2nd insert --> 8
> No. of files after 3rd insert --> 12
> No. of files after the nth insert --> n * # of Buckets.
> First insert list : 
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
> -rwxrwxrwx   3 hvallur hdfs        308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
> 2nd Insert:
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0_copy_1
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0_copy_1
> -rwxrwxrwx   3 hvallur hdfs        308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
> -rwxrwxrwx   3 hvallur hdfs        302 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0_copy_1
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0_copy_1
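
The file-count pattern reported above (4, 8, 12, ..., n * number of buckets) can be reproduced with a small Python sketch. It only models the file names. The `_copy_<k>` suffix for the third and later inserts is an assumption extrapolated from the `_copy_1` names in the listing, and `simulate_inserts` is a hypothetical helper, not a Hive API.

```python
def simulate_inserts(num_buckets, num_inserts):
    """Model the file names in one partition after repeated INSERT INTO.

    Assumption: the first insert writes <bucket>_0 and the k-th later
    insert writes <bucket>_0_copy_<k>, as the listing suggests.
    """
    files = []
    for insert_no in range(num_inserts):
        for bucket in range(num_buckets):
            base = f"{bucket:06d}_0"
            if insert_no == 0:
                files.append(base)
            else:
                files.append(f"{base}_copy_{insert_no}")
    return sorted(files)


for n in (1, 2, 3):
    names = simulate_inserts(num_buckets=4, num_inserts=n)
    print(f"after insert {n}: {len(names)} files")
```

With four buckets, the simulated names after two inserts match the "2nd Insert" listing, and the count grows by one file per bucket on every further insert.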



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
