hive-issues mailing list archives

From "Naresh P R (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-22255) Hive don't trigger Major Compaction automatically if table contains only base files
Date Tue, 01 Oct 2019 05:21:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941536#comment-16941536 ]

Naresh P R commented on HIVE-22255:
-----------------------------------

[~pvary] As this is an insert overwrite, I assume we hold an exclusive lock on the table.
The Cleaner thread generally runs every 5s. If the compaction flow were creating the new base,
the old base would probably have been cleared by the Cleaner thread. However, I suspect insert overwrite
is leaving stale base folders that are not getting cleared.

> Hive don't trigger Major Compaction automatically if table contains only base files 
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-22255
>                 URL: https://issues.apache.org/jira/browse/HIVE-22255
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Transactions
>    Affects Versions: 3.1.2
>         Environment: Hive-3.1.1
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>
> A user may hit this issue when the table consists only of base files and has no delta files: the following condition then evaluates to false, and automatic major compaction is skipped.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313]
>  
> Steps to Reproduce:
>  # Create an ACID table 
> {code:sql}
> create table myacid(id int);
> {code}
>  # Run multiple insert overwrite statements 
> {code:sql}
> insert overwrite table myacid values(1);
> insert overwrite table myacid values(2),(3),(4);
> {code}
>  # DFS ls output
> {code:java}
> dfs -ls -R /warehouse/tablespace/managed/hive/myacid;
> +----------------------------------------------------+
> |                     DFS Output                     |
> +----------------------------------------------------+
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:42 /warehouse/tablespace/managed/hive/myacid/base_0000001 |
> | -rw-rw----+  3 hive hadoop          1 2019-09-27 16:42 /warehouse/tablespace/managed/hive/myacid/base_0000001/_orc_acid_version |
> | -rw-rw----+  3 hive hadoop        610 2019-09-27 16:42 /warehouse/tablespace/managed/hive/myacid/base_0000001/bucket_00000 |
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:43 /warehouse/tablespace/managed/hive/myacid/base_0000002 |
> | -rw-rw----+  3 hive hadoop          1 2019-09-27 16:43 /warehouse/tablespace/managed/hive/myacid/base_0000002/_orc_acid_version |
> | -rw-rw----+  3 hive hadoop        633 2019-09-27 16:43 /warehouse/tablespace/managed/hive/myacid/base_0000002/bucket_00000 |
> +----------------------------------------------------+{code}
>  
> You will see that major compaction is not triggered until you manually run ALTER TABLE myacid COMPACT 'major'.
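
The skipped compaction described above can be illustrated with a simplified model of a delta-driven threshold check. This is only a sketch, not the actual code in Initiator.java: the class name, the decide method, and its parameters are hypothetical, and the two thresholds are assumed to correspond to hive.compactor.delta.pct.threshold and hive.compactor.delta.num.threshold. It shows why a table holding only base directories never satisfies a condition that depends on delta size or delta count.

```java
public class CompactionDecisionSketch {

    enum CompactionType { NONE, MINOR, MAJOR }

    // Hypothetical, simplified decision rule; the real Initiator logic is more involved.
    static CompactionType decide(long baseSize, long deltaSize, int deltaCount,
                                 float deltaPctThreshold, int deltaNumThreshold) {
        // Deltas exist but no base has been written yet: build a base.
        if (baseSize == 0 && deltaSize > 0) {
            return CompactionType.MAJOR;
        }
        // Deltas have grown large relative to the base.
        if (baseSize > 0 && (float) deltaSize / (float) baseSize > deltaPctThreshold) {
            return CompactionType.MAJOR;
        }
        // Too many small delta directories.
        if (deltaCount > deltaNumThreshold) {
            return CompactionType.MINOR;
        }
        // No rule fired. Stale base directories alone never reach a rule,
        // because every rule above depends on deltaSize or deltaCount.
        return CompactionType.NONE;
    }

    public static void main(String[] args) {
        // Illustrative input: a 633-byte base (as in base_0000002) and no deltas,
        // with assumed thresholds of 0.1 and 10.
        System.out.println(decide(633L, 0L, 0, 0.1f, 10));
    }
}
```

Under this model, the table from the reproduction steps stays at NONE no matter how many insert overwrite statements add further base directories, which matches the reported behavior.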



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
