hbase-issues mailing list archives

From "huaxiang sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17241) Avoid compacting already compacted mob files with _del files
Date Mon, 05 Dec 2016 15:25:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15722533#comment-15722533
] 

huaxiang sun commented on HBASE-17241:
--------------------------------------

Hi [~anoop.hbase], per my understanding, this is the current logic when the sizes of all files
are larger than the threshold. In compactMobFiles(), if the set of partitions is empty, it skips
the rest of the logic and returns. Yeah, this can be optimized a bit.

> Avoid compacting already compacted  mob files with _del files
> -------------------------------------------------------------
>
>                 Key: HBASE-17241
>                 URL: https://issues.apache.org/jira/browse/HBASE-17241
>             Project: HBase
>          Issue Type: Improvement
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17241-master-002.patch, HBASE-17241-master-003.patch, HBASE-17241.master.001.patch
>
>
> Today, if there is only one file in the partition and there are no _del files, the file
is skipped. With a _del file, the current logic is to compact the already-compacted file with
the _del file. Let's say there is one mob file, regionA20161101***, which was already compacted. On 12/1/2016,
there is a _del file, regionB20161201**_del. Mob compaction kicks in; regionA20161101*** is smaller
than the threshold, so it is picked for compaction. Since there is a _del file, regionA20161101****
and regionB20161201***_del are compacted into regionA20161101**_1. After that, regionB20161201**_del
cannot be deleted since it is not an allFile compaction. In the next mob compaction, regionA20161101**_1
and regionB20161201**_del will be picked up again and compacted into regionA20161101***_2.
So in this case, every mob compaction repeats the same work and causes unnecessary I/O.
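
The scenario above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the HBase implementation or the attached patch: the class MobDelCompactionSketch, the MobFile type, the THRESHOLD value, the placeholder file name, and the skipAlreadyCompacted flag are all hypothetical. It counts how often a single, already compacted mob file is rewritten while a _del file stays around, with and without a rule that skips such a file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MobDelCompactionSketch {
    // Hypothetical size threshold; files smaller than this are candidates.
    static final long THRESHOLD = 100L;

    // Hypothetical, simplified stand-in for a mob file.
    static final class MobFile {
        final String name;
        final long size;
        final boolean alreadyCompacted;

        MobFile(String name, long size, boolean alreadyCompacted) {
            this.name = name;
            this.size = size;
            this.alreadyCompacted = alreadyCompacted;
        }
    }

    // Picks the files of one partition that would be compacted.
    // skipAlreadyCompacted = false models the behaviour described in the issue;
    // true models one possible way to avoid the repeated rewrite (an assumption).
    static List<MobFile> select(List<MobFile> partition, boolean hasDelFiles,
                                boolean skipAlreadyCompacted) {
        List<MobFile> selected = new ArrayList<>();
        for (MobFile f : partition) {
            if (f.size < THRESHOLD) {
                selected.add(f);
            }
        }
        if (partition.size() == 1 && selected.size() == 1) {
            MobFile only = selected.get(0);
            // A lone file is skipped when there are no _del files, and, in the
            // hypothetical variant, also when it was already compacted.
            boolean skip = !hasDelFiles || (skipAlreadyCompacted && only.alreadyCompacted);
            if (skip) {
                selected.clear();
            }
        }
        return selected;
    }

    // Runs several compaction rounds; the _del file is never removed because
    // none of the rounds is an all-files compaction.
    static int countRewrites(int rounds, boolean skipAlreadyCompacted) {
        MobFile file = new MobFile("mobFile", 10L, true); // placeholder name
        int rewrites = 0;
        for (int round = 0; round < rounds; round++) {
            List<MobFile> partition = Collections.singletonList(file);
            List<MobFile> selected = select(partition, true, skipAlreadyCompacted);
            if (!selected.isEmpty()) {
                rewrites++;
                // The output gets a new suffix, mirroring the _1, _2 names above.
                file = new MobFile("mobFile_" + rewrites, file.size, true);
            }
        }
        return rewrites;
    }

    public static void main(String[] args) {
        System.out.println("rewrites without the skip: " + countRewrites(3, false));
        System.out.println("rewrites with the skip: " + countRewrites(3, true));
    }
}
```

In this model the file is rewritten once per round without the extra check and not at all with it, which is the unnecessary I/O the description refers to.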



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
