hbase-issues mailing list archives

From "Jingcheng Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15381) Implement a distributed MOB compaction by procedure
Date Fri, 08 Apr 2016 02:57:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231546#comment-15231546 ]

Jingcheng Du commented on HBASE-15381:
--------------------------------------

Thanks [~tedyu].
The sweep tool is a tool, whereas distributed mob compaction is a compaction mechanism that runs
periodically in the cluster.
The sweep tool uses MapReduce: its work is distributed to the RSs by mappers and reducers. It scans
the whole mob table and merges the small mob files that the cells link to (cells in HBase -> mob files).
Distributed mob compaction uses a procedure, and its work is distributed to the RSs by that procedure
as well. It handles the mob files directly, merges the small files into bigger ones, and finally adds
the new reference cells back to HBase by bulk load (mob files -> cells in HBase).
So the sweeper goes "cells in HBase -> mob files" and mob compaction goes "mob files -> cells
in HBase": two different directions.
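
As a rough illustration only (this is not code from HBase or from the attached design), the
"merges the small files into bigger ones" step could start from a selection like the one
below: skip files that are already large and group the remaining small ones into merge
batches. The class name MobFileBatcher, the method selectBatches, and the smallFileThreshold
and maxBatchSize parameters are all hypothetical, and files are represented only by their sizes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names exist in HBase.
public class MobFileBatcher {

    /**
     * Groups the sizes of "small" files (below smallFileThreshold) into
     * batches of at most maxBatchSize files, preserving input order.
     */
    public static List<List<Long>> selectBatches(List<Long> fileSizes,
                                                 long smallFileThreshold,
                                                 int maxBatchSize) {
        if (maxBatchSize < 2) {
            throw new IllegalArgumentException("a merge needs at least two files per batch");
        }
        List<List<Long>> batches = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long size : fileSizes) {
            if (size >= smallFileThreshold) {
                continue; // already big enough, leave this file alone
            }
            current.add(size);
            if (current.size() == maxBatchSize) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        // A single leftover file has nothing to merge with, so keep only batches of 2+.
        if (current.size() > 1) {
            batches.add(current);
        }
        return batches;
    }
}
```

In a distributed setup, each batch would then be a unit of work that can be handed to an RS;
writing the merged file and bulk loading the new reference cells are not shown here.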

> Implement a distributed MOB compaction by procedure
> ---------------------------------------------------
>
>                 Key: HBASE-15381
>                 URL: https://issues.apache.org/jira/browse/HBASE-15381
>             Project: HBase
>          Issue Type: Improvement
>          Components: mob
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: mob distributed compaction design.pdf
>
>
> In MOB, there is a periodic compaction that runs in HMaster (it can be disabled by
configuration); it merges small mob files into bigger ones. Currently this compaction runs
only in HMaster, which is not efficient and might impact the running of HMaster. This JIRA
introduces a distributed MOB compaction: it is triggered by HMaster, but all the compaction
jobs are distributed to HRegionServers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
