hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18084) Improve CleanerChore to clean from directory which consumes more disk space
Date Sun, 21 May 2017 12:39:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018801#comment-16018801
] 

Yu Li commented on HBASE-18084:
-------------------------------

bq. if the initial batch contains large directory
But what if it does not, sir?

Let me say more about my case. The current cleaning logic uses a depth-first algorithm, while
the archive directory hierarchy looks like:
{noformat}
/hbase/archive/data
- namespace
  - table
    - region
      - CF
        - files
{noformat}
And while we are at one leaf directory, getting its file list and cleaning it, flushes are still
ongoing, so the new files will only be picked up when we iterate the other directories later.

In our case the output of {{hadoop fs -count}}, ordered by space usage (descending), looks like:
{noformat}
        2043       686999    770527133663895 /hbase/archive/data/default/pora_6_feature_queue
        2049      3430815    470358930247550 /hbase/archive/data/default/pora_6_feature
       17101       704476    100740814980772 /hbase/archive/data/default/mainv3_ic
       14251       495293     79161730247206 /hbase/archive/data/default/mainv3_main_result_b
       14251       893144     71121202187220 /hbase/archive/data/default/mainv3_main_result_a
        2045        79223     51098022268522 /hbase/archive/data/default/pora_log_wireless_search_item_pv_queue
        2001       123332     49075201291122 /hbase/archive/data/default/mainv3_main_askr_queue_a
        2001        65030     45649351359151 /hbase/archive/data/default/mainv3_main_askr_queue_b
{noformat}
And we have many small directories like:
{noformat}
          13            6             173403 /hbase/archive/data/default/b2b-et2mainse_tisplus_tisplus_IdleFishPool_askr
           3            1             253497 /hbase/archive/data/default/b2b-et2mainse_tisplus_tisplus_buyoffer_searcher_askr
          17           17           15635421 /hbase/archive/data/default/b2b-et2mainse_tisplus_tisplus_cloud_wukuang_askr
          13            6           56062313 /hbase/archive/data/default/b2b-et2mainse_tisplus_tisplus_common_search_askr
           5            2            1165298 /hbase/archive/data/default/b2b-et2mainse_tisplus_tisplus_company_askr
          11            9            1196774 /hbase/archive/data/default/b2b-et2mainse_tisplus_tisplus_content_search_askr
{noformat}
So the largest 3 directories take 1.3PB while the whole archive directory takes 1.8PB, and
the largest directory's name starts with "p". If we use the greedy algorithm, we may choose
{{mainv3_main_askr_queue_a}}, which has 123k files, to clean first, while {{pora_6_feature_queue}}
is still flushing at speed. In the worst case we cannot reach the largest directory for a long
time.
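The ordering argued for above can be sketched in plain Java. This is an illustration, not the actual CleanerChore code: {{DirUsage}} is a hypothetical stand-in for the (path, space consumed) pairs that {{hadoop fs -count}} reports, and the sort simply puts the biggest space consumer first, so a directory like {{pora_6_feature_queue}} would be cleaned before smaller ones regardless of file count or dictionary order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch only: order candidate directories so the largest space
// consumer is cleaned first, instead of cleaning in dictionary order.
public class LargestFirstOrder {

  // Hypothetical holder for one "hadoop fs -count" row.
  static final class DirUsage {
    final String path;
    final long spaceConsumed; // bytes consumed under this directory

    DirUsage(String path, long spaceConsumed) {
      this.path = path;
      this.spaceConsumed = spaceConsumed;
    }
  }

  // Sort in place by space consumed, descending, and return the list.
  static List<DirUsage> orderForCleaning(List<DirUsage> dirs) {
    dirs.sort(Comparator.comparingLong((DirUsage d) -> d.spaceConsumed).reversed());
    return dirs;
  }

  public static void main(String[] args) {
    // Numbers taken from the fs -count output above.
    List<DirUsage> dirs = new ArrayList<>(Arrays.asList(
        new DirUsage("/hbase/archive/data/default/mainv3_main_askr_queue_a", 49_075_201_291_122L),
        new DirUsage("/hbase/archive/data/default/pora_6_feature_queue", 770_527_133_663_895L),
        new DirUsage("/hbase/archive/data/default/mainv3_ic", 100_740_814_980_772L)));
    for (DirUsage d : orderForCleaning(dirs)) {
      System.out.println(d.spaceConsumed + "\t" + d.path);
    }
  }
}
```

In a real HDFS setting the per-directory usage could come from {{FileSystem.getContentSummary(path).getSpaceConsumed()}}, at the cost of an extra namenode call per directory.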

And I agree that it depends on the real case, but in our case the simple method in the current
patch works well, while I'm not sure whether the newly suggested approach will do (smile).

> Improve CleanerChore to clean from directory which consumes more disk space
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-18084
>                 URL: https://issues.apache.org/jira/browse/HBASE-18084
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-18084.patch, HBASE-18084.v2.patch
>
>
> Currently CleanerChore cleans directories in dictionary order rather than starting from the
> directory with the largest space usage. When data abnormally accumulates to a huge volume
> in the archive directory, the cleaning speed might not be enough.
> This proposal is another improvement, working together with HBASE-18083, to resolve our
> online issue (the archive dir consumed more than 1.8PB of SSD space).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
