hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16981) Expand Mob Compaction Partition policy from daily to weekly, monthly and beyond
Date Tue, 15 Nov 2016 18:01:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15667813#comment-15667813
] 

Anoop Sam John commented on HBASE-16981:
----------------------------------------

Just to make sure my thinking is correct.
Say MOB compaction runs daily and, as of now, the partition is also per day.  Now we change
the partition to monthly.  MOB compaction still runs every day, and on day one it produces one
file per region, so there are many files across many regions.  The next day compaction runs
again and, because the partition is monthly, it picks up yesterday's bigger compacted file
along with all of today's small files.  On the 3rd day it again takes yesterday's bigger
compacted file plus today's small files, and so on.  The IO is much higher and keeps growing
every day until month end.  Only at the end of the month is there one file per region. (?)
So if our aim is only to have fewer files, can we think of doing staged compactions? (I don't
know whether that is the correct name.)  What I am thinking is that each day (taking the
frequency as daily) compaction produces a single file from that day's files alone, and this
continues for one week.  At the end of the week a second stage runs: the 7 days' files (6 days
of compacted files + today's files) get compacted into one.  In the same way, at the end of
the month, the one file from each previous week plus this week's files go through another
stage and get compacted into a single file for the month.  The same could perhaps be done at
year end as well.  Just crazy thinking.  I have done no analysis of the code at all, and I am
not sure about the feasibility or complexity.  Just throwing it out here.  I just wanted to
reduce the IO amplification.  Am I expressing my idea clearly?
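
To put rough numbers on the two schemes described in the comment above, here is a minimal
sketch, not taken from the patch or from HBase code. It assumes every day adds the same volume
of new MOB data per region (one "unit"), counts only bytes written by compaction, and ignores
deletes and read cost. The function names and the 30-day month are hypothetical choices made
for illustration.

```python
def monthly_partition_cost(days, daily_units=1):
    """Units written when a monthly partition is compacted every day.

    Each daily run rewrites the previous compacted file plus today's files,
    so the file written on day d holds d days of data.
    """
    written = 0
    compacted = 0  # size of the single compacted file so far
    for _ in range(days):
        compacted += daily_units
        written += compacted
    return written


def staged_cost(days, week=7, daily_units=1):
    """Units written by staged compaction (daily, then weekly, then monthly).

    Assumed model: on ordinary days only that day's files are compacted;
    on the last day of each week the week's daily files plus today's files
    are merged; on the last day of the month everything is merged once.
    """
    written = 0
    daily_files = []   # daily compacted files of the current week
    weekly_files = []  # one compacted file per finished week
    for day in range(1, days + 1):
        if day == days:
            # Month-end stage: weekly files + leftover daily files + today.
            written += sum(weekly_files) + sum(daily_files) + daily_units
        elif day % week == 0:
            # Week-end stage: 6 daily compacted files + today's files.
            size = sum(daily_files) + daily_units
            written += size
            weekly_files.append(size)
            daily_files = []
        else:
            # Ordinary day: compact only today's files.
            written += daily_units
            daily_files.append(daily_units)
    return written


print(monthly_partition_cost(30))  # 465
print(staged_cost(30))             # 83
```

Under these assumptions the daily rewrite of a monthly partition writes 465 units per region
over 30 days, while the staged scheme writes 83, and both end the month with one file per
region. Real ratios would depend on file sizes, deletes, and how the week and month boundaries
are actually chosen.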



> Expand Mob Compaction Partition policy from daily to weekly, monthly and beyond
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-16981
>                 URL: https://issues.apache.org/jira/browse/HBASE-16981
>             Project: HBase
>          Issue Type: New Feature
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-16981.master.001.patch, HBASE-16981.master.002.patch, Supportingweeklyandmonthlymobcompactionpartitionpolicyinhbase.pdf
>
>
> Today the mob region holds all mob files for all regions. With the daily partition mob
> compaction policy, there is still one file per region per day after major mob compaction.
> Given there are 365 days in a year, that is at least 365 files per region per year. Since
> HDFS limits the number of files under one folder, this is not going to scale if there are
> lots of regions. To reduce the number of mob files, we want to introduce other partition
> policies, such as weekly and monthly, that compact the mob files within one week or month
> into one file. This jira is created to track this effort.
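
The file-count arithmetic in the description can be sketched as follows. This is an
illustration only: the function names are hypothetical, 53 is used as an upper bound on the
calendar weeks a year can touch, and the directory limit is passed in as a parameter because
it depends on the cluster configuration (the value 1048576 used below is an assumption and
should be checked against the cluster's own HDFS settings).

```python
# Upper bound on compacted mob files one region leaves behind per year,
# assuming one file per region per partition after major mob compaction.
PARTITIONS_PER_YEAR = {"daily": 365, "weekly": 53, "monthly": 12}


def files_per_year(regions, policy):
    """Mob files accumulated in one year for the given number of regions."""
    return regions * PARTITIONS_PER_YEAR[policy]


def max_regions(dir_limit, policy):
    """Regions that fit under a per-directory file limit for one year of data."""
    return dir_limit // PARTITIONS_PER_YEAR[policy]


ASSUMED_DIR_LIMIT = 1048576  # assumption; check the cluster's HDFS configuration

for policy in ("daily", "weekly", "monthly"):
    print(policy, files_per_year(1000, policy), max_regions(ASSUMED_DIR_LIMIT, policy))
```

With 1000 regions, the daily policy leaves 365000 files after one year, against at most 53000
for weekly and 12000 for monthly.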



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
