hbase-issues mailing list archives

From "Elliott Clark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19646) Add CRON To Major Compaction
Date Wed, 27 Dec 2017 20:22:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304785#comment-16304785 ]

Elliott Clark commented on HBASE-19646:

Jitter is one of the most crucial design components of a production distributed system: it
keeps response times even and reduces spiky resource usage. We should not suggest
configurations that remove this functionality. From experience, having no jitter is the cause
of significant issues in many different HBase users' deployments.

Compactions (including major compactions) should always be an optimization that most users
never need to know about. The default configuration of the system should endeavor to make sure
that compactions happen in the background at a reasonable frequency. To that end we have chosen
1 week (seven days) as the amount of time between forced compactions. For most workloads,
spreading compactions out over a period of several days is a pretty reasonable thing to do.
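
To make the spreading concrete, here is a minimal sketch of how a jittered interval behaves.
It assumes the delay is drawn uniformly from period +/- (period * jitter); the function name,
the uniform draw, and the 0.5 jitter value are illustrative assumptions, not HBase's actual
code.

```python
import random

# 7 days in milliseconds, the default period described in the quoted documentation.
WEEK_MS = 7 * 24 * 60 * 60 * 1000


def next_major_compaction_delay_ms(period_ms, jitter_fraction, rng):
    """Return a delay drawn uniformly from period +/- (period * jitter_fraction).

    Hypothetical helper for illustration only; returns None when the period
    is 0, which the quoted docs describe as disabling time-based compactions.
    """
    if period_ms <= 0:
        return None
    jitter_ms = round(period_ms * jitter_fraction)
    return period_ms + rng.randint(-jitter_ms, jitter_ms)


if __name__ == "__main__":
    rng = random.Random()
    # Each simulated region gets its own delay, so they do not all compact at once.
    for region in range(5):
        delay = next_major_compaction_delay_ms(WEEK_MS, 0.5, rng)
        print(f"region {region}: next major compaction in {delay / 86_400_000:.2f} days")
```

With a jitter of 0 every region would come due at the same moment, which is exactly the
spiky resource usage described above.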

If someone has more specialized knowledge or business needs, then we should give them the
ability to script what they want. There's no reason to rebuild cron in HBase. Almost all the
systems that HBase can be run on already have crond installed.
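
As a rough sketch of what such a script could look like, the following pipes a major_compact
statement into a non-interactive HBase shell (hbase shell -n). The file name, function names,
and table name are hypothetical, and it assumes the hbase launcher is on the PATH of the user
running the cron job.

```python
import subprocess
import sys


def major_compact_command(table):
    """Build the HBase shell statement that requests a major compaction."""
    if "'" in table:
        # Refuse names that would break out of the quoted shell argument.
        raise ValueError("table name must not contain a single quote")
    return f"major_compact '{table}'\n"


def run_major_compaction(table, hbase_bin="hbase"):
    """Feed the statement to a non-interactive HBase shell on stdin."""
    return subprocess.run(
        [hbase_bin, "shell", "-n"],
        input=major_compact_command(table),
        text=True,
        check=True,  # raise if the shell exits with a non-zero status
    )


if __name__ == "__main__" and len(sys.argv) == 2:
    run_major_compaction(sys.argv[1])
```

A crontab entry such as "0 2 * * 6 python3 /path/to/major_compact.py my_table" (path and
table are placeholders) would then request the compaction every Saturday at 02:00. Per the
documentation quoted below, setting hbase.hregion.majorcompaction to 0 turns off the
time-based trigger so that only the scripted schedule applies.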

> Add CRON To Major Compaction
> ----------------------------
>                 Key: HBASE-19646
>                 URL: https://issues.apache.org/jira/browse/HBASE-19646
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: BELUGA BEHR
>            Priority: Minor
> HBase provides _hbase.hregion.majorcompaction_ 
> {quote}
> Time between major compactions, expressed in milliseconds. Set to 0 to disable time-based
> automatic major compactions. User-requested and size-based major compactions will still run.
> This value is multiplied by hbase.hregion.majorcompaction.jitter to cause compaction to start
> at a somewhat-random time during a given window of time. The default value is 7 days, expressed
> in milliseconds. If major compactions are causing disruption in your environment, you can
> configure them to run at off-peak times for your deployment, or disable time-based major compactions
> by setting this parameter to 0, and run major compactions in a cron job or by another external
> {quote}
> Instead of this existing mechanism, that adds randomness into a production system (ugh),
> let's simply allow users to specify a cron string and replace this simple periodic (+jitter)
> scheduling mechanism.  CRON is useful for systems that have known windows of time (i.e. weekend,
> nights) that are known to be good times for compaction.
