cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5850) change gc_grace_seconds default to 28 days
Date Wed, 07 Aug 2013 18:08:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732236#comment-13732236 ]

Jonathan Ellis commented on CASSANDRA-5850:
-------------------------------------------

First, 10 days is not arbitrary.  10 was chosen to allow you to target weekly repairs, plus
a safety margin.

For every user who has trouble running repair weekly, there is at least one user who wishes
Cassandra would reclaim his disk space from tombstones faster.  So there is a balance to strike,
although people with append-only workloads tend not to notice the second part.
                
> change gc_grace_seconds default to 28 days
> ------------------------------------------
>
>                 Key: CASSANDRA-5850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5850
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 2.0 beta 2
>            Reporter: Robert Coli
>            Priority: Trivial
>         Attachments: gc_grace_seconds_to_2419200_seconds_aka_28_days.patch
>
>
> Current default for gc_grace_seconds is 10 days. Attached patch changes all instances
> of this 10-day default to 28 days.
> Rationale:
> - 10 days is arbitrary; there is nothing special about the current value
> - human societies do not operate on cycles which are a multiple of 10 days; they operate
>   on a cycle of 7-day weeks
> - operators must run repair once every gc_grace_seconds, and with typical data sizes
>   (and compaction/streaming throttling) this might run for a significant fraction of 10 days
> - repair often fails, and detecting and working around that failure might also take a
>   significant fraction of 10 days
> - repair is the heaviest operation one can run on a Cassandra cluster, and operators are
>   therefore motivated to run it ~3x less frequently by default
> - the worst-case impact is keeping data around for 18 days longer than the previous default,
>   and this only occurs in CFs which actually take DELETE operations
> - 28 days is an even multiple of 7 days and easily comprehensible as a default time in
>   which to schedule repair
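
For reference, the arithmetic behind the two values under discussion can be checked with a short sketch. The helper name gc_grace_seconds() and the 7 + 3 day split are assumptions made for illustration (the comment above only says "weekly repairs, plus a safety margin"); nothing here is taken from the Cassandra code base or the attached patch.

```python
from datetime import timedelta


def gc_grace_seconds(repair_interval_days: int, safety_margin_days: int) -> int:
    """Hypothetical helper: seconds covering one repair cycle plus a margin."""
    window = timedelta(days=repair_interval_days + safety_margin_days)
    return int(window.total_seconds())


# Assumed split of the 10-day default: weekly repair plus a 3-day margin.
current_default = gc_grace_seconds(7, 3)
# The 28-day value proposed in the issue, with no extra margin added.
proposed_default = gc_grace_seconds(28, 0)
# How much longer tombstones would be kept under the proposal, in days.
extra_days = (proposed_default - current_default) // 86400

print(current_default, proposed_default, extra_days)
```

The 28-day result matches the 2419200 in the attachment name, and the difference matches the 18 days cited in the rationale.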

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
