cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-2156) Compaction Throttling
Date Sun, 13 Feb 2011 01:06:57 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993827#comment-12993827
] 

Stu Hood edited comment on CASSANDRA-2156 at 2/13/11 1:05 AM:
--------------------------------------------------------------

Actually, this throttling probably needs to occur on the read side to properly account for
cases with lots of updates... on the write side, we might have compacted the data down by
32x for example.

EDIT: Oops... it is already read throttled.

      was (Author: stuhood):
    Actually, this throttling probably needs to occur on the read side to properly account
for cases with lots of updates... on the write side, we might have compacted the data down
by 32x for example.
  
> Compaction Throttling
> ---------------------
>
>                 Key: CASSANDRA-2156
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2156
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: for-0.6-0001-Throttle-compaction-to-a-fixed-throughput.txt, for-0.6-0002-Make-compaction-throttling-configurable.txt
>
>
> Compaction is currently relatively bursty: we compact as fast as we can, and then we
wait for the next compaction to be possible ("hurry up and wait").
> Instead, to properly amortize compaction, you'd like to compact exactly as fast as you
need to in order to keep the sstable count under control.
> For every new level of compaction, you need to increase the rate at which you compact:
a rule of thumb that we're testing on our clusters is to determine the maximum number of buckets
a node can support (i.e., if the 15th bucket holds 750 GB, we're not going to have more than
15 buckets), and then multiply the flush throughput by the number of buckets to get the minimum
compaction throughput needed to maintain your sstable count.
> Full explanation: for a min compaction threshold of {{T}}, the bucket at level {{N}}
can contain {{SsubN = T^N}} 'units' (unit == memtable's worth of data on disk). Every time
a new unit is added, it has a {{1/SsubN}} chance of causing the bucket at level N to fill.
If the bucket at level N fills, it causes {{SsubN}} units to be compacted. So, for each active
level in your system you have {{SsubN * 1 / SsubN}}, or {{1}} amortized unit to compact any
time a new unit is added.
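
The rule of thumb and the amortization argument above can be checked with a small sketch. This is not code from the attached patches: the function names, the example figures (5 MB/s flush throughput, threshold 4, 3 levels), and the deterministic counting model (the bucket at level N fills on every T^N-th new unit, rather than with probability 1/T^N) are all assumptions made for illustration.

```python
def min_compaction_throughput(flush_mb_per_s, max_buckets):
    """Rule of thumb from the description: flush throughput times bucket count.

    Both arguments are hypothetical inputs you would measure or estimate
    for your own node.
    """
    return flush_mb_per_s * max_buckets


def amortized_units_compacted(threshold, levels, new_units):
    """Average number of units compacted per newly flushed unit.

    Assumed model: the bucket at level N holds threshold**N units and
    fills exactly once every threshold**N new units. Each fill compacts
    threshold**N units.
    """
    compacted = 0
    for unit in range(1, new_units + 1):
        for level in range(1, levels + 1):
            bucket_size = threshold ** level
            if unit % bucket_size == 0:
                # The bucket at this level just filled: compact all of it.
                compacted += bucket_size
    return compacted / new_units


# 15 buckets at an assumed 5 MB/s flush rate
print(min_compaction_throughput(5, 15))       # 75

# T = 4, 3 active levels, 640 new units (a multiple of 4**3)
print(amortized_units_compacted(4, 3, 640))   # 3.0
```

With 3 active levels the sketch reports 3 units compacted per new unit, i.e. one amortized unit per level, which is why the minimum compaction throughput scales with the number of buckets.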

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
