cassandra-commits mailing list archives

From "Marcus Eriksson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10540) RangeAwareCompaction
Date Thu, 22 Oct 2015 11:17:27 GMT

[ https://issues.apache.org/jira/browse/CASSANDRA-10540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968973#comment-14968973 ]

Marcus Eriksson commented on CASSANDRA-10540:
---------------------------------------------

Pushed a new branch [here|https://github.com/krummas/cassandra/commits/marcuse/rangeawarecompaction] for early feedback; it still needs cleanup and tests.

Enable like this:
{code}
ALTER TABLE x.y WITH compaction = {'class': 'LeveledCompactionStrategy', 'range_aware_compaction': 'true', 'min_range_sstable_size_in_mb': '15'}
{code}

* Run a compaction strategy instance per owned range (with num_tokens=256 and rf=3, we will have 768 * 2 instances, for repaired and unrepaired data).
* To avoid getting very many tiny sstables in the per-range strategies, we keep sstables outside the strategy until the estimated size of a range-sstable is larger than {{'min_range_sstable_size_in_mb'}} (the [estimation|https://github.com/krummas/cassandra/blob/09c58eb4689230d471ef4319733fb0e85399bd3a/src/java/org/apache/cassandra/db/compaction/writers/RangeAwareCompactionWriter.java#L115] usually gets within a few percent of the actual value).
* We do STCS among the multi-range sstables (called "L0", which might not be an optimal name since LCS already uses it).
* We currently prioritize compaction in L0 to get sstables out of there as quickly as possible.
* If an sstable fits within a single range, it is added to the corresponding range-compaction strategy - this should avoid ending up with a lot of L0 sstables after streaming, for example.
* Adds a {{describecompactionstrategy}} nodetool command which displays information about the configured compaction strategy (like sstables per range etc). Example with only unrepaired data and 2 data directories - we first split the owned ranges over those 2 directories, and then split on a per-range basis, so the first RangeAwareCompactionStrategy is responsible for half the data and the second one for the rest:
{code}
$ bin/nodetool describecompactionstrategy keyspace1 standard1

-------------------------------------------------- keyspace1.standard1 --------------------------------------------------
Strategy=class org.apache.cassandra.db.compaction.RangeAwareCompactionStrategy, for 167 unrepaired sstables, boundary tokens=min(-9223372036854775808) -> max(-4095785201827646), location=/home/marcuse/c/d1
Inner strategy: class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy (257 instances, 162 total sstables)
  sstable counts:
            0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
          ------------------------------------------------------------------------------------------
  0.. 29 |  1  3  0  0  2  3  0  3  3  0  3  0  2  1  0  1  0  1  0  3  3  4  1  0  3  1  0  0  0  0
 30.. 59 |  0  0  0  3  0  2  2  0  3  0  3  3  0  1  3  3  3  0  2  0  1  2  0  0  0  1  0  3  0  0
 60.. 89 |  1  0  0  1  1  1  1  0  1  0  2  3  1  0  3  1  2  3  2  0  0  3  2  1  1  0  0  2  3  1
 90..119 |  0  1  2  0  0  3  0  3  3  1  0  0  3  0  2  0  2  0  2  1  3  0  2  1  1  3  1  0  3  0
120..149 |  2  0  3  1  3  0  0  3  3  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
150..179 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
180..209 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
210..239 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
240..257 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Strategy=class org.apache.cassandra.db.compaction.RangeAwareCompactionStrategy, for 221 unrepaired sstables, boundary tokens=max(-4095785201827646) -> max(9223372036854775807), location=/var/lib/c1
Inner strategy: class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy (257 instances, 215 total sstables)
  sstable counts:
            0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
          ------------------------------------------------------------------------------------------
  0.. 29 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 30.. 59 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 60.. 89 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 90..119 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
120..149 |  0  0  0  0  0  0  0  0  0  0  1  6  0  0  3  0  3  0  3  3  3  3  1  0  1  0  2  0  3  2
150..179 |  3  3  3  0  0  3  3  0  3  2  3  1  3  3  3  3  0  0  0  3  0  1  1  0  6  3  3  0  3  3
180..209 |  0  1  1  3  1  3  1  3  3  2  3  3  0  3  0  3  1  0  0  1  2  3  0  0  1  1  0  0  3  3
210..239 |  3  3  3  2  0  6  1  3  0  0  3  3  3  1  3  4  3  3  3  0  3  0  3  1  2  2  0  2  0  0
240..257 |  1  0  3  1  0  3  3  0  0  0  0  0  0  3  3  0  0
Strategy=class org.apache.cassandra.db.compaction.RangeAwareCompactionStrategy, for 0 repaired sstables, boundary tokens=min(-9223372036854775808) -> max(-4095785201827646), location=/home/marcuse/c/d1
Inner strategy: class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy (257 instances, 0 total sstables)
  sstable counts:
            0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
          ------------------------------------------------------------------------------------------
  0.. 29 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 30.. 59 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 60.. 89 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 90..119 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
120..149 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
150..179 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
180..209 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
210..239 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
240..257 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Strategy=class org.apache.cassandra.db.compaction.RangeAwareCompactionStrategy, for 0 repaired sstables, boundary tokens=max(-4095785201827646) -> max(9223372036854775807), location=/var/lib/c1
Inner strategy: class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy (257 instances, 0 total sstables)
  sstable counts:
            0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
          ------------------------------------------------------------------------------------------
  0.. 29 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 30.. 59 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 60.. 89 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 90..119 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
120..149 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
150..179 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
180..209 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
210..239 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
240..257 |  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
{code}
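For anyone skimming the bullets above, the core routing decision (does an sstable fit entirely inside one owned range, or does it stay in "L0" until the per-range size passes the threshold?) can be sketched roughly like this. This is not code from the branch - the class, method names, and the tiny hard-coded boundary list are all invented for illustration, and real boundaries would come from the node's owned token ranges:

```java
import java.util.Arrays;

// Hypothetical sketch of range-aware routing, not the actual patch.
// Range i covers tokens in (BOUNDARIES[i-1], BOUNDARIES[i]].
class RangeRouter {
    static final long[] BOUNDARIES = { -100, 0, 100, Long.MAX_VALUE };

    /**
     * Index of the single owned range that fully contains the sstable's
     * [firstToken, lastToken] span, or -1 if it spans several ranges and
     * therefore stays in the shared "L0" (compacted with STCS).
     */
    static int rangeFor(long firstToken, long lastToken) {
        int first = rangeIndex(firstToken);
        int last = rangeIndex(lastToken);
        return first == last ? first : -1;
    }

    // Index of the first boundary >= token (binary search; a negative
    // result from binarySearch encodes the insertion point).
    static int rangeIndex(long token) {
        int idx = Arrays.binarySearch(BOUNDARIES, token);
        return idx >= 0 ? idx : -idx - 1;
    }

    /**
     * L0 sstables for a range are only handed to the per-range strategy
     * once the estimated range-sstable size passes the configured
     * 'min_range_sstable_size_in_mb' threshold.
     */
    static boolean promoteFromL0(long estimatedRangeBytes, long minRangeBytes) {
        return estimatedRangeBytes >= minRangeBytes;
    }

    public static void main(String[] args) {
        System.out.println(rangeFor(-50, -10)); // fits in one range -> 1
        System.out.println(rangeFor(-50, 50));  // spans two ranges -> -1 (L0)
    }
}
```

The same "fits within a single range" test is what lets freshly streamed sstables (which are naturally range-bounded) skip L0 entirely and go straight to their range's strategy.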

Comments/ideas/worries? [~yukim], [~kohlisankalp], [~iamaleksey], [~jbellis], anyone?

> RangeAwareCompaction
> --------------------
>
>                 Key: CASSANDRA-10540
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10540
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 3.2
>
>
> Broken out from CASSANDRA-6696, we should split sstables based on ranges during compaction.
> Requirements:
> * don't create tiny sstables - keep them bunched together until a single vnode is big enough (configurable how big that is)
> * make it possible to run existing compaction strategies on the per-range sstables
> We should probably add a global compaction strategy parameter that states whether this should be enabled or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
