cassandra-commits mailing list archives

From "Jon Haddad (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8868) JBOD Aware Size Tiered Compaction Strategy
Date Thu, 26 Feb 2015 18:06:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338814#comment-14338814 ]

Jon Haddad commented on CASSANDRA-8868:
---------------------------------------

Good point.  I'm curious which would be more beneficial: continuous compaction or better
failure recovery.  I suspect partitioning will benefit fat nodes just as much as this
strategy, but this strategy may be better for disk utilization and throughput.

If CASSANDRA-6696 is merged and enabled by default, we can close this.

> JBOD Aware Size Tiered Compaction Strategy
> ------------------------------------------
>
>                 Key: CASSANDRA-8868
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8868
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jon Haddad
>            Priority: Minor
>         Attachments: jbod_aware.png
>
>
> I'd like to propose a new compaction strategy targeting JBOD configurations.  I believe
> this strategy would be most useful on machines with 4+ spinning disks, but it should also
> provide some benefit on SSDs.
> There are several goals with this strategy:
> 1. Minimize disk seeks during the compaction process
> 2. Maximize disk throughput when multiple disks are present
> 3. Better data distribution across disks.  Data should automatically be balanced (this
> also applies when adding a new, empty disk)
> When a compaction is to occur, the algorithm first selects a disk to be the receiver.
> This disk should be the one with the most free space.  The disks with the least free space
> should then be chosen as the origin disks.  SSTables are selected from the origin disks and
> compacted to the receiver.  This should both minimize seeks and automatically balance data
> across disks.
> I'm not sure if this would apply to leveled compaction, but it may apply to date tiered.
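
The receiver/origin selection in the quoted description could be sketched roughly as
follows.  This is only an illustration in Python, not Cassandra code: the Disk class,
plan_compaction, compact, and the origin_count parameter are all hypothetical names, and
the merged SSTable is assumed to be as large as the sum of its inputs (real compaction
usually shrinks data, and a size-tiered strategy would also pick candidates by size).

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical model of one JBOD data directory (not a Cassandra class)."""
    name: str
    capacity: int                                 # total bytes
    sstables: dict = field(default_factory=dict)  # sstable name -> size in bytes

    @property
    def free(self) -> int:
        return self.capacity - sum(self.sstables.values())


def plan_compaction(disks, origin_count=1):
    """Return (receiver, origins): the disk with the most free space,
    and the origin_count disks with the least free space."""
    if len(disks) < 2:
        raise ValueError("need at least two disks")
    by_free = sorted(disks, key=lambda d: d.free)  # least free space first
    receiver = by_free[-1]
    # Never let the receiver also be an origin.
    origins = by_free[:min(origin_count, len(by_free) - 1)]
    return receiver, origins


def compact(receiver, origins):
    """Merge every SSTable on the origin disks into one SSTable on the receiver.

    Assumption: merged size equals the sum of the input sizes.
    """
    merged_size = sum(sum(d.sstables.values()) for d in origins)
    if merged_size > receiver.free:
        raise RuntimeError("receiver does not have enough free space")
    for d in origins:
        d.sstables.clear()
    receiver.sstables[f"merged-{len(receiver.sstables)}"] = merged_size
    return merged_size


disks = [
    Disk("d1", 100, {"a": 40, "b": 30}),
    Disk("d2", 100, {"c": 20}),
    Disk("d3", 100),                     # newly added, empty disk
]
receiver, origins = plan_compaction(disks)
moved = compact(receiver, origins)
print(receiver.name, [d.name for d in origins], moved)
```

In this toy run the empty disk is chosen as the receiver and the fullest disk as the
origin, so data shifts toward the new disk, which is the balancing effect the proposal
describes.  Reads come from the origin disks and writes go to the receiver.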



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
