cassandra-commits mailing list archives

From "Lyuben Todorov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
Date Thu, 14 Aug 2014 14:40:13 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lyuben Todorov updated CASSANDRA-7386:
--------------------------------------

    Attachment: sstable-count-second-run.png
                mean-writevalue-7disks.png

After loading 750GB of data into a 7-disk cluster, the test shows what we expected: the drives
that had data written to them previously do indeed have lower write values, and those values
increase as the rest of the drives begin filling up (view graph).

I also tracked sstable creation and compaction. To sum up: on the first run only 2 drives
were used; on the second run another five were added. At the end of the first run, d1 and d2
(disk1 and disk2 respectively) were saturated with sstables: d1 held 357 and d2 held 318. When
the second run was started, another ks, second_run, was created and an additional 5 disks were
used in the node. As expected, the majority of the new sstables went to the five new
directories (view piechart).

The second test, to see whether busy disks are picked for new tables, is coming up.

> JBOD threshold to prevent unbalanced disk utilization
> -----------------------------------------------------
>
>                 Key: CASSANDRA-7386
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7386
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chris Lohfink
>            Assignee: Lyuben Todorov
>            Priority: Minor
>             Fix For: 2.1.1
>
>         Attachments: 7386-v1.patch, 7386v2.diff, Mappe1.ods, mean-writevalue-7disks.png, patch_2_1_branch_proto.diff, sstable-count-second-run.png
>
>
> Currently, disks are picked first by number of current tasks, then by free space. This
> helps with performance but can lead to large differences in utilization in some (unlikely
> but possible) scenarios. I've seen 55% vs 10%, and heard reports of 90% vs 10% on IRC.
> This happens with both LCS and STCS (although my suspicion is that STCS makes it worse,
> since it is harder to keep balanced).
> I propose the algorithm change a little: have some maximum range of utilization beyond
> which it picks by free space over load (acknowledging this can be slower). So if disk A is
> 30% full and disk B is 5% full, it will never pick A over B until they balance out.
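
The proposed rule can be sketched roughly as follows. This is a minimal illustration only, not Cassandra's actual Directories code: the class names, the pending-task/utilization fields, and the 10% spread threshold are all hypothetical stand-ins for whatever the real patch uses.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical disk descriptor; fields are illustrative, not Cassandra's API.
class DiskCandidate {
    final String path;
    final int pendingTasks;   // current flush/compaction tasks on this disk
    final double utilization; // fraction of capacity used, 0.0 - 1.0

    DiskCandidate(String path, int pendingTasks, double utilization) {
        this.path = path;
        this.pendingTasks = pendingTasks;
        this.utilization = utilization;
    }
}

class DiskPicker {
    // Assumed maximum utilization spread before free space overrides load.
    static final double MAX_UTILIZATION_SPREAD = 0.10;

    static DiskCandidate pick(List<DiskCandidate> disks) {
        double min = disks.stream().mapToDouble(d -> d.utilization).min().orElse(0.0);
        double max = disks.stream().mapToDouble(d -> d.utilization).max().orElse(0.0);
        if (max - min > MAX_UTILIZATION_SPREAD) {
            // Too unbalanced: pick the emptiest disk even if it is busier,
            // accepting the possible performance cost until things balance out.
            return disks.stream()
                        .min(Comparator.comparingDouble(d -> d.utilization))
                        .orElseThrow(IllegalStateException::new);
        }
        // Balanced enough: keep the current behaviour -- fewest pending tasks
        // first, then most free space as the tie-breaker.
        return disks.stream()
                    .min(Comparator.<DiskCandidate>comparingInt(d -> d.pendingTasks)
                                   .thenComparingDouble(d -> d.utilization))
                    .orElseThrow(IllegalStateException::new);
    }
}
```

With this rule, a disk at 30% utilization is never chosen over one at 5% (spread 25% > 10%), regardless of how many tasks the emptier disk has queued; only once the spread falls inside the threshold does load ordering apply again.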



--
This message was sent by Atlassian JIRA
(v6.2#6252)
