cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-7386) JBOD threshold to prevent unbalanced disk utilization
Date Tue, 17 Jun 2014 18:33:08 GMT


Robert Stupp updated CASSANDRA-7386:

    Attachment: 7386v2.diff

Here's a working version of the patch.

It adds new metrics to each data directory (see the sketch after the list):
* {{readTasks}} counts the read requests
* {{writeTasks}} counts the write requests
* {{writeValue*}} exposes the "write value" for each data directory as mean and one/five/fifteen-minute rates
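
For illustration, per-directory meters like these could be wired up roughly as follows, assuming the metrics-core 3.x API; the class and metric names here are placeholders, not the patch's actual code:

{code:java}
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

// Hypothetical per-data-directory metrics holder (names are not from the patch).
public class DataDirectoryMetrics
{
    public final Meter readTasks;   // counts read requests against this directory
    public final Meter writeTasks;  // counts write requests against this directory

    public DataDirectoryMetrics(MetricRegistry registry, String dataDirectory)
    {
        // A Meter already exposes mean and 1/5/15-minute rates,
        // matching the writeValue* breakdown described above.
        readTasks  = registry.meter(MetricRegistry.name("DataDirectory", dataDirectory, "readTasks"));
        writeTasks = registry.meter(MetricRegistry.name("DataDirectory", dataDirectory, "writeTasks"));
    }
}
{code}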

The data directory with the highest "write value" is chosen for new sstables.

"Write value" is calculated using the formula:
{{freeRatio / weightedRate}} where {{freeRatio = availableBytes / totalBytes}} and {{weightedRate
= writeRate + readRate / 2}}. "divide by 2" has been randomly chosen since not every read
operation hits the disks.
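
To make the formula concrete, here is a minimal sketch of the calculation and the selection step under those definitions (all names are placeholders, not the patch's actual code):

{code:java}
// Sketch only: computes the "write value" per data directory and picks the winner.
public final class WriteValueSketch
{
    // writeValue = freeRatio / weightedRate, with
    // freeRatio = availableBytes / totalBytes and
    // weightedRate = writeRate + readRate / 2.
    static double writeValue(long availableBytes, long totalBytes,
                             double writeRate, double readRate)
    {
        double freeRatio = (double) availableBytes / totalBytes;
        // Reads are weighted at 1/2 because not every read hits the disks.
        double weightedRate = writeRate + readRate / 2;
        // An idle directory (rate 0) should win over any busy one.
        return weightedRate == 0 ? Double.POSITIVE_INFINITY : freeRatio / weightedRate;
    }

    // The data directory with the highest write value gets the new sstable.
    static int pickDirectory(double[] writeValues)
    {
        int best = 0;
        for (int i = 1; i < writeValues.length; i++)
            if (writeValues[i] > writeValues[best])
                best = i;
        return best;
    }
}
{code}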

{{readRate}} is taken from {{SSTableReader.incrementReadCount()}}, but I had to add calls
to {{incrementReadCount()}} in a few places in the code. I did not add it to {{RandomAccessReader}}
or {{SegmentedFile}} because this patch should not influence performance too much.
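
Purely as an illustration of that hook, the read path could feed the per-directory meter from the sketch above roughly like this (the field names are assumptions, not the patch's code):

{code:java}
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

// Illustration only: each counted sstable read also feeds the readTasks
// meter of the data directory that holds the sstable.
class SSTableReadHookSketch
{
    private final Meter sstableReadMeter;   // stands in for the sstable's own read meter
    private final Meter directoryReadTasks; // the per-directory readTasks meter above

    SSTableReadHookSketch(MetricRegistry registry, String dataDirectory)
    {
        sstableReadMeter = registry.meter("sstableReads");
        directoryReadTasks = registry.meter(MetricRegistry.name("DataDirectory", dataDirectory, "readTasks"));
    }

    // Analogue of SSTableReader.incrementReadCount().
    void incrementReadCount()
    {
        sstableReadMeter.mark();
        directoryReadTasks.mark(); // assumed: this is what drives readRate
    }
}
{code}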

I did not experiment much with the formula but created a sheet ({{Mappe1.ods}}) that shows
the "write value" in a matrix of freeRatio vs. weightedRate.

I've run {{cassandra-stress}} against a single-node, single-data-directory C* instance and
saw that the writeValue behaves as expected.
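
Such a run would look something like {{cassandra-stress write n=1000000 -node 127.0.0.1}} (the exact options used are not recorded here), with the new {{writeValue*}} rates watched over JMX.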

But that's only half the battle. The patch has to be verified in a real, production-like cluster:
the "write value" needs to be compared against {{iostat}}, {{df}}, etc. Is there any possibility
to do that?
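
Concretely, that would mean sampling something like {{iostat -x 5}} per device and {{df}} per data-directory mount point while the node takes load, and checking that the reported write values track the actual write throughput and free space shown by those tools.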

> JBOD threshold to prevent unbalanced disk utilization
> -----------------------------------------------------
>                 Key: CASSANDRA-7386
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Lohfink
>            Priority: Minor
>         Attachments: 7386-v1.patch, 7386v2.diff, Mappe1.ods, patch_2_1_branch_proto.diff
> Currently the disks are picked first by number of current tasks, then by free
space.  This helps with performance but can lead to large differences in utilization in some
(unlikely but possible) scenarios.  I've seen 55% vs. 10% and heard reports of 90% vs. 10% on
IRC.  This happens with both LCS and STCS (although my suspicion is that STCS makes it worse, since it is
harder to keep balanced).
> I propose the algorithm change a little to have some maximum range of utilization where
it will pick by free space over load (acknowledging it can be slower).  So if disk A is
30% full and disk B is 5% full, it will never pick A over B until they balance out.
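
For illustration, the proposed threshold rule could be sketched like this; the 20% spread and all names here are made up and are not from the ticket or either patch:

{code:java}
// Sketch of the proposed rule: when utilization differs by more than some
// maximum spread, pick by free space instead of by task count.
public final class JbodThresholdSketch
{
    static final double MAX_UTILIZATION_SPREAD = 0.20; // hypothetical threshold

    static final class Disk
    {
        final String path;
        final double utilization; // fraction of disk used, 0.0 - 1.0
        final int pendingTasks;   // current load

        Disk(String path, double utilization, int pendingTasks)
        {
            this.path = path;
            this.utilization = utilization;
            this.pendingTasks = pendingTasks;
        }
    }

    static Disk pick(Disk a, Disk b)
    {
        if (Math.abs(a.utilization - b.utilization) > MAX_UTILIZATION_SPREAD)
            return a.utilization < b.utilization ? a : b; // pick by free space
        return a.pendingTasks <= b.pendingTasks ? a : b;  // pick by load
    }

    public static void main(String[] args)
    {
        Disk a = new Disk("/data1", 0.30, 1); // 30% full, lightly loaded
        Disk b = new Disk("/data2", 0.05, 5); // 5% full, heavily loaded
        // Spread is 25% > 20%, so the emptier disk wins despite its load.
        System.out.println(pick(a, b).path);  // prints /data2
    }
}
{code}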
