cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-14210) Optimize SSTables upgrade task scheduling
Date Tue, 13 Feb 2018 04:37:02 GMT


Jeff Jirsa commented on CASSANDRA-14210:

[~krummas] - you're probably best equipped to review, any chance you're interested? 

> Optimize SSTables upgrade task scheduling
> -----------------------------------------
>                 Key: CASSANDRA-14210
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Oleksandr Shulgin
>            Assignee: Kurt Greaves
>            Priority: Major
>             Fix For: 4.x
> When starting the SSTable-rewrite process by running {{nodetool upgradesstables --jobs N}} with N > 1, not all of the provided N slots are used.
> For example, we were testing with {{concurrent_compactors=5}} and {{N=4}}. What we observed, for both version 2.2 and 3.0, is that initially all 4 provided slots are used for "Upgrade sstables" compactions, but later, when some of the 4 tasks have finished, no new tasks are scheduled immediately. The last of the 4 tasks has to finish before the next 4 tasks are scheduled. This happens on every node we've observed.
> This doesn't utilize available resources to the full extent allowed by the --jobs N parameter. In the field, on a cluster of 12 nodes with 4-5 TiB of data each, we've seen the whole process take more than 7 days instead of the estimated 1.5-2 days (assuming close to full utilization of the N slots).
> Instead, new tasks should be scheduled as soon as there is a free compaction slot.
> Additionally, starting from the biggest SSTables could further reduce the total time required for the whole process to finish on any given node.
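
The scheduling difference described in the report can be illustrated with a toy model. This is only a sketch and is not Cassandra code: the function names and the task durations are hypothetical, each task stands for one SSTable rewrite, and the "batch" model assumes the behaviour reported above (the next N tasks start only after the slowest task of the current N has finished).

```python
import heapq


def batch_makespan(durations, jobs):
    """Observed behaviour (as reported): run `jobs` tasks at a time and
    wait for the slowest one before starting the next batch."""
    total = 0
    for i in range(0, len(durations), jobs):
        total += max(durations[i:i + jobs])
    return total


def slot_makespan(durations, jobs):
    """Proposed behaviour: start the next task as soon as any slot is free."""
    slots = [0] * jobs  # time at which each slot becomes free
    heapq.heapify(slots)
    for d in durations:
        free_at = heapq.heappop(slots)      # earliest free slot
        heapq.heappush(slots, free_at + d)  # slot is busy until free_at + d
    return max(slots)


# Hypothetical rewrite times in arbitrary units (not measured data).
durations = [1, 1, 1, 8, 1, 1, 1, 8, 2, 2, 2, 2]
jobs = 4

batch = batch_makespan(durations, jobs)
greedy = slot_makespan(durations, jobs)
largest_first = slot_makespan(sorted(durations, reverse=True), jobs)

print(batch, greedy, largest_first)
```

With these made-up durations the batch model needs 18 time units, filling free slots immediately needs 10, and additionally starting from the largest tasks needs 8. The exact numbers depend entirely on the assumed durations; the sketch only shows why a few large SSTables can leave most slots idle under batch scheduling.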
