cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes
Date Thu, 30 Nov 2017 01:05:43 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16267272#comment-16267272
] 

Paulo Motta edited comment on CASSANDRA-13948 at 11/30/17 1:00 AM:
-------------------------------------------------------------------

Thanks for the review!

After rebasing this on top of CASSANDRA-13215 and addressing your latest comments, I noticed
a few things which could be improved and did the following updates:
* Since blacklisting a directory will refresh the disk boundaries, we only need to reload
strategies when the disk boundary changes or the table parameters change. To avoid equals
comparison every time we call {{maybeReload}}, I moved the {{isOutOfDate}} check from the
{{DiskBoundaryManager}} to the {{DiskBoundaries}} object - which is invalidated when there
are any boundary changes. ([commit|https://github.com/pauloricardomg/cassandra/commit/662cd063ca2e1c382ba3cd5dc8032b0d3f12683c])
* I thought that it no longer makes sense to expose the compaction strategy index to outside
the compaction strategy manager since it's possible to get the correct disk placement directly
from {{CFS.getDiskBoundaries}}. This should prevent races when the {{CompactionStrategyManager}}
reloads boundaries between successive calls to {{CSM.getCompactionStrategyIndex}}. [This commit|https://github.com/pauloricardomg/cassandra/commit/abd1340b000d4596d71f00e5de8507de967ee7a5]
updates {{relocatesstables}} and {{scrub}} to use {{CFS.getDiskBoundaries}} instead, and make
{{CSM.getCompactionStrategyIndex}} private.
* I found it a bit hard to reason about when to use {{maybeReload}} to write the documentation
and made its use consistent across {{CompactionStrategyManager}} on [this commit|https://github.com/pauloricardomg/cassandra/commit/c0926e99edb1ffdcda16640eda6faf8e78da9e46])
(as you suggested before) along with the documentation. I kept the previous call to {{maybeReload}}
from {{ColumnFamilyStore.reload}}, but we could probably avoid this and make {{maybeReload}}
private-only as this is being called on pretty much every operation.

It feels like we can simplify this and get rid of these locks altogether (or at least greatly
reduce their scope) by encapsulating the disk boundaries and compaction strategies in an immutable
object accessed with an atomic reference and pessimistically cancel any tasks with an old
placement when the strategies are reloaded. This is a significant refactor of {{CompactionStrategyManager}}
so we should probably do it another ticket.

I submitted internal CI with the [latest branch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948]
and will post the results here when ready. I will create a trunk version after this follow-up
is reviewed.


was (Author: pauloricardomg):
Thanks for the review!

After rebasing this on top of CASSANDRA-13215 and addressing your latest comments, I noticed
a few things which could be improved and did the following updates:
* Since blacklisting a directory will refresh the disk boundaries, we only need to reload
strategies when the disk boundary changes or the table parameters change. To avoid equals
comparison every time we call {{maybeReload}}, I moved the {{isOutOfDate}} check from the
{{DiskBoundaryManager}} to the {{DiskBoundaries}} object - which is invalidated when there
are any boundary changes. ([commit|https://github.com/pauloricardomg/cassandra/commit/662cd063ca2e1c382ba3cd5dc8032b0d3f12683c])
* I thought that it no longer makes sense to expose the compaction strategy index to outside
the compaction strategy manager since it's possible to get the correct disk placement directly
from {{CFS.getDiskBoundaries}}. This should prevent races when the {{CompactionStrategyManager}}
reloads boundaries between successive calls to {{CSM.getCompactionStrategyIndex}}. [This commit|https://github.com/pauloricardomg/cassandra/commit/abd1340b000d4596d71f00e5de8507de967ee7a5]
updates {{relocatesstables}} and {{scrub}} to use {{CFS.getDiskBoundaries}} instead, and make
{{CSM.getCompactionStrategyIndex}} private.
* I found it a bit hard to reason about when to use {{maybeReload}} to write the documentation
and made its use consistent across {{CompactionStrategyManager}} on [this commit|https://github.com/pauloricardomg/cassandra/commit/8518d6c4f001641da36d6fd58474ed3b50476326])
(as you suggested before) along with the documentation. I kept the previous call to {{maybeReload}}
from {{ColumnFamilyStore.reload}}, but we could probably avoid this and make {{maybeReload}}
private-only as this is being called on pretty much every operation.

It feels like we can simplify this and get rid of these locks altogether (or at least greatly
reduce their scope) by encapsulating the disk boundaries and compaction strategies in an immutable
object accessed with an atomic reference and pessimistically cancel any tasks with an old
placement when the strategies are reloaded. This is a significant refactor of {{CompactionStrategyManager}}
so we should probably do it another ticket.

I submitted internal CI with the [latest branch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948]
and will post the results here when ready. I will create a trunk version after this follow-up
is reviewed.

> Reload compaction strategies when JBOD disk boundary changes
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-13948
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>             Fix For: 3.11.x, 4.x
>
>         Attachments: debug.log, dtest13948.png, dtest2.png, threaddump-cleanup.txt, threaddump.txt,
trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the {{IndexSummaryRedistribution}}
and {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled
frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1,
line=836 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199
(Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, line=943
(Compiled frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable,
java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification,
java.lang.Object) @bci=53, line=555 (Interpreted frame)
>  - org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection,
java.util.Collection, org.apache.cassandra.db.compaction.OperationType, java.lang.Throwable)
@bci=50, line=409 (Interpreted frame)
>  - org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable)
@bci=157, line=227 (Interpreted frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable)
@bci=61, line=116 (Compiled frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit()
@bci=2, line=200 (Interpreted frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish()
@bci=5, line=185 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries()
@bci=559, line=130 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
@bci=9, line=1420 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
@bci=4, line=250 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() @bci=30,
line=228 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() @bci=4, line=125
(Interpreted frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Interpreted frame)
>  - org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run()
@bci=4, line=118 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
>  - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
@bci=1, line=180 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() @bci=37,
line=294 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
@bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Interpreted
frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
@bci=1, line=81 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$8.run() @bci=4 (Interpreted
frame)
>  - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
> {noformat}
> {noformat}
> Thread 94573: (state = IN_JAVA)
>  - java.util.HashMap$HashIterator.nextNode() @bci=95, line=1441 (Compiled frame; information
may be imprecise)
>  - java.util.HashMap$KeyIterator.next() @bci=1, line=1461 (Compiled frame)
>  - org.apache.cassandra.db.lifecycle.View$3.apply(org.apache.cassandra.db.lifecycle.View)
@bci=20, line=268 (Compiled frame)
>  - org.apache.cassandra.db.lifecycle.View$3.apply(java.lang.Object) @bci=5, line=265
(Compiled frame)
>  - org.apache.cassandra.db.lifecycle.Tracker.apply(com.google.common.base.Predicate,
com.google.common.base.Function) @bci=13, line=133 (Compiled frame)
>  - org.apache.cassandra.db.lifecycle.Tracker.tryModify(java.lang.Iterable, org.apache.cassandra.db.compaction.OperationType)
@bci=31, line=99 (Compiled frame)
>  - org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(int)
@bci=84, line=139 (Compiled frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(int)
@bci=105, line=119 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run()
@bci=84, line=265 (Interpreted frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
>  - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
@bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Interpreted
frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
@bci=1, line=81 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$8.run() @bci=4 (Interpreted
frame)
>  - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
> {noformat}
> This particular node remain in this state forever, indicating {{LeveledCompactionStrategyTask.getNextBackgroundTask}}
was looping indefinitely.
> What happened is that sstable references were replaced on the tracker by the {{IndexSummaryRedistribution}}
thread, so the {{AbstractCompactionStrategy.getNextBackgroundTask}} could not create the transaction
with the old references, and the {{IndexSummaryRedistribution}} could not update the sstable
reference in the compaction strategy because {{AbstractCompactionStrategy.getNextBackgroundTask}}
was holding the {{CompactionStrategyManager}} lock.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message