cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-8639) Can OOM on CL replay with dense mutations
Date Tue, 01 Dec 2015 18:19:11 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034214#comment-15034214
] 

Ariel Weisberg edited comment on CASSANDRA-8639 at 12/1/15 6:18 PM:
--------------------------------------------------------------------

{quote}
This isn't related to the ticket but maybe we should fix it as well; I don't see anyplace
we wait for the replay futures to complete before we finish recover().
Both 2.1 code and your patch will exit early before the futures have all finished. It looks
like the old version only waited when there were more than max outstanding mutations. Which
is also wrong and racy. We should always wait for the queue to drain completely before the
method exits.
{quote}
Are you talking about {{[CommitLogReplayer.blockForWrites()|https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L98]}}
which is invoked from {{[CommitLog.recover()|https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L145]}}?
I think that is already covered. I can refactor it if you want and have {{CommitLogReplayer.recover()}}
return until everything is done anyways. Seems like a safe change unless it turns out to deadlock
some weird test code somewhere.

bq. I'm not sure why futures was changed to a deque. looks like you only use queue methods,
but maybe I missed it?
It's just a habit for this kind of asynchronous result queue. Muscle memory writes a loop
that prioritizes consuming result futures over submitting new work in order to minimize latency,
working set size, and temporal cache locality. To do that you need to be able to {{poll()}}
a queue and {{List}} doesn't let you do that. 

The other issue with doing 1k and then draining 1k is that there is a period where the number
of tasks in flight is small because the producer is waiting for the stragglers from the last
1k to arrive before issuing new work. This can starve consumers. This version keeps a targeted
number of pending mutations/mutation bytes in flight at any given time.

bq. The only other thing I noticed was in the test you should validate the data test data
is not found after you clear the CF in-case the replay isn't working.
[Does clearing the data before replay|https://github.com/apache/cassandra/compare/cassandra-2.1...aweisberg:CASSANDRA-8639-2.1?expand=1#diff-92ffe896212dc94b91ad86349f0647abR141]
and [then checking for it afterwards accomplish that?|https://github.com/apache/cassandra/compare/cassandra-2.1...aweisberg:CASSANDRA-8639-2.1?expand=1#diff-92ffe896212dc94b91ad86349f0647abR183]
Unless you think that {{clearUnsafe()}} might not work it seems sufficient.

bq. You also have a 2.1 utest failure related to CL not sure if that's related. org.apache.cassandra.cql3.DropKeyspaceCommitLogRecycleTest.testRecycle
[That is a pretty flakey test.|http://cassci.datastax.com/view/cassandra-2.1/job/cassandra-2.1_testall/271/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/history/]

bq. And one dtest failure in 2.1 commitlog_test.TestCommitLog.test_bad_crc
[Looks like another failing/unreliable test.|http://cassci.datastax.com/view/cassandra-2.1/job/cassandra-2.1_dtest/361/#showFailuresLink]




was (Author: aweisberg):
bq. This isn't related to the ticket but maybe we should fix it as well; I don't see anyplace
we wait for the replay futures to complete before we finish recover().
Both 2.1 code and your patch will exit early before the futures have all finished. It looks
like the old version only waited when there were more than max outstanding mutations. Which
is also wrong and racy. We should always wait for the queue to drain completely before the
method exits.
Are you talking about {{[CommitLogReplayer.blockForWrites()|https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L98]}}
which is invoked from {{[CommitLog.recover()|https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L145]}}?
I think that is already covered. I can refactor it if you want and have {{CommitLogReplayer.recover()}}
return until everything is done anyways. Seems like a safe change unless it turns out to deadlock
some weird test code somewhere.

bq. I'm not sure why futures was changed to a deque. looks like you only use queue methods,
but maybe I missed it?
It's just a habit for this kind of asynchronous result queue. Muscle memory writes a loop
that prioritizes consuming result futures over submitting new work in order to minimize latency,
working set size, and temporal cache locality. To do that you need to be able to {{poll()}}
a queue and {{List}} doesn't let you do that. 

The other issue with doing 1k and then draining 1k is that there is a period where the number
of tasks in flight is small because the producer is waiting for the stragglers from the last
1k to arrive before issuing new work. This can starve consumers. This version keeps a targeted
number of pending mutations/mutation bytes in flight at any given time.

bq. The only other thing I noticed was in the test you should validate the data test data
is not found after you clear the CF in-case the replay isn't working.
[Does clearing the data before replay|https://github.com/apache/cassandra/compare/cassandra-2.1...aweisberg:CASSANDRA-8639-2.1?expand=1#diff-92ffe896212dc94b91ad86349f0647abR141]
and [then checking for it afterwards accomplish that?|https://github.com/apache/cassandra/compare/cassandra-2.1...aweisberg:CASSANDRA-8639-2.1?expand=1#diff-92ffe896212dc94b91ad86349f0647abR183]
Unless you think that {{clearUnsafe()}} might not work it seems sufficient.

bq. You also have a 2.1 utest failure related to CL not sure if that's related. org.apache.cassandra.cql3.DropKeyspaceCommitLogRecycleTest.testRecycle
[That is a pretty flakey test.|http://cassci.datastax.com/view/cassandra-2.1/job/cassandra-2.1_testall/271/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/history/]

bq. And one dtest failure in 2.1 commitlog_test.TestCommitLog.test_bad_crc
[Looks like another failing/unreliable test.|http://cassci.datastax.com/view/cassandra-2.1/job/cassandra-2.1_dtest/361/#showFailuresLink]



> Can OOM on CL replay with dense mutations
> -----------------------------------------
>
>                 Key: CASSANDRA-8639
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8639
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: T Jake Luciani
>            Assignee: Ariel Weisberg
>            Priority: Minor
>             Fix For: 2.1.x
>
>
> If you write dense mutations with many clustering keys, the replay of the CL can quickly
overwhelm a node on startup.  This looks to be caused by the fact we only ensure there are
1000 mutations in flight at a time. but those mutations could have thousands of cells in them.
> A better approach would be to limit the CL replay to the amount of memory in flight using
cell.unsharedHeapSize()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message