cassandra-commits mailing list archives

From "ZhaoYang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
Date Mon, 21 Aug 2017 11:55:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134733#comment-16134733 ]

ZhaoYang edited comment on CASSANDRA-13299 at 8/21/17 11:54 AM:
----------------------------------------------------------------

[trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13299-trunk]
[dtest|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13299 ]

Changes:

1. Throttle by the number of base unfiltereds per batch. The default is 100.
2. A pair of open/close range tombstone markers can have any number of unshadowed rows in
between. In the patch, when a batch reaches its limit while a range tombstone marker is still
open, a corresponding close marker is generated for it. This avoids handling range tombstone
markers separately from rows, which would cost one more read-before-write for each pair of
markers. It also helps to reduce the impact of a large range tombstone.
3. The partition deletion is applied only to the first mutation, to avoid reading the entire
partition more than once.
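
The three changes above can be illustrated with a simplified model. This is only a sketch, not
the code in the patch: `Row`, `Marker`, `Batch` and `throttle` are hypothetical names, clusterings
are plain integers, and bound inclusiveness is ignored. Re-opening the range at the start of the
next batch is an assumption of this sketch; the comment above only states that a close marker is
generated.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Union


@dataclass(frozen=True)
class Row:
    clustering: int


@dataclass(frozen=True)
class Marker:
    clustering: int
    is_open: bool        # True = open bound, False = close bound
    deletion_time: int


Unfiltered = Union[Row, Marker]


@dataclass
class Batch:
    partition_deletion: Optional[int]   # set on the first batch only
    unfiltereds: List[Unfiltered]


def throttle(unfiltereds: Iterable[Unfiltered],
             partition_deletion: Optional[int] = None,
             limit: int = 100) -> Iterator[Batch]:
    """Split one partition's unfiltereds into batches of at most `limit` base unfiltereds."""
    batch: List[Unfiltered] = []
    count = 0                            # base unfiltereds consumed for the current batch
    open_marker: Optional[Marker] = None
    first = True
    for u in unfiltereds:
        batch.append(u)
        count += 1
        if isinstance(u, Marker):
            open_marker = u if u.is_open else None
        if count >= limit:
            if open_marker is not None:
                # Close the open range at the batch boundary so the batch is self-contained.
                batch.append(Marker(u.clustering, False, open_marker.deletion_time))
            yield Batch(partition_deletion if first else None, batch)
            first = False
            batch, count = [], 0
            if open_marker is not None:
                # Assumption of this sketch: re-open the same range for the next batch.
                open_marker = Marker(u.clustering, True, open_marker.deletion_time)
                batch.append(open_marker)
    if batch or (first and partition_deletion is not None):
        yield Batch(partition_deletion if first else None, batch)
```

With a limit of 3, a partition containing a row, an open marker, three rows, a close marker and
one more row yields three batches, and only the first one carries the partition deletion.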


Note:
A single partition deletion or range deletion could cause a huge number of view rows to be
removed, so the view mutation may fail to apply due to a WTE or max_mutation_size. That can be
resolved separately in CASSANDRA-12783. Here, I only address the issue of holding the entire
partition in memory when repairing a base table with an MV.


was (Author: jasonstack):
[trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13299-trunk]
[dtest|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13299 ]

Changes:

1. Throttle by the number of base unfiltereds per batch. The default is 100.
2. A pair of open/close range tombstone markers can have any number of unshadowed rows in
between. In the patch, when a batch reaches its limit while a range tombstone marker is still
open, a corresponding close marker is generated for it.



Note:
A single partition deletion or range deletion could cause a huge number of view rows to be
removed, so the view mutation may fail to apply due to a WTE or max_mutation_size. That can be
resolved separately in CASSANDRA-12783. Here, I only address the issue of holding the entire
partition in memory when repairing a base table with an MV.

> Potential OOMs and lock contention in write path streams
> --------------------------------------------------------
>
>                 Key: CASSANDRA-13299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benjamin Roth
>            Assignee: ZhaoYang
>
> I see a potential OOM when a stream (e.g. repair) goes through the write path, as it does with
> MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce row iterators, which in turn
> produce mutations. So every partition creates a single mutation, which in the case of (very)
> big partitions can result in (very) big mutations. Those are created on heap and stay there
> until they have finished processing.
> I don't think it is necessary to create a single mutation for each partition. Why don't
> we implement a PartitionUpdateGeneratorIterator that takes an UnfilteredRowIterator and a max
> size and spits out PartitionUpdates to be used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, max_mutation_size,
> commitlog_segment_size / 2). reasonable_absolute_max_size could be something like 16M.
> A mutation shouldn't be too large, as it also affects MV partition locking. The longer
> an MV partition is locked during a stream, the higher the chances that WTEs occur during streams.
> I could also imagine that a max number of updates per mutation, regardless of size in
> bytes, could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.
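
The size cap and the optional count cap proposed in the description can be sketched as follows.
This is an illustration under stated assumptions, not an implementation of the proposed
PartitionUpdateGeneratorIterator: `max_update_size` and `split_updates` are hypothetical names,
each update is modelled as a (name, size in bytes) pair, and all sizes passed in are example
values.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

MiB = 1024 * 1024


def max_update_size(max_mutation_size: int,
                    commitlog_segment_size: int,
                    reasonable_absolute_max_size: int = 16 * MiB) -> int:
    """min(reasonable_absolute_max_size, max_mutation_size, commitlog_segment_size / 2)."""
    return min(reasonable_absolute_max_size,
               max_mutation_size,
               commitlog_segment_size // 2)


def split_updates(updates: Iterable[Tuple[str, int]],
                  max_size: int,
                  max_count: Optional[int] = None) -> Iterator[List[str]]:
    """Group (name, size) pairs into chunks bounded by total size and, optionally, by count.

    An update larger than max_size still gets a chunk of its own.
    max_count, when given, is assumed to be at least 1.
    """
    chunk: List[str] = []
    chunk_size = 0
    for name, size in updates:
        too_big = bool(chunk) and chunk_size + size > max_size
        too_many = max_count is not None and len(chunk) >= max_count
        if too_big or too_many:
            yield chunk
            chunk, chunk_size = [], 0
        chunk.append(name)
        chunk_size += size
    if chunk:
        yield chunk
```

Each yielded chunk would correspond to one mutation, so a smaller cap means more mutations but a
shorter lock hold time per mutation.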



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

