cassandra-commits mailing list archives

From "ZhaoYang (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
Date Fri, 11 Aug 2017 06:02:00 GMT


ZhaoYang commented on CASSANDRA-13299:

[~brstgt] Hi Benjamin, are you working on this ticket?

I don't think there is a perfect base mutation size, or number of base rows per mutation,
that fits all data models. Your suggested min(16MB, max_mutation_size) should be good enough.

The first target is to reduce memory pressure for huge partitions with MVs during repair.

> Potential OOMs and lock contention in write path streams
> --------------------------------------------------------
>                 Key: CASSANDRA-13299
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benjamin Roth
> I see a potential OOM when a stream (e.g. repair) goes through the write path, as it
> does with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce row iterators, which
> in turn produce mutations. So every partition creates a single mutation, which in the case
> of (very) big partitions can result in (very) big mutations. Those are created on heap and
> stay there until they have finished processing.
> I don't think it is necessary to create a single mutation for each partition. Why don't
> we implement a PartitionUpdateGeneratorIterator that takes an UnfilteredRowIterator and a
> max size and spits out PartitionUpdates to be used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, max_mutation_size,
> commitlog_segment_size / 2). reasonable_absolute_max_size could be something like 16M.
> A mutation shouldn't be too large, as it also affects MV partition locking. The longer
> an MV partition is locked during a stream, the higher the chances that WTEs occur during
> streams.
> I could also imagine that a max number of updates per mutation, regardless of size in
> bytes, could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.
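
To make the proposal above concrete, here is a rough, generic sketch of the chunking idea: an iterator that wraps a row iterator and emits batches bounded by both a byte limit and a row-count limit. This is not Cassandra code and does not use Cassandra's classes. BoundedBatchIterator, maxBatchBytes and the sizeOf function are hypothetical names chosen for illustration; a real PartitionUpdateGeneratorIterator would wrap an UnfilteredRowIterator and build PartitionUpdates instead of lists. Rows are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: splits a stream of (non-null) rows into bounded batches. */
final class BoundedBatchIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> rows;
    private final ToLongFunction<T> sizeOf; // assumed per-row size estimate, in bytes
    private final long maxBytes;
    private final int maxRows;
    private T pending; // row already read from the source but not yet emitted

    BoundedBatchIterator(Iterator<T> rows, ToLongFunction<T> sizeOf, long maxBytes, int maxRows) {
        if (maxBytes <= 0 || maxRows <= 0)
            throw new IllegalArgumentException("limits must be positive");
        this.rows = rows;
        this.sizeOf = sizeOf;
        this.maxBytes = maxBytes;
        this.maxRows = maxRows;
    }

    /** Byte cap as suggested in the ticket: min(absolute max, max mutation size, segment size / 2). */
    static long maxBatchBytes(long absoluteMax, long maxMutationSize, long commitlogSegmentSize) {
        return Math.min(absoluteMax, Math.min(maxMutationSize, commitlogSegmentSize / 2));
    }

    @Override
    public boolean hasNext() {
        return pending != null || rows.hasNext();
    }

    @Override
    public List<T> next() {
        if (!hasNext())
            throw new NoSuchElementException();
        List<T> batch = new ArrayList<>();
        long bytes = 0;
        while (batch.size() < maxRows && (pending != null || rows.hasNext())) {
            T row = pending != null ? pending : rows.next();
            pending = null;
            long size = sizeOf.applyAsLong(row);
            // Close the batch before exceeding the byte cap. A single oversized
            // row is still emitted alone so the iterator always makes progress.
            if (!batch.isEmpty() && bytes + size > maxBytes) {
                pending = row;
                break;
            }
            batch.add(row);
            bytes += size;
        }
        return batch;
    }
}
```

Each emitted batch would then be turned into its own mutation and applied, so only one bounded chunk of a large partition is held on heap (and holds the MV partition lock) at a time. The row-count limit corresponds to the "max number of updates per mutation" idea in the description.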
