cassandra-commits mailing list archives

From "Benjamin Roth (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
Date Sun, 05 Mar 2017 17:48:32 GMT
Benjamin Roth created CASSANDRA-13299:
-----------------------------------------

             Summary: Potential OOMs and lock contention in write path streams
                 Key: CASSANDRA-13299
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Benjamin Roth


I see a potential OOM when a stream (e.g. repair) goes through the write path, as it does
with MVs.

StreamReceiveTask gets a bunch of SSTableReaders. These produce row iterators, which in turn
produce mutations. Every partition therefore yields a single mutation, which for (very) big
partitions can result in (very) big mutations. These are created on heap and stay there until
they are processed.

I don't think it is necessary to create a single mutation for each partition. Why don't we
implement a PartitionUpdateGeneratorIterator that takes an UnfilteredRowIterator and a max
size and spits out PartitionUpdates to be used to create and apply mutations?
The max size should be something like min(reasonable_absolute_max_size, max_mutation_size,
commitlog_segment_size / 2). reasonable_absolute_max_size could be something like 16M.
A mutation shouldn't be too large, as its size also affects MV partition locking. The longer
an MV partition is locked during a stream, the higher the chance that WTEs (write timeout
exceptions) occur during streams.
I could also imagine that a max number of updates per mutation, regardless of size in bytes,
could make sense to avoid lock contention.
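
To make the proposal a bit more concrete, here is a minimal, self-contained sketch of the
batching idea. It does not use any Cassandra classes: BoundedBatchIterator, its sizeOf
function and the maxRows parameter are hypothetical names invented for illustration, and a
plain List stands in for a PartitionUpdate. It only shows the splitting logic (a byte cap
plus a row-count cap) and the proposed max size formula, under the assumption that a single
row larger than the cap still has to be emitted on its own.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.ToLongFunction;

/**
 * Hypothetical sketch, not Cassandra code: splits a stream of rows into
 * batches bounded by a byte size and a row count.
 */
final class BoundedBatchIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> rows;
    private final ToLongFunction<T> sizeOf; // assumed per-row size estimate in bytes
    private final long maxBytes;
    private final int maxRows;
    private T pending;          // row already read but not yet emitted
    private boolean hasPending;

    BoundedBatchIterator(Iterator<T> rows, ToLongFunction<T> sizeOf, long maxBytes, int maxRows) {
        if (maxBytes <= 0 || maxRows <= 0)
            throw new IllegalArgumentException("limits must be positive");
        this.rows = rows;
        this.sizeOf = sizeOf;
        this.maxBytes = maxBytes;
        this.maxRows = maxRows;
    }

    /** min(reasonable_absolute_max_size, max_mutation_size, commitlog_segment_size / 2). */
    static long maxBatchBytes(long reasonableAbsoluteMax, long maxMutationSize, long commitlogSegmentSize) {
        return Math.min(reasonableAbsoluteMax, Math.min(maxMutationSize, commitlogSegmentSize / 2));
    }

    @Override
    public boolean hasNext() {
        return hasPending || rows.hasNext();
    }

    @Override
    public List<T> next() {
        if (!hasNext())
            throw new NoSuchElementException();
        List<T> batch = new ArrayList<>();
        long bytes = 0;
        while (batch.size() < maxRows) {
            if (!hasPending) {
                if (!rows.hasNext())
                    break;
                pending = rows.next();
                hasPending = true;
            }
            long size = sizeOf.applyAsLong(pending);
            // Always accept the first row so that an oversized row still makes progress.
            if (!batch.isEmpty() && bytes + size > maxBytes)
                break;
            batch.add(pending);
            bytes += size;
            pending = null;
            hasPending = false;
        }
        return batch;
    }
}
```

Each returned batch would then be turned into one mutation and applied before the next batch
is built, so only one bounded batch is held on heap at a time and an MV partition lock is
held only for the duration of one batch.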

Love to get feedback and suggestions, incl. naming suggestions.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
