cassandra-commits mailing list archives

From "Benjamin Roth (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-13299) Potential OOMs and lock contention in write path streams
Date Sun, 05 Mar 2017 17:48:32 GMT
Benjamin Roth created CASSANDRA-13299:

             Summary: Potential OOMs and lock contention in write path streams
                 Key: CASSANDRA-13299
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Benjamin Roth

I see a potential OOM, when a stream (e.g. repair) goes through the write path as it is with

StreamReceiveTask gets a bunch of SSTableReaders. These produce row iterators, which in turn
produce mutations. So every partition creates a single mutation, which in the case of (very) big
partitions can result in (very) big mutations. Those are created on heap and stay there until
they are processed.

I don't think it is necessary to create a single mutation for each partition. Why don't we
implement a PartitionUpdateGeneratorIterator that takes an UnfilteredRowIterator and a max
size and spits out PartitionUpdates to be used to create and apply mutations?
The max size should be something like min(reasonable_absolute_max_size, max_mutation_size,
commitlog_segment_size / 2). reasonable_absolute_max_size could be around 16M.
A mutation shouldn't be too large, as its size also affects MV partition locking. The longer an MV
partition is locked during a stream, the higher the chance that WTEs (write timeout exceptions)
occur during streams.
I could also imagine that a max number of updates per mutation, regardless of size in bytes,
could make sense to avoid lock contention.
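
To make the proposal concrete, here is a minimal sketch of the splitting idea in plain Java. It is not Cassandra code: BoundedBatchIterator, maxBatchBytes and the sizeOf function are hypothetical names, and a generic Iterator<T> stands in for UnfilteredRowIterator, with List<T> standing in for a PartitionUpdate. It only illustrates cutting one long row stream into batches bounded by both bytes and row count.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.ToLongFunction;

/**
 * Hypothetical sketch: wraps a row iterator and emits batches that stay
 * within a byte budget and a row-count budget. Assumes rows are non-null.
 */
final class BoundedBatchIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> rows;
    private final ToLongFunction<T> sizeOf; // assumed size estimator, in bytes
    private final long maxBytes;
    private final int maxRows;
    private T pending; // row already read that did not fit into the previous batch

    BoundedBatchIterator(Iterator<T> rows, ToLongFunction<T> sizeOf, long maxBytes, int maxRows) {
        if (maxBytes <= 0 || maxRows <= 0)
            throw new IllegalArgumentException("limits must be positive");
        this.rows = rows;
        this.sizeOf = sizeOf;
        this.maxBytes = maxBytes;
        this.maxRows = maxRows;
    }

    /** min(reasonable_absolute_max_size, max_mutation_size, commitlog_segment_size / 2) */
    static long maxBatchBytes(long reasonableAbsoluteMax, long maxMutationSize, long commitlogSegmentSize) {
        return Math.min(reasonableAbsoluteMax, Math.min(maxMutationSize, commitlogSegmentSize / 2));
    }

    @Override
    public boolean hasNext() {
        return pending != null || rows.hasNext();
    }

    @Override
    public List<T> next() {
        if (!hasNext())
            throw new NoSuchElementException();
        List<T> batch = new ArrayList<>();
        long bytes = 0;
        while (batch.size() < maxRows) {
            T row;
            if (pending != null) {
                row = pending;
                pending = null;
            } else if (rows.hasNext()) {
                row = rows.next();
            } else {
                break;
            }
            long size = sizeOf.applyAsLong(row);
            // Close the batch before it overflows; a single oversized row
            // is still emitted alone so the iterator always makes progress.
            if (!batch.isEmpty() && bytes + size > maxBytes) {
                pending = row;
                break;
            }
            batch.add(row);
            bytes += size;
        }
        return batch;
    }
}
```

In this sketch only one batch is materialized on heap at a time, and the row-count limit caps how much work is done per batch, which is the property that would matter for MV partition lock hold times.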

I'd love to get feedback and suggestions, including naming suggestions.

This message was sent by Atlassian JIRA
