cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12268) Make MV Index creation robust for wide referent rows
Date Wed, 10 Aug 2016 18:52:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415777#comment-15415777 ]

T Jake Luciani commented on CASSANDRA-12268:
--------------------------------------------

I think you need to restart those tests, since they didn't run for 3.0, but the trunk version
has [View test failures|https://cassci.datastax.com/view/Dev/view/carlyeks/job/carlyeks-ticket-12268-testall/lastCompletedBuild/testReport/].

> Make MV Index creation robust for wide referent rows
> ----------------------------------------------------
>
>                 Key: CASSANDRA-12268
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12268
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Shook
>            Assignee: Carl Yeksigian
>             Fix For: 3.0.x, 3.x
>
>         Attachments: 12268.py
>
>
> When creating an index for a materialized view for extant data, heap pressure is very
> dependent on the cardinality of rows associated with each index value. With the way that
> per-index value rows are created within the index, this can cause unbounded heap pressure,
> which can cause OOM. This appears to be a side-effect of how each index row is applied atomically
> as with batches.
> The commit logs can accumulate enough during the process to prevent the node from being
> restarted. Given that this occurs during global index creation, this can happen on multiple
> nodes, making stable recovery of a node set difficult, as co-replicas become unavailable to
> assist in back-filling data from commitlogs.
> While it is understandable that you want to avoid having relatively wide rows even in
> materialized views, this represents a particularly difficult scenario for triage.
> The basic recommendation for improving this is to sub-group the index creation into smaller
> chunks internally, providing a maximal bound against the heap pressure when it is needed.
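
The sub-grouping idea in the description can be illustrated with a minimal sketch. This is not Cassandra's view-build code; `chunked`, `build_view_in_chunks`, `apply_batch`, and `max_rows` are hypothetical names chosen for illustration. The point is only that applying a wide partition in fixed-size batches keeps at most `max_rows` rows in memory at once, at the cost of no longer applying the whole index row as one atomic unit.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], max_rows: int) -> Iterator[List[T]]:
    """Yield successive lists of at most max_rows items from rows."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    it = iter(rows)
    while True:
        # Only this one batch is materialized at a time.
        batch = list(islice(it, max_rows))
        if not batch:
            return
        yield batch


def build_view_in_chunks(
    rows: Iterable[T],
    apply_batch: Callable[[List[T]], None],  # hypothetical per-batch writer
    max_rows: int = 1000,
) -> int:
    """Apply rows in bounded batches and return the number of batches applied."""
    batches = 0
    for batch in chunked(rows, max_rows):
        apply_batch(batch)
        batches += 1
    return batches
```

With ten source rows and `max_rows=4`, `apply_batch` is called three times, with batches of 4, 4, and 2 rows, so peak memory is bounded by the batch size rather than by the width of the source row.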



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)