cassandra-commits mailing list archives

From "Wei Deng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13162) Batchlog replay is throttled during bootstrap, creating conditions for incorrect query results on materialized views
Date Fri, 27 Jan 2017 18:55:24 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Deng updated CASSANDRA-13162:
---------------------------------
    Priority: Critical  (was: Major)

> Batchlog replay is throttled during bootstrap, creating conditions for incorrect query results on materialized views
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13162
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13162
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Wei Deng
>            Priority: Critical
>
> I've tested this in a C* 3.0 cluster with a couple of Materialized Views defined (one base table and two MVs on that base table). The data volume is not very high per node (about 80GB of data per node total, and that particular base table has about 25GB of data uncompressed with one MV taking 18GB compressed and the other MV taking 3GB), and the cluster is using decent hardware (EC2 C4.8XL with 18 cores + 60GB RAM + 18K IOPS RAID0 from two 3TB gp2 EBS volumes).
> This was originally a 9-node cluster. It appears that after adding 3 more nodes to the DC, the system.batches table accumulated a lot of data on the 3 new nodes, and over the subsequent week the batchlog on the 3 new nodes was slowly replayed back to the rest of the nodes in the cluster. The bottleneck seems to be the throttling defined in this cassandra.yaml setting: batchlog_replay_throttle_in_kb, which by default is set to 1024 (i.e. 1MB/s).
> Given that it is taking almost a week (and still hasn't finished) for the batchlog (from MV) to be replayed after the bootstrap finishes, it seems only reasonable to unthrottle replay (or at least give it a much higher throttle rate) during the initial bootstrap, and hence I'd consider this a bug in our current MV implementation.
> Also, as far as I understand, the bootstrap logic won't wait for the backlogged batchlog to be fully replayed before changing the new bootstrapping node to "UN" state, and if the batchlog for the MVs is stuck in this state for a long time, we will basically get wrong answers on the MVs for that whole duration (until the batchlog is fully replayed to the cluster), which makes this bug even more critical.
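
For a rough sense of scale, the following sketch estimates how long a throttled replay takes for a given backlog. It is only back-of-the-envelope arithmetic, not Cassandra code. The 50 GiB backlog is a hypothetical figure (the report does not state the size of system.batches), the function name replay_days is made up for illustration, and the even split of the configured total across all nodes reflects the comment in cassandra.yaml as I understand it, so treat it as an assumption.

```python
SECONDS_PER_DAY = 86_400


def replay_days(backlog_bytes: int,
                throttle_kb_per_s: int = 1024,
                node_count: int = 1) -> float:
    """Estimate the days needed to replay `backlog_bytes` under a rate limit.

    Assumption: the configured throttle is a cluster-wide total that is
    split evenly across `node_count` nodes.
    """
    if throttle_kb_per_s <= 0 or node_count <= 0:
        raise ValueError("throttle and node count must be positive")
    bytes_per_second = throttle_kb_per_s * 1024 / node_count
    return backlog_bytes / bytes_per_second / SECONDS_PER_DAY


# Hypothetical 50 GiB backlog on one new node of the 12-node cluster.
backlog = 50 * 1024**3
print(f"{replay_days(backlog, node_count=12):.1f} days at the default throttle")
print(f"{replay_days(backlog, throttle_kb_per_s=16384, node_count=12):.1f} days at 16 MB/s")
```

Under these assumptions, a backlog in the tens of gigabytes is enough to account for a replay lasting about a week at the default setting, and replay time falls in proportion as the throttle is raised.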



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
