cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12905) Retry acquire MV lock on failure instead of throwing WTE on streaming
Date Wed, 14 Dec 2016 01:38:59 GMT


Paulo Motta commented on CASSANDRA-12905:

Updated the 3.0 and 3.X patches with the following changes:
* Renamed and inverted the {{dontTimeout}} flag of {{Keyspace.apply}} to {{isDroppable}}.
* Made hint delivery async (deferred) so that it does not block the mutation stage on failure
to acquire the MV lock.
* Renamed all {{Keyspace.apply}} methods that return a {{CompletableFuture}} to {{applyFuture}},
which is consistent with {{Mutation.applyFuture}} and makes it clearer that the call does not
block. Also renamed the former {{Keyspace.applyBlocking}} to {{Keyspace.apply}}, restoring the
nomenclature from before CASSANDRA-10779, when the method did not return a {{CompletableFuture}};
this avoids confusing users, given that there is now an {{applyFuture}} method.
* On the 3.X patch, write mutations to the commit log when CDC is enabled.
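
To illustrate the naming split and the retry behaviour described in the list above, here is a minimal, self-contained sketch. It is not the code from the patch: {{KeyspaceSketch}}, {{viewLock}}, {{retryExecutor}} and the 10 ms retry delay are hypothetical names and values chosen for illustration, a {{Semaphore}} stands in for the MV lock, and {{java.util.concurrent.TimeoutException}} stands in for the WTE. It also assumes that {{isDroppable}} means "this write may time out", while non-droppable writes keep retrying until the lock is acquired.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: KeyspaceSketch and its members are illustrative
// names only and do not mirror Cassandra's actual Keyspace class.
class KeyspaceSketch {
    // Stand-in for the MV lock; a Semaphore so that any thread may release it.
    final Semaphore viewLock = new Semaphore(1);
    private final long timeoutNanos;
    private final ScheduledExecutorService retryExecutor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "mv-lock-retry");
            t.setDaemon(true); // do not keep the JVM alive for retries
            return t;
        });

    KeyspaceSketch(long timeoutMillis) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    // Future-returning variant. The first attempt runs on the caller's thread;
    // if the lock is contended, the retry is deferred to retryExecutor instead
    // of blocking the caller.
    CompletableFuture<Void> applyFuture(Runnable mutation, boolean isDroppable) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        attempt(mutation, isDroppable, System.nanoTime(), future);
        return future;
    }

    // Blocking variant: simply waits for the future to complete.
    void apply(Runnable mutation, boolean isDroppable) {
        applyFuture(mutation, isDroppable).join();
    }

    private void attempt(Runnable mutation, boolean isDroppable,
                         long startNanos, CompletableFuture<Void> future) {
        if (viewLock.tryAcquire()) {
            Throwable failure = null;
            try {
                mutation.run();
            } catch (RuntimeException e) {
                failure = e;
            } finally {
                viewLock.release();
            }
            // Complete the future only after the lock has been released.
            if (failure == null) future.complete(null);
            else future.completeExceptionally(failure);
            return;
        }
        // Assumption: only droppable writes are allowed to time out.
        if (isDroppable && System.nanoTime() - startNanos > timeoutNanos) {
            future.completeExceptionally(
                new TimeoutException("view lock not acquired in time"));
            return;
        }
        // Lock is contended: schedule another attempt rather than spinning.
        retryExecutor.schedule(
            () -> attempt(mutation, isDroppable, startNanos, future),
            10, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        KeyspaceSketch ks = new KeyspaceSketch(50);
        AtomicInteger applied = new AtomicInteger();
        ks.viewLock.acquireUninterruptibly(); // simulate a contended lock
        CompletableFuture<Void> f = ks.applyFuture(applied::incrementAndGet, false);
        System.out.println("done while contended: " + f.isDone()); // false
        ks.viewLock.release();
        f.join(); // the deferred retry now acquires the lock
        System.out.println("applied: " + applied.get()); // 1
    }
}
```

In this sketch a caller that wants blocking semantics uses {{apply}}, while a caller that must not block (hint delivery, for example) uses {{applyFuture}} and reacts to the returned future.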

[~brstgt] can you please double check and validate these changes?

Updated patch and resubmitted CI results below:

> Retry acquire MV lock on failure instead of throwing WTE on streaming
> ---------------------------------------------------------------------
>                 Key: CASSANDRA-12905
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: centos 6.7 x86_64
>            Reporter: Nir Zilka
>            Assignee: Benjamin Roth
>            Priority: Critical
>             Fix For: 3.10
> Hello,
> I performed two upgrades to the current cluster (currently 15 nodes, 1 DC, private VLAN),
> first it was and repair worked flawlessly,
> the second upgrade was to 3.0.9 (with upgradesstables) and repair also worked well,
> then I upgraded to 3.9 two weeks ago, and the repair problems started.
> There are several types of errors in system.log (on different nodes):
> - Sync failed between / and /
> - Streaming error occurred on session with peer Operation timed out - received only 0 responses
> - Remote peer failed stream session
> - Session completed with the following error
> org.apache.cassandra.streaming.StreamException: Stream failed
> ----
> I use the 3.9 default configuration with cluster-specific adjustments (3 seeds, GossipingPropertyFileSnitch).
> streaming_socket_timeout_in_ms is at the default (86400000).
> I'm worried about consistency problems while I'm not performing repairs.
> Any ideas?
> Thanks,
> Nir.

This message was sent by Atlassian JIRA
