cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7542) Reduce CAS contention
Date Thu, 24 Jul 2014 18:04:38 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073445#comment-14073445 ]

Benedict commented on CASSANDRA-7542:
-------------------------------------

I decided to have a quick crack at this to see if I could get an initial patch done for you
to take for a spin. It's available [here|https://github.com/belliottsmith/cassandra/tree/7542-cascontend].

On my local machine it speeds up the paxos dtest by about 20%, which probably understates
the improvement, since the patch needs a little while to get an idea of how long a paxos
round takes. There are some improvements that should probably be made before rolling it out
generally: tracking latency for each token range independently, and tracking the level of
recent contention so that we can back off exponentially if we're too aggressive (this might
permit us to be a little _more_ aggressive in the typical case).
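
To make the back-off idea concrete, here is a rough sketch. It is not the code in the branch; the class name, the smoothing weight and the 100ms cap are all assumptions picked for illustration. It keeps a moving average of how long a paxos round takes and doubles the upper bound of a randomised sleep for each consecutive contended attempt.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration only; not taken from the patch. */
public class ContentionBackoff {
    private static final double ALPHA = 0.2;               // weight of a new latency sample (assumed)
    private static final long MAX_SLEEP_MICROS = 100_000L; // hard cap of 100ms (assumed)

    private double avgRoundMicros; // running estimate of one paxos round
    private boolean seeded;

    /** Feed in the measured duration of a completed paxos round. */
    public synchronized void recordRound(long micros) {
        if (!seeded) {
            avgRoundMicros = micros;
            seeded = true;
        } else {
            avgRoundMicros = ALPHA * micros + (1 - ALPHA) * avgRoundMicros;
        }
    }

    /** Current estimate of a round, used as the base sleep interval. */
    public synchronized long baseSleepMicros() {
        return Math.round(avgRoundMicros);
    }

    /** Upper bound of the sleep for the given consecutive contended attempt (0-based). */
    public long maxSleepMicros(int attempt) {
        long base = Math.min(MAX_SLEEP_MICROS, Math.max(1L, baseSleepMicros()));
        int shift = Math.min(Math.max(attempt, 0), 20); // bound the shift so it cannot overflow
        return Math.min(MAX_SLEEP_MICROS, base << shift);
    }

    /** Randomised sleep in [0, maxSleepMicros(attempt)] so contenders do not wake in lockstep. */
    public long nextSleepMicros(int attempt) {
        return ThreadLocalRandom.current().nextLong(maxSleepMicros(attempt) + 1);
    }
}
```

A caller would reset the attempt counter after a successful round, so the typical uncontended case sleeps for roughly one round at most, and only repeated collisions push the interval towards the cap.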

On my box, it's worth noting this patch brings the average wait time due to contention down
to around 15ms, instead of 50ms. Since that is a much larger drop than the 20% speedup, there
is probably more tuning to be done elsewhere to improve throughput.

This only explores two of the potential ideas: reducing intra-node competition, and reducing
sleep interval. 

> Reduce CAS contention
> ---------------------
>
>                 Key: CASSANDRA-7542
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7542
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Benedict
>             Fix For: 2.0.10
>
>
> CAS updates on the same CQL partition can lead to heavy contention inside C*. I am looking
> for simple ways (no algorithmic changes) to reduce contention, as its penalty is high
> in terms of latency, especially for reads.
> We can put some sort of synchronization on the CQL partition at the StorageProxy level. This
> will reduce contention at least for all requests landing on one box for the same partition.
> Here is an example of why it will help:
> 1) Say 1 write and 2 read CAS requests for the same partition key are sent to C* in parallel.
> 2) Since the client is token-aware, it sends these 3 requests to the same C* instance A. (Let's
> assume that all 3 requests go to the same instance A.)
> 3) In this C* instance A, all 3 CAS requests will contend with each other in Paxos. (This
> is bad.)
> To reduce the contention in 3), I am proposing to add a lock on the partition key, similar
> to what we do in PaxosState.java, to serialize these 3 requests. This will remove the contention
> and improve performance, as these 3 requests will not collide with each other.
> Another improvement we can make in the client is to pick a deterministic live replica for a
> given partition when doing CAS.
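
The per-partition lock proposed in the description could look roughly like the sketch below. This is an illustration under assumptions, not the actual StorageProxy or PaxosState code: the class and method names are hypothetical, and a fixed array of JDK locks stands in for whatever striping the real code uses. Requests for the same partition key always map to the same lock, so they run one at a time on this node instead of colliding in Paxos.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical striped lock keyed by partition; for illustration only. */
public class PartitionLocks {
    private final Lock[] stripes;

    public PartitionLocks(int stripeCount) {
        stripes = new Lock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** The same key always yields the same lock; different keys may share a stripe. */
    public Lock lockFor(Object partitionKey) {
        int h = partitionKey.hashCode();
        h ^= (h >>> 16); // mix high bits into low bits before taking the modulus
        return stripes[Math.floorMod(h, stripes.length)];
    }

    /** Runs the CAS operation while holding the lock for its partition key. */
    public <T> T withLock(Object partitionKey, Supplier<T> casOperation) {
        Lock lock = lockFor(partitionKey);
        lock.lock();
        try {
            return casOperation.get();
        } finally {
            lock.unlock();
        }
    }
}
```

Striping keeps memory bounded at the cost of occasionally serializing unrelated partitions that hash to the same stripe. It only helps requests that land on the same node, which is why the deterministic replica choice in the client matters as well.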



--
This message was sent by Atlassian JIRA
(v6.2#6252)
