cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Christian Esken (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread
Date Tue, 14 Mar 2017 13:42:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924135#comment-15924135
] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:42 PM:
----------------------------------------------------------------------

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its not null
    OK.  I thought it might be possible to set this to null, but even JConsole refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value so operators
have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been there for quite
some time hard coded, and in the long term I do not think it should be kept as-is. For example
it would better to purge on the number of actually DROPPABLE messages in the queue (or their
weight if you want to extend even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You can piggyback
on backlogExpirationActive to get the desired effects from the Java memory model. [...] I
wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like it as visibility
effects would not be obvious, unless explicitly documented. You are probably aware what Brian
Goetz says about piggybacking in his JCIP book. BTW: A more obvious usage is for me in status
fields, e.g. to make the results of a Future visible. I won't change it either, so marking
this as done.
- (x)    Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)    Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block which would
disable expiration infinitely. I was quite tired yesterday. I am swapping it back, TimeUnit
conversions never throw Exceptions, so it is safe. :-|
- (?)  This is not quite correct you can't count drainCount as dropped because some of the
drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in this patch,
as it will likely create even more conflicts. OTOH I have to touch some related methods anyhow.
I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its not null
    OK.  I thought it might be possible to set this to null, but even JConsole refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value so operators
have options without recompiling.
 I would like BACKLOG_PURGE_SIZE hard coded. It has been there for quite some time hard coded,
and in the long term I do not think it should be kept. Fox example it would better to purge
on the number of actually DROPPABLE messages in the queue (or their weight if you want to
extend even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You can piggyback
on backlogExpirationActive to get the desired effects from the Java memory model. [...] I
wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like it as visibility
effects would not be obvious, unless explicitly documented. You are probably aware what Brian
Goetz says about piggybacking in his JCIP book. BTW: A more obvious usage is for me in status
fields, e.g. to make the results of a Future visible. I won't change it either, so marking
this as done.
- (x)    Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)    Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block which would
disable expiration infinitely. I was quite tired yesterday. I am swapping it back, TimeUnit
conversions never throw Exceptions, so it is safe. :-|
- (?)  This is not quite correct you can't count drainCount as dropped because some of the
drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in this patch,
as it will likely create even more conflicts. OTOH I have to touch some related methods anyhow.
I will think about it.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-13265
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 1.8.0_112-b15)
> Linux 3.16
>            Reporter: Christian Esken
>            Assignee: Christian Esken
>             Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>         Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to communicate to
the other nodes. This can happen at any time, during peak load or low load. Restarting that
single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the situation and am
already developing a possible fix. Here is the analysis so far:
> - A Threaddump in this situation showed  324 Threads in the OutboundTcpConnection class
that want to lock the backlog queue for doing expiration.
> - A class histogram shows 262508 instances of OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain amount
of queued messages, it starts thrashing itself to death. Each of the Thread fully locks the
Queue for reading and writing by calling iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually writing
to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and fully lock
the Queue
> This means: Writing blocks the Queue for reading, and readers might even be starved which
makes the situation even worse.
> -----
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (100000 INSERT statements per second and more during peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message