cassandra-user mailing list archives

From Jens Rantil <jens.ran...@tink.se>
Subject Re: Ring connection timeouts with 2.2.6
Date Thu, 30 Jun 2016 07:20:20 GMT
Hi,

Could it be garbage collection occurring on nodes that are more heavily
loaded?

Cheers,
Jens

On Sun, 26 Jun 2016 at 05:22, Mike Heffner <mike@librato.com> wrote:

> One thing to add: if we do a rolling restart of the ring, the timeouts
> disappear entirely for several hours and performance returns to normal.
> It's as if something is leaking over time, but we haven't seen any
> noticeable change in heap.
>
> On Thu, Jun 23, 2016 at 10:38 AM, Mike Heffner <mike@librato.com> wrote:
>
>> Hi,
>>
>> We have a 12-node 2.2.6 ring running in AWS, single DC with RF=3, that is
>> sitting at <25% CPU, doing mostly writes, and not showing any particularly
>> long GC times/pauses. By all observed metrics the ring is healthy and
>> performing well.
>>
>> However, we are noticing a pretty consistent number of connection
>> timeouts coming from the messaging service between various pairs of nodes
>> in the ring. The "Connection.TotalTimeouts" meter metric shows 100k's of
>> timeouts per minute, usually between two pairs of nodes. This persists for
>> several hours at a time, then may stop or move to other pairs of nodes in
>> the ring. The metric
>> "Connection.SmallMessageDroppedTasks.<ip>" will also grow for one pair of
>> the nodes in the TotalTimeouts metric.
>>
>> Looking at the debug log typically shows a large number of messages like
>> the following on one of the nodes:
>>
>> StorageProxy.java:1033 - Skipped writing hint for /172.26.33.177 (ttl 0)
>>
>> We have cross-node timeouts enabled, but NTP is running on all nodes and
>> no node appears to have time drift.
>>
>> The network appears to be fine between nodes, with iperf tests showing
>> that we have a lot of headroom.
>>
>> Any thoughts on what to look for? Can we increase thread count/pool sizes
>> for the messaging service?
>>
>> Thanks,
>>
>> Mike
>>
>> --
>>
>>   Mike Heffner <mike@librato.com>
>>   Librato, Inc.
>>
>>
>
>
> --
>
>   Mike Heffner <mike@librato.com>
>   Librato, Inc.
>
> --

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.
