cassandra-user mailing list archives

From Joel Samuelsson <samuelsson.j...@gmail.com>
Subject Re: Nodes go down periodically
Date Tue, 23 Feb 2016 16:01:52 GMT
Hi,

Version is 2.0.17.
Yes, these are VMs in the cloud, though I'm fairly certain they are on a LAN
rather than a WAN. Both are physically in the same data centre. The
phi_convict_threshold is set to the default. I'd rather find the root cause of
the problem than just hide it by not convicting a node when it isn't
responding, though. If pings are <2 ms without a single ping missed in
several days, I highly doubt that the network is the reason for the downtime.
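
To put the threshold in perspective, here is a rough sketch of how much
heartbeat silence it takes before a node gets convicted. It assumes the
detector scores phi as (time since last heartbeat / mean heartbeat interval)
scaled by 1/ln(10), and that heartbeats arrive about once per second. Both
are my assumptions about the failure detector, not something taken from the
logs, and the function names are made up for illustration.

```python
import math

# Assumption: the detector scales elapsed/mean by 1/ln(10) before comparing
# the result against phi_convict_threshold.
PHI_FACTOR = 1.0 / math.log(10.0)


def phi(silence_ms, mean_interval_ms):
    """Approximate suspicion level after `silence_ms` without a heartbeat."""
    return PHI_FACTOR * silence_ms / mean_interval_ms


def silence_needed_ms(threshold, mean_interval_ms):
    """Heartbeat silence (ms) at which phi reaches `threshold`."""
    return threshold * mean_interval_ms / PHI_FACTOR


if __name__ == "__main__":
    # 8 is the default threshold; 12 is only an example of a raised value.
    # 1000.0 ms is an assumed mean heartbeat interval.
    for threshold in (8, 12):
        secs = silence_needed_ms(threshold, 1000.0) / 1000.0
        print(f"phi_convict_threshold={threshold}: ~{secs:.1f} s of silence")
```

If those assumptions hold, the default threshold already tolerates many
seconds of silence, so a conviction would point at a stall far longer than
the GC pauses seen so far.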

Best regards,
Joel

2016-02-23 16:39 GMT+01:00 <SEAN_R_DURITY@homedepot.com>:

> You didn’t mention version, but I saw this kind of thing very often in the
> 1.1 line. Often this is connected to network flakiness. Are these VMs? In
> the cloud? Connected over a WAN? You mention that ping seems fine. Take a
> look at the phi_convict_threshold in cassandra.yaml. You may need to
> increase it to reduce the UP/DOWN flapping behavior.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Joel Samuelsson [mailto:samuelsson.joel@gmail.com]
> *Sent:* Tuesday, February 23, 2016 9:41 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Nodes go down periodically
>
>
>
> Hi,
>
>
>
> Thanks for your reply.
>
>
>
> I have debug logging on and see no GC pauses that are that long. GC pauses
> are all well below 1s and 99 times out of 100 below 100ms.
>
> Do I need to enable GC log options to see the pauses?
>
> I see plenty of these lines:
> DEBUG [ScheduledTasks:1] 2016-02-22 10:43:02,891 GCInspector.java (line
> 118) GC for ParNew: 24 ms for 1 collections
>
> as well as a few CMS GC log lines.
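
One way to check whether any GC pause coincides with a DOWN/UP window is to
pull both kinds of lines out of system.log and compare timestamps. The sketch
below is only an illustration: it assumes each log entry sits on a single line
in the format quoted in this thread, that the GC lines come from the node that
was reported down, and that the clocks on both nodes are in sync. The function
names and the 5-second slack are hypothetical.

```python
import re
from datetime import datetime, timedelta

TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
GOSSIP_RE = re.compile(
    TS + r" Gossiper\.java \(line \d+\)\s+InetAddress (\S+) is now (UP|DOWN)")
GC_RE = re.compile(
    TS + r" GCInspector\.java \(line \d+\)\s+GC for (\w+): (\d+) ms")


def parse_ts(text):
    """Parse a timestamp such as '2016-02-22 10:05:14,896'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def down_windows(lines):
    """Return (address, down_time, up_time) for each DOWN/UP pair."""
    down_since, windows = {}, []
    for line in lines:
        m = GOSSIP_RE.search(line)
        if not m:
            continue
        ts, addr, state = parse_ts(m.group(1)), m.group(2), m.group(3)
        if state == "DOWN":
            down_since.setdefault(addr, ts)  # keep the first DOWN time
        elif addr in down_since:
            windows.append((addr, down_since.pop(addr), ts))
    return windows


def gc_pauses(lines):
    """Return (time, collector, pause_ms) for each GCInspector line."""
    return [(parse_ts(m.group(1)), m.group(2), int(m.group(3)))
            for m in map(GC_RE.search, lines) if m]


def pauses_near(windows, pauses, slack_s=5):
    """GC pauses logged within `slack_s` seconds of any DOWN window."""
    slack = timedelta(seconds=slack_s)
    return [(w, p) for w in windows for p in pauses
            if w[1] - slack <= p[0] <= w[2] + slack]


if __name__ == "__main__":
    sample = [
        "INFO [GossipTasks:1] 2016-02-22 10:05:14,896 Gossiper.java "
        "(line 992) InetAddress /109.74.13.67 is now DOWN",
        "INFO [RequestResponseStage:8844] 2016-02-22 10:05:38,331 "
        "Gossiper.java (line 978) InetAddress /109.74.13.67 is now UP",
        "DEBUG [ScheduledTasks:1] 2016-02-22 10:43:02,891 GCInspector.java "
        "(line 118) GC for ParNew: 24 ms for 1 collections",
    ]
    windows = down_windows(sample)
    for addr, down, up in windows:
        print(addr, "down for", (up - down).total_seconds(), "s")
    print("GC pauses near a window:", pauses_near(windows, gc_pauses(sample)))
```

An empty result for the real logs would support the view that GC alone does
not explain the downtime.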
>
>
>
> Best regards,
>
> Joel
>
>
>
> 2016-02-23 15:14 GMT+01:00 Hannu Kröger <hkroger@gmail.com>:
>
> Hi,
>
>
>
> Those are probably GC pauses. Memory tuning is probably needed. Check
> whether the parameters that you have already customised make sense.
>
>
>
> http://blog.mikiobraun.de/2010/08/cassandra-gc-tuning.html
>
>
>
> Hannu
>
>
>
>
>
> On 23 Feb 2016, at 16:08, Joel Samuelsson <samuelsson.joel@gmail.com>
> wrote:
>
>
>
> Our nodes go down periodically, around 1-2 times each day. Downtime is
> from <1 second to 30 or so seconds.
>
>
>
> INFO [GossipTasks:1] 2016-02-22 10:05:14,896 Gossiper.java (line 992)
> InetAddress /109.74.13.67 is now DOWN
>
>  INFO [RequestResponseStage:8844] 2016-02-22 10:05:38,331 Gossiper.java
> (line 978) InetAddress /109.74.13.67 is now UP
>
>
>
> I find nothing odd in the logs around the same time. I logged a ping with
> timestamp and checked during the same time and saw nothing weird (ping is
> less than 2ms at all times).
>
>
>
> Does anyone have any suggestions as to why this might happen?
>
>
>
> Best regards,
> Joel
>
>
>
>
>
