cassandra-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: Live upgrade 2.0 to 2.1 temporarily increases GC time causing timeouts and unavailability
Date Fri, 19 Feb 2016 21:46:35 GMT
FYI, my observations were with the native protocol, not Thrift.


*Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*

On Fri, Feb 19, 2016 at 10:12 AM, Sotirios Delimanolis <sotodel_89@yahoo.com> wrote:

> Does your cluster contain 24+ nodes or fewer?
>
> We did the same upgrade on a smaller cluster of 5 nodes and we didn't see
> this behavior. On the 24-node cluster, the timeouts only started once
> roughly 5 to 7 nodes had been upgraded.
>
> We're doing some more upgrades next week, trying different deployment
> plans. I'll report back with the results.
>
> Thanks for the reply (we absolutely want to move to CQL)
>
>
> On Friday, February 19, 2016 1:10 AM, Alain RODRIGUEZ <arodrime@gmail.com>
> wrote:
>
>
> I performed this exact upgrade a few days ago, except that clients were
> using the native protocol, and it went smoothly. So I think this might be
> Thrift related. I have no idea what is causing it, though; I just wanted
> to share the info, FWIW.
>
> As a side note, unrelated to the issue, performance with the native
> protocol is a lot better than with Thrift starting in C* 2.1. Drivers
> using the native protocol are also more modern, allowing you to do very
> interesting stuff. Moving to native now that you are on 2.1 is something
> you might want to do soon :-).
>
> C*heers,
> -----------------
> Alain Rodriguez
> France
>
> The Last Pickle
> http://www.thelastpickle.com
>
> 2016-02-19 3:07 GMT+01:00 Sotirios Delimanolis <sotodel_89@yahoo.com>:
>
> We have a Cassandra cluster with 24 nodes. These nodes were running
> 2.0.16.
>
> While the nodes are in the ring and handling queries, we upgrade to
> 2.1.12 one node at a time, roughly as follows:
>
>
>    1. Stop the Cassandra process
>    2. Deploy jars, scripts, binaries, etc.
>    3. Start the Cassandra process
>
>
> A few nodes into the upgrade, we start noticing that the majority of
> queries (mostly through Thrift) time out or report unavailable. Looking at
> system information, Cassandra GC time goes through the roof, which is what
> we assume causes the timeouts.
>
> Once all nodes are upgraded, the cluster stabilizes and the timeouts all
> but disappear.
>
> What could explain this? Does it have anything to do with how a 2.0 node
> communicates with a 2.1 node?
>
> Our Cassandra consumers haven't changed.
>
>
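
For reference, the one-node-at-a-time procedure quoted above (stop, deploy, start) can be sketched as a small driver loop. This is only an illustration, not the tooling anyone in this thread actually used: the function name rolling_upgrade, the four callables, and the health check that halts the rollout are all hypothetical, and the real stop/deploy/start commands depend on how Cassandra is installed.

```python
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    stop: Callable[[str], None],
    deploy: Callable[[str], None],
    start: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade nodes strictly one at a time.

    Returns the nodes that were upgraded and passed the health check.
    The rollout halts at the first node that fails the check.
    """
    upgraded: List[str] = []
    for node in nodes:
        stop(node)    # 1. Stop the Cassandra process
        deploy(node)  # 2. Deploy jars, scripts, binaries, etc.
        start(node)   # 3. Start the Cassandra process
        # Hypothetical safeguard (not in the original procedure): stop the
        # rollout if the node, or the cluster as seen from it, looks unwell.
        if not is_healthy(node):
            break
        upgraded.append(node)
    return upgraded


if __name__ == "__main__":
    # Dry run with stand-in actions that only record what would happen.
    actions = []
    done = rolling_upgrade(
        ["node1", "node2", "node3"],
        stop=lambda n: actions.append(("stop", n)),
        deploy=lambda n: actions.append(("deploy", n)),
        start=lambda n: actions.append(("start", n)),
        is_healthy=lambda n: True,
    )
    print(done)
    print(actions)
```

Passing the actions in as callables keeps the loop independent of any particular deployment tool, and the health check is a natural place to watch for the GC time or timeout spikes described in this thread before moving on to the next node.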
