Subject: Re: Dead node appearing in datastax driver
From: Sylvain Lebresne <sylvain@datastax.com>
To: user@cassandra.apache.org
Date: Tue, 1 Apr 2014 17:13:12 +0200

What does "Did that" mean? Does it mean "I upgraded to 2.0.6", or does it mean "I manually removed entries from System.peers"? If the latter, I'd need more info on what you did exactly, what your peers table looked like before, and what it looks like now: there is no reason that deleting the peers entries for hosts that are no longer part of the cluster would have anything to do with write latency. (Though if, say, you removed the wrong entries, that might make the driver think some live host had been removed, and if the driver has fewer nodes over which to dispatch queries, that might impact latency, I suppose -- at least that's the only related thing I can think of.)

--
Sylvain


On Tue, Apr 1, 2014 at 2:44 PM, Apoorva Gaurav <apoorva.gaurav@myntra.com> wrote:

> Did that and I actually see a significant reduction in write latency.
>
> On Tue, Apr 1, 2014 at 5:35 PM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>
>> On Tue, Apr 1, 2014 at 1:49 PM, Apoorva Gaurav <apoorva.gaurav@myntra.com> wrote:
>>
>>> Hello Sylvain,
>>>
>>> Queried system.peers on three live nodes and host4 is appearing on two
>>> of these.
>>
>> That's why the driver thinks they are still there. You're most probably
>> running into https://issues.apache.org/jira/browse/CASSANDRA-6053 since
>> you are on C* 2.0.4.
>> As said, this is relatively harmless, but you should
>> think about upgrading to 2.0.6 to fix it for good (you could manually
>> remove the bad entries from System.peers in the meantime if you want; they
>> are really just leftovers that shouldn't be there).
>>
>> --
>> Sylvain
>>
>>> On Tue, Apr 1, 2014 at 5:06 PM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>>>
>>>> On Tue, Apr 1, 2014 at 12:50 PM, Apoorva Gaurav <apoorva.gaurav@myntra.com> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> We had a 4-node Cassandra 2.0.4 cluster (let's call the nodes host1,
>>>>> host2, host3 and host4), out of which we've removed one node (host4) using
>>>>> the nodetool removenode command. Now, using nodetool status or nodetool ring,
>>>>> we no longer see host4. It's also not appearing in DataStax OpsCenter. But it's
>>>>> intermittently appearing in Metadata.getAllHosts() while connecting using
>>>>> DataStax driver 1.0.4.
>>>>>
>>>>> A couple of questions:
>>>>> - How is it still appearing?
>>>>
>>>> Not sure. Can you try querying the peers system table on each of your
>>>> nodes (with cqlsh: SELECT * FROM system.peers) and see if host4 is
>>>> still mentioned somewhere?
>>>>
>>>>> - Can this have an impact on the read/write performance of the client?
>>>>
>>>> No. If the host doesn't exist, the driver might try to reconnect to it
>>>> at times, but since it won't be able to, it won't try to use it for reads
>>>> and writes. That does mean you might have a reconnection task running with
>>>> some regularity, but 1) it's not on the read/write path of queries, and 2)
>>>> provided you've left the default reconnection policy, this will happen once
>>>> every 10 minutes and will be cheap enough to consume a completely negligible
>>>> amount of resources. That doesn't mean I'm not interested in tracking down
>>>> why it happens in the first place, though.
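Provided the default reconnection policy is left in place, the Java driver backs off between reconnection attempts up to a cap, which is why the steady state settles at one attempt every 10 minutes as described above. A rough sketch of such an exponential backoff schedule, assuming a 1-second base delay and a 10-minute cap (illustrative numbers only, not the driver's exact internals):

```java
public class BackoffSketch {
    // Exponential backoff: the delay doubles with each failed attempt,
    // capped at maxMs. Attempt numbering starts at 1.
    static long delayMs(long baseMs, long maxMs, int attempt) {
        long d = baseMs << Math.min(attempt - 1, 30); // shift guard avoids overflow
        return Math.min(d, maxMs);
    }

    public static void main(String[] args) {
        long base = 1_000, cap = 600_000; // 1 s base, 10 min cap (assumed values)
        for (int attempt = 1; attempt <= 11; attempt++) {
            System.out.printf("attempt %2d -> %d ms%n",
                    attempt, delayMs(base, cap, attempt));
        }
    }
}
```

With these assumed parameters, the delay hits the 10-minute cap by the eleventh attempt and stays there, which is the "once every 10 minutes" behavior mentioned in the reply.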
>>>>
>>>> --
>>>> Sylvain
>>>>
>>>>> Code which we are using to connect is:
>>>>>
>>>>>     public void connect() {
>>>>>         PoolingOptions poolingOptions = new PoolingOptions();
>>>>>         cluster = Cluster.builder()
>>>>>                 .addContactPoints(inetAddresses.toArray(new String[]{}))
>>>>>                 .withLoadBalancingPolicy(new RoundRobinPolicy())
>>>>>                 .withPoolingOptions(poolingOptions)
>>>>>                 .withPort(port)
>>>>>                 .withCredentials(username, password)
>>>>>                 .build();
>>>>>         Metadata metadata = cluster.getMetadata();
>>>>>         System.out.printf("Connected to cluster: %s\n",
>>>>>                 metadata.getClusterName());
>>>>>         for (Host host : metadata.getAllHosts()) {
>>>>>             System.out.printf("Datacenter: %s; Host: %s; Rack: %s\n",
>>>>>                     host.getDatacenter(), host.getAddress(), host.getRack());
>>>>>         }
>>>>>     }
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>> Apoorva
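If you do attempt the manual cleanup described earlier in the thread, a safe approach is to diff the addresses each node's system.peers table reports against the live ring shown by nodetool status, and only delete rows present in the former but not the latter. A minimal sketch of that diff in plain Java (the host addresses are hypothetical stand-ins for host1..host4; the actual DELETE FROM system.peers WHERE peer = ... would be run via cqlsh on each affected node):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class StalePeers {
    // Peers listed in system.peers that are not members of the live ring
    // are the candidates for a manual DELETE.
    static Set<String> stalePeers(Set<String> peersTable, Set<String> ring) {
        Set<String> stale = new HashSet<>(peersTable);
        stale.removeAll(ring);
        return stale;
    }

    public static void main(String[] args) {
        // Hypothetical addresses standing in for the four nodes in the thread.
        Set<String> peers = new HashSet<>(Arrays.asList("10.0.0.2", "10.0.0.3", "10.0.0.4"));
        Set<String> ring  = new HashSet<>(Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        System.out.println(stalePeers(peers, ring)); // prints [10.0.0.4]
    }
}
```

Deleting only entries outside the live ring avoids the failure mode Sylvain warns about, where removing the wrong rows makes the driver drop a live host from its query plan.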