From: srinivasarao daruna
Date: Thu, 16 Mar 2017 13:19:15 -0400
Subject: Re: Issue with Cassandra consistency in results
To: user@cassandra.apache.org

Would switching to select partition_key instead of select count(*) help me in any way?

I know that logically they are both the same, but I am asking out of desperation. Is it worth a shot?


On Mar 16, 2017 1:09 PM, "Ryan Svihla" <rs@foundev.pro> wrote:
> Replication factor is 3, and write consistency is ONE and read consistency is QUORUM.

That combination is not gonna work well:

Write succeeds to NODE A but fails on node B,C

Read goes to NODE B, C

If you can tolerate some temporary inaccuracy you can use QUORUM for writes too, but you may still have the situation where:

Write succeeds on node A at timestamp 1, and on node B at timestamp 2
Read succeeds on nodes B and C at timestamp 1

If you need fully race-condition-free counts, I'm afraid you need to use SERIAL or LOCAL_SERIAL (for in-DC-only accuracy).
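
To make the replica arithmetic above concrete, here is a minimal sketch in Scala (the language of the snippet posted in this thread). It assumes the usual rule of thumb that a read is only guaranteed to reach a replica holding the latest acknowledged write when read replicas + write replicas > replication factor. The object and method names are made up for illustration and are not part of any driver API.

```scala
object ConsistencyOverlap {
  // Number of replicas a QUORUM operation must reach: floor(rf / 2) + 1.
  def quorum(rf: Int): Int = rf / 2 + 1

  // A read is guaranteed to touch at least one replica that acknowledged
  // the latest write only when readReplicas + writeReplicas > rf.
  def overlaps(rf: Int, writeReplicas: Int, readReplicas: Int): Boolean =
    writeReplicas + readReplicas > rf

  def main(args: Array[String]): Unit = {
    val rf = 3
    // Write at ONE, read at QUORUM (the setup described in this thread).
    println(overlaps(rf, writeReplicas = 1, readReplicas = quorum(rf)))
    // Write at QUORUM, read at QUORUM.
    println(overlaps(rf, writeReplicas = quorum(rf), readReplicas = quorum(rf)))
  }
}
```

With RF=3 this prints false for ONE/QUORUM and true for QUORUM/QUORUM. Overlap only says the read reaches a replica that acknowledged the write; it does not by itself rule out the timestamp race described above.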

On Thu, Mar 16, 2017 at 1:04 PM, srinivasarao daruna <sree.srinu38@gmail.com> wrote:
Replication strategy is SimpleStrategy.

The snitch is the EC2 snitch, as we deployed the cluster on EC2 instances.

I was worried that CL=ALL would have more read latency and read failures, but I won't rule out trying it.

Should I switch from select count(*) to selecting the partition_key column? Would that be of any help?


Thank you
Regards
Srini

On Mar 16, 2017 12:46 PM, "Arvydas Jonusonis" <arvydas.jonusonis@gmail.com> wrote:
What are your replication strategy and snitch settings?

Have you tried doing a read at CL=ALL? If it's an actual inconsistency issue (missing data), this should cause the correct results to be returned. You'll need to run a repair to fix the inconsistencies.

If all the data is actually there, you might have one or several nodes that aren't identifying the correct replicas.

Arvydas



On Thu, Mar 16, 2017 at 5:31 PM, srinivasarao daruna <sree.srinu38@gmail.com> wrote:
Hi Team,

We are struggling with a problem related to Cassandra counts after a backup and restore of the cluster. Aaron Morton suggested sending this to the user list so that someone on the list might be able to help me.

We have a REST API that talks to Cassandra, and one of our queries, which fetches a count, is creating problems for us.

We did a backup and restore and copied all the data to a new cluster. We ran nodetool refresh on the tables and ran nodetool repair as well.

However, one of our key API calls is returning inconsistent results. The result count is 0 on the first call, and later calls return the actual values. The query frequency is fairly high, and the failure rate has also risen considerably.

1) The count query has the partition keys in it. We didn't see any read timeouts or any errors in the API logs.

2) This is how our session-creation code looks.

val poolingOptions = new PoolingOptions
poolingOptions
  .setCoreConnectionsPerHost(HostDistance.LOCAL, 4)
  .setMaxConnectionsPerHost(HostDistance.LOCAL, 10)
  .setCoreConnectionsPerHost(HostDistance.REMOTE, 4)
  .setMaxConnectionsPerHost(HostDistance.REMOTE, 10)

val builtCluster = clusterBuilder.withCredentials(username, password)
  .withPoolingOptions(poolingOptions)
  .build()
val cassandraSession = builtCluster.get.connect()

val preparedStatement = cassandraSession.prepare(statement).setConsistencyLevel(ConsistencyLevel.QUORUM)
cassandraSession.execute(preparedStatement.bind(args :_*))

Query: SELECT count(*) FROM table_name WHERE parition_column=? AND text_column_of_clustering_key=? AND date_column_of_clustering_key<=? AND date_column_of_clustering_key>=?

3) Cluster configuration:

6 machines, 3 of them seeds, running Apache Cassandra 3.9. Each machine is equipped with 16 cores and 64 GB of RAM.

Replication factor is 3, and write consistency is ONE and read consistency is QUORUM.

4) Cassandra is never down on any machine.

5) The API uses the cassandra-driver-core artifact, version 3.1.1.

6) nodetool tpstats shows no read failures and no other failures.

7) We do not see any other issues in Cassandra's system.log, just a few warnings like the ones below.

Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB
WARN  [ScheduledTasks:1] 2017-03-14 14:58:37,141 QueryProcessor.java:103 - 88 prepared statements discarded in the last minute because cache limit reached (32 MB)
The first API call returns 0, and later API calls give the right values.
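
The prepared-statement warning above suggests the server-side statement cache is being overrun. Whether that has anything to do with the zero counts is not established in this thread. Still, if the snippet in 2) runs on every request, one common practice is to prepare each distinct query string once and reuse the resulting statement. The following is a minimal sketch of that idea, not the poster's code: PreparedCache is a hypothetical helper, not part of the DataStax driver, and its prepare parameter stands in for a call such as cassandraSession.prepare(_).

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical helper (not a driver class): keeps one prepared statement
// per distinct query string. `prepare` stands in for something like
// cassandraSession.prepare(_); P would be the driver's PreparedStatement.
final class PreparedCache[P](prepare: String => P) {
  private val cache = new ConcurrentHashMap[String, P]()

  // Prepares the query on first use and returns the cached value afterwards.
  def get(query: String): P =
    cache.computeIfAbsent(query, new JFunction[String, P] {
      override def apply(q: String): P = prepare(q)
    })

  // Number of distinct query strings prepared so far.
  def size: Int = cache.size()
}

object PreparedCacheDemo {
  def main(args: Array[String]): Unit = {
    var prepareCalls = 0
    // A stand-in "prepare" that only counts how often it is invoked.
    val cache = new PreparedCache[String]({ q =>
      prepareCalls += 1
      s"prepared($q)"
    })
    val query = "SELECT count(*) FROM table_name WHERE id=?"
    cache.get(query)
    cache.get(query)
    println(prepareCalls) // the second get reuses the cached entry
  }
}
```

The cache only helps when query strings use bind markers (?) rather than inlined values, since every distinct string is prepared and cached separately.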

Please let me know if any other details are needed.
Could you please have a look at this issue and kindly give me your input? This issue has seriously shaken our business team's confidence in Cassandra.

Your input will be really helpful.

Thank You,
Regards,
Srini





--
Thanks,
Ryan Svihla