From: "Mikhail Strebkov"
To: user@cassandra.apache.org
Date: Wed, 04 Mar 2015 10:06:57 -0800 (PST)
Subject: Re: Inconsistent count(*) and distinct results from Cassandra

We have observed the same issue in our production Cassandra cluster (5 nodes in one DC). We use Cassandra 2.1.3 (I joined the list too late to realize we shouldn’t use 2.1.x yet) on Amazon machines (created from the community AMI).

In addition to count variations of 5 to 10%, we observe variations in the results of the query “select * from table1 where time > '$fromDate' and time < '$toDate' allow filtering”. We iterated through the results multiple times using the official Java driver. We used that query for a huge data migration and were unpleasantly surprised that it is unreliable. In our case “nodetool repair” didn’t fix the issue.
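
For anyone who wants to pin down which rows come and go between passes, here is a minimal sketch of the comparison, not what we actually ran. It assumes each pass over the query has already been collected into a list of row keys. The function name `diff_runs` and the sample keys are hypothetical; no driver calls are shown.

```python
from collections import Counter


def diff_runs(runs):
    """Return the keys that did not appear in every run.

    `runs` is a list of iterables, one per execution of the same query.
    The result maps each unstable key to the number of runs that saw it.
    """
    seen = Counter()
    for run in runs:
        # Count each key at most once per run, even if a run repeats it.
        seen.update(set(run))
    total = len(runs)
    return {key: n for key, n in seen.items() if n < total}


# Hypothetical keys standing in for rows fetched by three passes of one query.
runs = [
    ["a", "b", "c", "d"],
    ["a", "b", "d"],
    ["a", "b", "c", "d"],
]
unstable = diff_runs(runs)
print(unstable)
```

An empty result would mean every pass returned the same set of keys; anything else lists the rows worth inspecting by hand.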

So I echo Frens's questions.

Thanks,
Mikhail




On Wed, Mar 4, 2015 at 3:55 AM, Rumph, Frens Jan <mail@frensjan.nl> wrote:

Hi,

Is it to be expected that select count(*) from ... and select distinct partition-key-columns from ... yield inconsistent results between executions even though the table at hand isn't being written to?

I have a table in a keyspace with replication_factor = 1 which is something like:

CREATE TABLE tbl (
    id frozen<id_type>,
    bucket bigint,
    offset int,
    value double,
    PRIMARY KEY ((id, bucket), offset)
)

The frozen udt is:

CREATE TYPE id_type (
    tags map<text, text>
);

When I do select count(*) from tbl several times, the actual count varies by 5 to 10%. Also, when performing select distinct id, bucket from tbl, the results aren't consistent over several query executions. The table is not being written to at the time I perform the queries.
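
To make the "5 to 10%" figure reproducible, the variation can be expressed as the relative spread between the highest and lowest count observed. This is only a sketch of the arithmetic: the helper name `count_spread` and the sample counts are hypothetical, not measurements from this table.

```python
def count_spread(counts):
    """Return (max - min) / max for counts from repeated executions."""
    if not counts:
        raise ValueError("need at least one count")
    highest = max(counts)
    if highest == 0:
        # All counts are zero, so there is no variation to report.
        return 0.0
    return (highest - min(counts)) / highest


# Hypothetical results of running the same count(*) three times.
counts = [1000, 950, 920]
spread = count_spread(counts)
print(f"{spread:.1%}")
```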

Is this to be expected? Or is this a bug? Is there an alternative method / workaround?

I'm using cqlsh 5.0.1 with Cassandra 2.1.2 on 64-bit Fedora 21 with Oracle Java 1.8.0_31.

Thanks in advance,
Frens Jan
