cassandra-commits mailing list archives

From "Benjamin Lerer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8940) Inconsistent select count and select distinct
Date Wed, 22 Apr 2015 19:22:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507719#comment-14507719 ]

Benjamin Lerer commented on CASSANDRA-8940:
-------------------------------------------

[~frensjan] After installing Vagrant and VirtualBox, making sure that Cygwin had all the required
modules, and working around a bug in Vagrant, I managed to get everything running. Unfortunately,
I could not reproduce the problem on my machine :-(
I tried all the combinations of ids, buckets and offsets that you suggested, but without
success. The count was always correct.

I also tried with CCM, just in case, but it worked fine there too.

I really have no idea where the problem might come from or why it does not appear in
my environment.

As you can reproduce it in your environment, it would be helpful if you could enable tracing
(http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2) and send me one trace of a count
that fails and one of a count that succeeds. That could help me pinpoint where the problem
comes from.
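
For reference, a minimal cqlsh sketch of capturing such a trace. It assumes the table from the
description lives in a keyspace named ks, which is a placeholder and not taken from the report.
Running the count several times and keeping the trace output of one good run and one bad run
should be enough:
{code}
-- Match the consistency level used in the report.
CONSISTENCY ONE;

-- From here on, cqlsh prints the trace events after each statement.
TRACING ON;

-- "ks" is a hypothetical keyspace name; repeat the query until the count differs.
SELECT count(*) FROM ks.tbl;

TRACING OFF;
{code}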

Thanks a lot for taking the time to create all those scripts. They worked very
well (apart from my Windows issues).

> Inconsistent select count and select distinct
> ---------------------------------------------
>
>                 Key: CASSANDRA-8940
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8940
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 2.1.2
>            Reporter: Frens Jan Rumph
>            Assignee: Benjamin Lerer
>         Attachments: Vagrantfile, install_cassandra.sh, setup_hosts.sh
>
>
> When performing {{select count( * ) from ...}} I expect the results to be consistent
> over multiple query executions if the table at hand is not written to / deleted from in the
> meantime. However, in my set-up it is not. The counts returned vary considerably (several
> percent). The same holds for {{select distinct partition-key-columns from ...}}.
> I have a table in a keyspace with replication_factor = 1 which is something like:
> {code}
> CREATE TABLE tbl (
>     id frozen<id_type>,
>     bucket bigint,
>     offset int,
>     value double,
>     PRIMARY KEY ((id, bucket), offset)
> )
> {code}
> The frozen udt is:
> {code}
> CREATE TYPE id_type (
>     tags map<text, text>
> );
> {code}
> The table contains around 35k rows (I'm not trying to be funny here ...). The consistency
> level for the queries was ONE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
