Frens,

What consistency level are you querying with? It could be that you are simply receiving results from different nodes each time.
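
A quick way to check this from cqlsh is sketched below. It assumes the table name tbl from your original message, and QUORUM is only an illustrative choice; with replication_factor = 1 every level reads from a single replica, so this may well not change anything in your case.

```cql
-- Show the consistency level cqlsh is currently using
CONSISTENCY;

-- Raise it for this session (QUORUM is just an example level)
CONSISTENCY QUORUM;

-- Re-run the count a few times and compare the results
SELECT count(*) FROM tbl;
```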

Jens




On Wed, Mar 4, 2015 at 7:08 PM, Mikhail Strebkov <strebkov@gmail.com> wrote:

We have observed the same issue in our production Cassandra cluster (5 nodes in one DC). We use Cassandra 2.1.3 (I joined the list too late to realize we shouldn’t use 2.1.x yet) on Amazon machines (created from the community AMI).

In addition to count variations of 5 to 10%, we observe variations in the results of the query “select * from table1 where time > '$fromDate' and time < '$toDate' allow filtering”. We iterated through the results multiple times using the official Java driver. We used that query for a huge data migration and were unpleasantly surprised to find that it is unreliable. In our case, “nodetool repair” didn’t fix the issue.

So I echo Frens's questions.

Thanks,
Mikhail




On Wed, Mar 4, 2015 at 3:55 AM, Rumph, Frens Jan <mail@frensjan.nl> wrote:

Hi,

Is it to be expected that select count(*) from ... and select distinct partition-key-columns from ... yield inconsistent results between executions, even though the table at hand isn't being written to?

I have a table in a keyspace with replication_factor = 1 which is something like:

CREATE TABLE tbl (
    id frozen<id_type>,
    bucket bigint,
    offset int,
    value double,
    PRIMARY KEY ((id, bucket), offset)
)

The frozen udt is:

CREATE TYPE id_type (
    tags map<text, text>
);

When I run select count(*) from tbl several times, the reported count varies by 5 to 10%. Also, when performing select distinct id, bucket from tbl, the results aren't consistent across query executions. The table was not being written to at the time I performed the queries.

Is this to be expected? Or is this a bug? Is there an alternative method / workaround?

I'm using cqlsh 5.0.1 with Cassandra 2.1.2 on 64-bit Fedora 21 with Oracle Java 1.8.0_31.

Thanks in advance,
Frens Jan