incubator-cassandra-user mailing list archives

From Eric Stevens <migh...@gmail.com>
Subject Re: CQL 3 returning duplicate keys
Date Tue, 04 Jun 2013 18:12:03 GMT
If this is a standard column family, not a CQL3 table, then using CQL3 will
not give you the results you expect.

From cassandra-cli, let's set up some test data:

[default@unknown] create keyspace test;
[default@unknown] use test;
[default@test] create column family test;
[default@test] set test['a1']['c1'] = 'a1c1';
[default@test] set test['a1']['c2'] = 'a1c2';
[default@test] set test['a2']['c1'] = 'a2c1';
[default@test] set test['a2']['c2'] = 'a2c2';

Two rows with two columns each, right?  Not as far as CQL3 is concerned:

cqlsh> use test;
cqlsh:test> select * from test;

 key | column1 | value
-----+---------+--------
  a2 |    0xc1 | 0xa2c1
  a2 |    0xc2 | 0xa2c2
  a1 |    0xc1 | 0xa1c1
  a1 |    0xc2 | 0xa1c2

Basically, without the additional metadata and enforcement established by
creating the column family as a CQL3 table, CQL3 treats each key/column
pair as a separate row.  This is most likely at least in part because CQL3
tables *cannot have arbitrary columns* like standard column families can,
so CQL3 wouldn't know what columns are available for display.  This also
exposes some of the underlying structure behind CQL3 tables.
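
To make the row-count difference concrete, here is a minimal sketch in
Python.  It is only a toy model of the behaviour described above, not how
Cassandra is implemented; the names legacy_cf and cql3_view are made up
for illustration.

```python
# Toy model only: a standard column family as {row_key: {column_name: value}}.
legacy_cf = {
    "a1": {"c1": "a1c1", "c2": "a1c2"},
    "a2": {"c1": "a2c1", "c2": "a2c2"},
}

def cql3_view(cf):
    """Flatten each (row key, column) cell into its own (key, column1, value) row."""
    return [
        (key, column, value)
        for key, columns in cf.items()
        for column, value in columns.items()
    ]

rows = cql3_view(legacy_cf)
print(len(rows))                          # 4: one row per cell, as in the CQL3 output
print(len({key for key, _, _ in rows}))   # 2: distinct row keys
```

In this model, a count over the flattened rows grows with the number of
columns per key, which would explain seeing the same key many times.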

CQL 3 is not backward compatible with CQL 2 for most things.  If you cannot
migrate your data to a CQL3 table, you will likely need to keep querying
that column family with CQL 2 (cqlsh -2) or cassandra-cli.

The equivalent structure as a CQL3 table:

cqlsh:test> create table test3 (key text PRIMARY KEY, c1 text, c2 text);
cqlsh:test> INSERT INTO test3(key, c1, c2) VALUES ('a1', 'a1c1', 'a1c2');
cqlsh:test> INSERT INTO test3(key, c1, c2) VALUES ('a2', 'a2c1', 'a2c2');
cqlsh:test> select * from test3;

 key | c1   | c2
-----+------+------
  a2 | a2c1 | a2c2
  a1 | a1c1 | a1c2

This comes with many important restrictions, one of which, as mentioned, is
that you cannot have arbitrary columns in a CQL3 table, just as you cannot
in a traditional relational database.  Likewise, you cannot use traditional
approaches such as cassandra-cli to read or populate a CQL3 table:

[default@test] get test3['a1'];
test3 not found in current keyspace.
[default@test] set test3['a3']['c1'] = 'a3c1';
test3 not found in current keyspace.
[default@test] describe test3;
WARNING: CQL3 tables are intentionally omitted from 'describe' output.
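
Given the fixed-column restriction, copying data from the standard column
family into a table like test3 only works when every row's columns fit the
declared schema.  Below is a rough sketch of that check in Python; the
function name to_cql3_inserts and the in-memory dict are assumptions for
illustration, and a real migration would read from the cluster and use a
driver with prepared statements rather than building strings.

```python
# Toy model only: a standard column family as {row_key: {column_name: value}}.
legacy_cf = {
    "a1": {"c1": "a1c1", "c2": "a1c2"},
    "a2": {"c1": "a2c1", "c2": "a2c2"},
}

def to_cql3_inserts(cf, table, columns):
    """Build one INSERT per row key; reject rows with columns outside the schema."""
    statements = []
    for key, cells in cf.items():
        extra = set(cells) - set(columns)
        if extra:
            # A CQL3 table has no place to store an undeclared column.
            raise ValueError(f"row {key!r} has undeclared columns: {sorted(extra)}")
        present = [c for c in columns if c in cells]
        names = ", ".join(["key"] + present)
        # Single quotes inside CQL string literals are escaped by doubling them.
        values = ", ".join(
            "'" + v.replace("'", "''") + "'" for v in [key] + [cells[c] for c in present]
        )
        statements.append(f"INSERT INTO {table}({names}) VALUES ({values});")
    return statements

for statement in to_cql3_inserts(legacy_cf, "test3", ["c1", "c2"]):
    print(statement)
```

A row containing any column other than c1 or c2 raises ValueError here,
which mirrors the "no arbitrary columns" rule.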




On Tue, Jun 4, 2013 at 12:56 PM, ekaqu something <ekaqu1028@gmail.com> wrote:

> I run a 1.1 cluster and currently testing out a 1.2 cluster.  I have
> noticed that with 1.2 it switched to CQL3 which is acting differently than
> I would expect.  When I do "select key from \"cf\";" I get many many
> duplicate keys.  When I did the same with CQL 2 I only get the keys
> defined.  This seems to also be the case for count(*), in cql2 it would
> return the number of keys i have, in 3 it returns way more than i really
> have.
>
> $ cqlsh `hostname` <<EOF
> use keyspace;
> select count(*) from "cf";
> EOF
>
>
>  count
> -------
>  10000
>
> Default LIMIT of 10000 was used. Specify your own LIMIT clause to get more
> results.
>
> $ cqlsh `hostname` -3 <<EOF
> use keyspace;
> select count(*) from "cf";
> EOF
>
>
>  count
> -------
>  10000
>
> Default LIMIT of 10000 was used. Specify your own LIMIT clause to get more
> results.
>
>
> $ cqlsh `hostname` -2 <<EOF
> use keyspace;
> select count(*) from cf;
> EOF
>
>
>  count
> -------
>   1934
>
> 1934 rows have really been inserted. Is there something up with cql3 or is
> there something else going on?
>
> Thanks for your time reading this email.
>
