incubator-cassandra-user mailing list archives

From Eric Stevens <migh...@gmail.com>
Subject Re: CQL 3 returning duplicate keys
Date Wed, 05 Jun 2013 12:18:58 GMT
I mentioned a few limitations, so I'm not sure which you refer to.

As for not being able to access a CQL3 column family via traditional
approaches, beyond the example I gave above (where cassandra-cli claims it
does not recognize the column family), here is an article that mentions it:
http://www.datastax.com/dev/blog/whats-new-in-cql-3-0

As for not being able to insert dynamic columns, here is what happens if
you try:

cqlsh:test> insert into test3(key,c1,c2,newcol) values
('a3','a3c1','a3c2','a3newcol');
Bad Request: Unknown identifier newcol


This is probably alarming, but don't fret: there is an alternative to
dynamic columns, namely the new support for CQL3 collections (see
http://www.datastax.com/dev/blog/cql3_collections).  You have access to
sets, lists, and maps as column types, which can be very useful.  Do note
that you should limit the size of any given collection, because a
collection is read in its entirety in order to access a single element
(see that article for more details).  Also, the traditional Thrift /
column family approach is not deprecated; CQL3 is just an alternative
(and incompatible) approach.  If you have a data model that's working for
you, stick with Thrift / CQL2.

See "Mixing static and dynamic" at
http://www.datastax.com/dev/blog/thrift-to-cql3

As for standard column families being represented as one row per
key/column pair, you can read more about that here:
http://www.datastax.com/dev/blog/thrift-to-cql3 - this is also in the
"Mixing static and dynamic" section, a little farther down.

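To make the one-row-per-key/column-pair behavior concrete, here is a
minimal Python sketch. It is a toy in-memory model, not Cassandra code or
any driver API. The names thrift_rows and as_cql3_rows are made up for
illustration, the data is the test data from the cassandra-cli session
quoted below, and blob encoding and partitioner ordering are ignored.

```python
# Toy stand-in for a standard (Thrift) column family:
# row key -> {column name -> value}. Hypothetical names throughout.
thrift_rows = {
    "a1": {"c1": "a1c1", "c2": "a1c2"},
    "a2": {"c1": "a2c1", "c2": "a2c2"},
}


def as_cql3_rows(rows):
    """Flatten every (key, column) cell into one (key, column1, value) row."""
    return [
        (key, name, value)
        for key, columns in rows.items()
        for name, value in columns.items()
    ]


cql3_rows = as_cql3_rows(thrift_rows)

# What a per-key count would report: one entry per row key.
thrift_key_count = len(thrift_rows)

# What a per-cell count would report: one entry per key/column pair.
cql3_row_count = len(cql3_rows)

# The same key appears once per column, hence the "duplicate" keys.
keys_as_selected = [key for key, _, _ in cql3_rows]
distinct_keys = sorted(set(keys_as_selected))

print(thrift_key_count, cql3_row_count, keys_as_selected)
```

With two keys of two columns each, the flattened view has four rows and
each key is listed twice, which matches the shape of the "select * from
test" output in the quoted message.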


On Tue, Jun 4, 2013 at 3:00 PM, Shahab Yunus <shahab.yunus@gmail.com> wrote:

> Thanks Eric for the detailed explanation but can you point to a source or
> document for this restriction in CQL3 tables? Doesn't it take away the main
> feature of the NoSQL store? Or am I missing something obvious here?
>
> Regards,
> Shahab
>
>
> On Tue, Jun 4, 2013 at 2:12 PM, Eric Stevens <mightye@gmail.com> wrote:
>
>> If this is a standard column family, not a CQL3 table, then using CQL3
>> will not give you the results you expect.
>>
>> From cassandra-cli, let's set up some test data:
>>
>> [default@unknown] create keyspace test;
>> [default@unknown] use test;
>> [default@test] create column family test;
>> [default@test] set test['a1']['c1'] = 'a1c1';
>> [default@test] set test['a1']['c2'] = 'a1c2';
>> [default@test] set test['a2']['c1'] = 'a2c1';
>> [default@test] set test['a2']['c2'] = 'a2c2';
>>
>> Two rows with two columns each, right?  Not as far as CQL3 is concerned:
>>
>> cqlsh> use test;
>> cqlsh:test> select * from test;
>>
>>  key | column1 | value
>> -----+---------+--------
>>   a2 |    0xc1 | 0xa2c1
>>   a2 |    0xc2 | 0xa2c2
>>   a1 |    0xc1 | 0xa1c1
>>   a1 |    0xc2 | 0xa1c2
>>
>> Basically, without the additional metadata and enforcement that are
>> established by creating the column family as a CQL3 table, CQL3 treats
>> each key/column pair as a separate row.  This is most likely at least
>> in part because CQL3 tables *cannot have arbitrary columns* like
>> standard column families can, so it wouldn't know which columns are
>> available for display.  This also exposes some of the underlying
>> structure behind CQL3 tables.
>>
>> CQL 3 is not backward compatible with CQL 2 for most things.  If you
>> cannot migrate your data to a CQL3 table, keep querying this column
>> family with CQL 2.
>>
>> The equivalent structure as a CQL3 table:
>>
>> cqlsh:test> create table test3 (key text PRIMARY KEY, c1 text, c2 text);
>> cqlsh:test> INSERT INTO test3(key, c1, c2) VALUES ('a1', 'a1c1', 'a1c2');
>> cqlsh:test> INSERT INTO test3(key, c1, c2) VALUES ('a2', 'a2c1', 'a2c2');
>> cqlsh:test> select * from test3;
>>
>>  key | c1   | c2
>> -----+------+------
>>   a2 | a2c1 | a2c2
>>   a1 | a1c1 | a1c2
>>
>> This comes with many important restrictions.  One, as mentioned, is
>> that you cannot have arbitrary columns in a CQL3 table, just as you
>> cannot in a traditional relational database.  Likewise, you cannot use
>> traditional approaches to populate data into a CQL3 table:
>>
>> [default@test] get test3['a1'];
>> test3 not found in current keyspace.
>> [default@test] set test3['a3']['c1'] = 'a3c1';
>> test3 not found in current keyspace.
>> [default@test] describe test3;
>> WARNING: CQL3 tables are intentionally omitted from 'describe' output.
>>
>>
>>
>>
>> On Tue, Jun 4, 2013 at 12:56 PM, ekaqu something <ekaqu1028@gmail.com> wrote:
>>
>>> I run a 1.1 cluster and am currently testing out a 1.2 cluster.  I have
>>> noticed that with 1.2 it switched to CQL3, which is acting differently
>>> than I would expect.  When I do "select key from \"cf\";" I get many,
>>> many duplicate keys.  When I did the same with CQL 2, I only got the
>>> keys defined.  This also seems to be the case for count(*): in CQL 2 it
>>> returns the number of keys I have; in CQL 3 it returns far more than I
>>> really have.
>>>
>>> $ cqlsh `hostname` <<EOF
>>> use keyspace;
>>> select count(*) from "cf";
>>> EOF
>>>
>>>
>>>  count
>>> -------
>>>  10000
>>>
>>> Default LIMIT of 10000 was used. Specify your own LIMIT clause to get
>>> more results.
>>>
>>> $ cqlsh `hostname` -3 <<EOF
>>> use keyspace;
>>> select count(*) from "cf";
>>> EOF
>>>
>>>
>>>  count
>>> -------
>>>  10000
>>>
>>> Default LIMIT of 10000 was used. Specify your own LIMIT clause to get
>>> more results.
>>>
>>>
>>> $ cqlsh `hostname` -2 <<EOF
>>> use keyspace;
>>> select count(*) from cf;
>>> EOF
>>>
>>>
>>>  count
>>> -------
>>>   1934
>>>
>>> 1934 rows have really been inserted. Is there something up with cql3 or
>>> is there something else going on?
>>>
>>> Thanks for your time reading this email.
>>>
>>
>>
>
