incubator-cassandra-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: CQL 3 returning duplicate keys
Date Wed, 05 Jun 2013 12:25:00 GMT
Thanks Eric.  Yeah, I was asking about the second limitation (about dynamic
columns) and you have explained it well along with pointers to read further.

Regards,
Shahab


On Wed, Jun 5, 2013 at 8:18 AM, Eric Stevens <mightye@gmail.com> wrote:

> I mentioned a few limitations, so I'm not sure which you refer to.
>
> As for not being able to access a CQL3 column family via traditional
> approaches, beyond the example I gave above (where cassandra-cli claims it
> does not recognize the column family), here is an article that mentions it:
> http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
>
> As for not being able to insert dynamic columns, here is what happens if
> you try:
>
> cqlsh:test> insert into test3(key,c1,c2,newcol) values ('a3','a3c1','a3c2','a3newcol');
> Bad Request: Unknown identifier newcol
>
>
> This is probably alarming, but don't fret, there is an alternative to
> dynamic columns, and that's the new support for CQL3 collections (see
> http://www.datastax.com/dev/blog/cql3_collections).  You have access to
> sets, lists, and maps, as column types, which can be very useful.  Do note
> that you should be careful to limit the size of a given collection because
> collections are read in their entirety in order to access a single element
> of the collection (see that article for more details).  Also, the
> traditional Thrift / column family approach is not deprecated; CQL3 is just
> an alternative (and incompatible) approach.  If you have a data model
> that's working for you, stick with Thrift / CQL2.
>
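The contrast between the rejected insert and the collection alternative can be sketched with a toy model. This is plain Python, not Cassandra or any driver API; the column name `extras` and the `insert` helper are invented for illustration, and the error text only mimics the cqlsh message quoted above.

```python
from typing import Dict

# Hypothetical stand-in for a CQL3 table: a fixed set of declared columns.
# "extras" is an invented name playing the role of a map-typed column.
DECLARED_COLUMNS = {"key", "c1", "c2", "extras"}


def insert(table: Dict[str, dict], row: dict) -> None:
    """Store a row, rejecting any column the schema does not declare."""
    unknown = set(row) - DECLARED_COLUMNS
    if unknown:
        raise ValueError(f"Unknown identifier {sorted(unknown)[0]}")
    table[row["key"]] = row


test3: Dict[str, dict] = {}

# An undeclared column is rejected, as in the cqlsh session above.
try:
    insert(test3, {"key": "a3", "c1": "a3c1", "c2": "a3c2", "newcol": "a3newcol"})
    rejected = False
except ValueError:
    rejected = True

# The same value fits inside the map-typed column under an arbitrary key.
insert(
    test3,
    {"key": "a3", "c1": "a3c1", "c2": "a3c2", "extras": {"newcol": "a3newcol"}},
)
```

In this model the whole `extras` dict travels with the row, which loosely mirrors the warning above that a collection is read in its entirety to reach one element.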
> See "Mixing static and dynamic" at
> http://www.datastax.com/dev/blog/thrift-to-cql3
>
> As for standard column families being represented in CQL3 as one row per
> key/column pair, you can read more about that here:
> http://www.datastax.com/dev/blog/thrift-to-cql3 - this is also in the
> "Mixing static and dynamic" section, a little farther down.
>
>
>
> On Tue, Jun 4, 2013 at 3:00 PM, Shahab Yunus <shahab.yunus@gmail.com> wrote:
>
>> Thanks Eric for the detailed explanation, but can you point to a source or
>> document for this restriction in CQL3 tables? Doesn't it take away the main
>> feature of the NoSQL store? Or am I missing something obvious here?
>>
>> Regards,
>> Shahab
>>
>>
>> On Tue, Jun 4, 2013 at 2:12 PM, Eric Stevens <mightye@gmail.com> wrote:
>>
>>> If this is a standard column family, not a CQL3 table, then using CQL3
>>> will not give you the results you expect.
>>>
>>> From cassandra-cli, let's set up some test data:
>>>
>>> [default@unknown] create keyspace test;
>>> [default@unknown] use test;
>>> [default@test] create column family test;
>>> [default@test] set test['a1']['c1'] = 'a1c1';
>>> [default@test] set test['a1']['c2'] = 'a1c2';
>>> [default@test] set test['a2']['c1'] = 'a2c1';
>>> [default@test] set test['a2']['c2'] = 'a2c2';
>>>
>>> Two rows with two columns each, right?  Not as far as CQL3 is concerned:
>>>
>>> cqlsh> use test;
>>> cqlsh:test> select * from test;
>>>
>>>  key | column1 | value
>>> -----+---------+--------
>>>   a2 |    0xc1 | 0xa2c1
>>>   a2 |    0xc2 | 0xa2c2
>>>   a1 |    0xc1 | 0xa1c1
>>>   a1 |    0xc2 | 0xa1c2
>>>
>>> Basically for CQL3, without the additional metadata and enforcement that
>>> is established by having created the column family as a CQL3 table, CQL
>>> will treat each key/column pair as a separate row for CQL purposes.  This
>>> is most likely at least in part due to the fact that CQL3 tables *cannot
>>> have arbitrary columns* like standard column families can.  It wouldn't
>>> know what columns are available for display.  This also exposes some of the
>>> underlying structure behind CQL3 tables.
>>>
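The flattening described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra code: the nested dict stands in for the column family built with cassandra-cli above, and the function name `as_cql3_rows` is invented for illustration.

```python
from typing import Dict, List, Tuple

# Toy stand-in for the Thrift column family built with cassandra-cli above:
# each row key maps to its own set of column name -> value pairs.
thrift_rows: Dict[str, Dict[str, str]] = {
    "a1": {"c1": "a1c1", "c2": "a1c2"},
    "a2": {"c1": "a2c1", "c2": "a2c2"},
}


def as_cql3_rows(cf: Dict[str, Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Flatten a wide-row column family into (key, column1, value) rows."""
    return [
        (key, column, value)
        for key, columns in cf.items()
        for column, value in columns.items()
    ]


cql3_rows = as_cql3_rows(thrift_rows)

# CQL2-style count: one result per row key.
cql2_count = len(thrift_rows)
# CQL3-style count: one result per key/column pair.
cql3_count = len(cql3_rows)
# Selecting only the key from the CQL3 view repeats it once per column.
keys_seen = [key for key, _, _ in cql3_rows]

print(cql2_count, cql3_count, keys_seen)
```

Under this model, a key-only select or a count(*) over the flattened view grows with the number of columns per key, which matches the duplicate keys and inflated counts reported at the start of the thread.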
>>> CQL 3 is not backward compatible with CQL 2 for most things.  If you
>>> cannot migrate your data to a CQL3 table, keep querying it with CQL 2 or
>>> Thrift.
>>>
>>> The equivalent structure as a CQL3 table:
>>>
>>> cqlsh:test> create table test3 (key text PRIMARY KEY, c1 text, c2 text);
>>> cqlsh:test> INSERT INTO test3(key, c1, c2) VALUES ('a1', 'a1c1', 'a1c2');
>>> cqlsh:test> INSERT INTO test3(key, c1, c2) VALUES ('a2', 'a2c1', 'a2c2');
>>> cqlsh:test> select * from test3;
>>>
>>>  key | c1   | c2
>>> -----+------+------
>>>   a2 | a2c1 | a2c2
>>>   a1 | a1c1 | a1c2
>>>
>>> This comes with many important restrictions, one of which, as mentioned,
>>> is that you cannot have arbitrary columns in a CQL3 table, just as you
>>> cannot in a traditional relational database.  Likewise, you cannot use
>>> traditional approaches to populate data into a CQL3 table:
>>>
>>> [default@test] get test3['a1'];
>>> test3 not found in current keyspace.
>>> [default@test] set test3['a3']['c1'] = 'a3c1';
>>> test3 not found in current keyspace.
>>> [default@test] describe test3;
>>> WARNING: CQL3 tables are intentionally omitted from 'describe' output.
>>>
>>>
>>>
>>>
>>> On Tue, Jun 4, 2013 at 12:56 PM, ekaqu something <ekaqu1028@gmail.com> wrote:
>>>
>>>> I run a 1.1 cluster and am currently testing out a 1.2 cluster.  I have
>>>> noticed that 1.2 switched to CQL3, which is acting differently than
>>>> I would expect.  When I do "select key from \"cf\";" I get many, many
>>>> duplicate keys.  When I did the same with CQL 2 I only got the keys
>>>> defined.  This also seems to be the case for count(*): in CQL 2 it would
>>>> return the number of keys I have; in CQL 3 it returns way more than I
>>>> really have.
>>>>
>>>> $ cqlsh `hostname` <<EOF
>>>> use keyspace;
>>>> select count(*) from "cf";
>>>> EOF
>>>>
>>>>
>>>>  count
>>>> -------
>>>>  10000
>>>>
>>>> Default LIMIT of 10000 was used. Specify your own LIMIT clause to get
>>>> more results.
>>>>
>>>> $ cqlsh `hostname` -3 <<EOF
>>>> use keyspace;
>>>> select count(*) from "cf";
>>>> EOF
>>>>
>>>>
>>>>  count
>>>> -------
>>>>  10000
>>>>
>>>> Default LIMIT of 10000 was used. Specify your own LIMIT clause to get
>>>> more results.
>>>>
>>>>
>>>> $ cqlsh `hostname` -2 <<EOF
>>>> use keyspace;
>>>> select count(*) from cf;
>>>> EOF
>>>>
>>>>
>>>>  count
>>>> -------
>>>>   1934
>>>>
>>>> 1934 rows have really been inserted. Is there something up with cql3 or
>>>> is there something else going on?
>>>>
>>>> Thanks for your time reading this email.
>>>>
>>>
>>>
>>
>
